|How bad are duplicate pages anyway really?|
If on the same site and not done on purpose
| 1:36 am on Apr 7, 2004 (gmt 0)|
The issue of duplicate pages comes up pretty frequently, and as far as I know Google will just index one. I hope that is the case and sites don't get penalized for having a few pages that are the same.
I've never had this problem before, but I am now creating pages from a database that really isn't constructed that well, and some products are showing up in more than one place on it. I can't figure out a way to keep from sometimes showing the same product in more than one of the category pages, and then I end up with some product detail pages being duplicated.
I will try to find a way to avoid it, but my programming skills aren't that great, so I hope sites don't get hurt badly by a few pages being duplicated.
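For what it's worth, here is a rough sketch of one way to avoid the duplicated detail pages, assuming each product row in the database has a unique ID. The field names (`id`, `name`, `category`) and the `/product/<id>.htm` URL pattern are made up for illustration and are not from the database described above. The idea is to build each detail page once per product ID and have every category page link to that single URL.

```python
from collections import defaultdict


def build_site_map(rows):
    """Give each product exactly one detail URL, however many categories list it.

    `rows` is an iterable of dicts with hypothetical keys: id, name, category.
    Returns (detail_pages, category_pages).
    """
    detail_pages = {}                    # product id -> its single detail URL
    category_pages = defaultdict(list)   # category -> list of detail URLs

    for row in rows:
        pid = row["id"]
        # Create the detail URL only the first time this product is seen.
        url = detail_pages.setdefault(pid, f"/product/{pid}.htm")
        # A product may appear in several categories,
        # but every listing points at the same URL.
        if url not in category_pages[row["category"]]:
            category_pages[row["category"]].append(url)

    return detail_pages, dict(category_pages)


# Example: product 1 shows up in two categories but gets one detail page.
rows = [
    {"id": 1, "name": "Widget A", "category": "blue widgets"},
    {"id": 1, "name": "Widget A", "category": "red widgets"},
    {"id": 2, "name": "Widget B", "category": "red widgets"},
]
detail_pages, category_pages = build_site_map(rows)
```

The product can still appear on more than one category page; only the detail page is kept to a single copy.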
| 1:52 am on Apr 7, 2004 (gmt 0)|
No biggie really. They will index both (they always do). The second page found will just rank lower than the first is all... so what?
| 2:17 am on Apr 7, 2004 (gmt 0)|
About a year ago I had a site structured with all the product pages in the root directory (only about 50 or so) as follows:
root/widget1.htm, widget2.htm, widget3.htm, widget4.htm, index.html etc
But as I developed more and more product pages I decided to split them into categories as there were now so many, as follows:
root/blue widgets/widget1.htm, widget2.htm
/red widgets/widget3.htm, widget4.htm
So I uploaded my new 'blue widgets' folders with the relevant product pages but forgot to delete the exact same product pages that were in my root directory, so I now had two versions of every single product page on my site (Doh!):
root/blue widgets/widget1.htm, widget2.htm
/red widgets/widget3.htm, widget4.htm
I didn't even notice until a month later when my PR bar went grey and my entire site dropped out of the index. There was no other explanation for such a penalty - the new pages had been indexed, found to be duplicates of the existing ones, so zap, PR goes grey and you're outta here.
I immediately deleted the rogue pages in the root directory (which were now orphans anyway) and emailed Google explaining my oversight, insisting I wasn't trying to get 2 versions of all my product pages indexed, but all they said was I was welcome to send in a reinclusion request if I thought my site now met their guidelines.
I didn't bother as I'd read once you have a penalty it's difficult to shake off, so I put the exact same site on a new domain (obviously without the product pages in the root) and it's been fine ever since.
So the moral of this looooooooooong story is be VERY careful when it comes to duplicates, intentional or otherwise. In my experience Google can take a very hard line. :(
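A check along these lines might have caught the leftover copies before they got spidered. This is only a sketch, assuming you keep a local copy of the site in a folder: it walks the folder and reports files whose contents are byte-for-byte identical. The function name `find_duplicate_files` is my own, and it will not catch near-duplicates (for example, copies whose relative links were changed when they moved).

```python
import hashlib
import os
from collections import defaultdict


def find_duplicate_files(root):
    """Return groups of paths under `root` whose contents are byte-identical."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Hash the raw bytes so identical pages collide on the same key.
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            by_hash[digest].append(path)
    # Keep only hashes shared by two or more files.
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Run it against the local site folder before uploading; any group it returns is a set of identical pages living at different paths.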
| 3:18 am on Apr 7, 2004 (gmt 0)|
soccer_star - I was hoping as I read your post that the duplicates presented no problem. I did something similar to what you did about a week ago, but the pages only stayed up overnight.
I have two directories of products due to the large number of pages and upload them as a zip and decompress on the server. When decompressing one of the files, I misspelled the directory name and another directory was created. This new directory has no links to it from anywhere.
Problem is that Google spidered 10,000 pages that night. I hope Brett is correct. I believe I'll have a look at the log files to see what Google did that night.
| 8:24 am on Apr 7, 2004 (gmt 0)|
I don't know if duplicated pages are a big problem; if you search for something related to PHP you'll find hundreds of sites in the SERPs with a copy of the PHP man page. You can call them mirror, duplicated pages or other, but it's that
| 2:27 pm on Apr 7, 2004 (gmt 0)|
I've had a lot of duplicated content and have never been penalized. Change the URLs and of course you're going to be deleted. Google's not going to keep dead links indexed. Give it time to recrawl the site!
| 7:55 pm on Apr 7, 2004 (gmt 0)|
I hope Brett is right! If not, and duplicate pages are what caused the problems for soccer_star, at least in my case it won't be very many pages that would be duplicated, just a few, so maybe a site still wouldn't be penalized for it.
black - I don't worry so much about near duplicate pages on different sites, especially since I don't do that myself with my own sites.
I think I may have found a way to avoid it, at least some of the potential duplicates. I'll try it today or tomorrow and see if I can do it or not.
I can see how Google may think someone would be trying to get multiple listings for a search term by putting duplicate pages on one site. But since it can happen by accident, to me it would make more sense for them to just index one rather than penalize a site.
| 10:46 pm on Apr 7, 2004 (gmt 0)|
I think my problem was by the time I reorganised my site I had well over 1,000 product pages and about 30 non-product pages. So with all the product pages duplicated that meant 97% of my site was duplicates - I think that's what upset Google.
Like I said, I moved the exact same site to a different url, killed the old one completely and started again (a VERY painstaking process) and within 4 months I was back to where I was before. Since then no problems at all, which means it MUST have been the duplication as I was booted out within days of the duplicates being indexed.
Hopefully for you Trisha, if it's just a few pages rather than 97% of your pages that are duplicated you will be ok.
| 11:47 pm on Apr 7, 2004 (gmt 0)|
Are you sure you didn't have "orphaned pages" which just happened to be duplicates or very similar to other pages within your site?
Duplicates may not cause a problem ... I don't really know anymore as I (thankfully) haven't made that particular mistake recently. But orphans (pages with no links to them from anywhere within your site) certainly are a problem and can be considered doorway pages, for which you will most certainly be penalized.
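If you want to check for orphans before a spider finds them, something like this sketch could help. It assumes you already have a map of which pages on the server link to which other pages; building that map is left out, and the names `find_orphans`, `links`, and `home` are my own. It lists every page that no other page links to.

```python
def find_orphans(links, home="index.html"):
    """Return pages that no other page on the site links to.

    `links` maps every page on the server to the set of pages it links to.
    The home page is excluded because visitors reach it directly.
    """
    linked_to = set()
    for source, targets in links.items():
        # A page linking to itself does not count as an inbound link.
        linked_to.update(t for t in targets if t != source)
    return sorted(p for p in links if p not in linked_to and p != home)


# Example: the old root copy of widget1.htm is still on the server,
# but nothing links to it any more.
links = {
    "index.html": {"blue widgets/widget1.htm"},
    "blue widgets/widget1.htm": {"index.html"},
    "widget1.htm": {"index.html"},
}
orphans = find_orphans(links)
```

Anything in `orphans` is a page that is still on the server with no links pointing at it, which is the situation described above.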
| 12:58 am on Apr 8, 2004 (gmt 0)|
I have another issue with duplication. I have a website ranked well in Google and several other engines, but the index is pretty low on Yahoo. When I started this site, I purchased a similar domain and had the 'domain forward' feature added to it. So when you click on the 2nd domain, it opens my existing website, but all pages have the 2nd url in the address bar. That is the only difference.
In addition, this 2nd domain shows up for the same backlinks that my Original domain has. I have never optimized for the 2nd one. I purchased Inktomi submit for both last year. Recently I upgraded Domain #1 to Overture site match, but Domain #2 is still riding under Inktomi.
Now Domain #2 (in the last few weeks) has been showing up for all kinds of keywords on the 1st page of Yahoo. But Domain #1 is still not to be seen.
Has anyone ever heard of this? Also, will/can this hurt me in regards to 'duplicate sites', even though I only have 'one' site...and 2 domains? They seem to be treated as the same site. I am confused. Any ideas?
| 1:02 am on Apr 8, 2004 (gmt 0)|
They were orphans, yes. When I changed my directory structure the pages in my root folder became orphans and I simply forgot to delete them.
Do you think this is what annoyed Google, that they thought my orphaned pages were doorways? Interesting. I hadn't even considered that, yet it makes sense.
I thought Google overreacted and were a bit harsh kicking me out of the index for duplicates; this probably explains it! :)
| 1:07 am on Apr 8, 2004 (gmt 0)|
I'd put money on it! I stupidly left two dead product pages (on two separate occasions ... doh :( ) on the server after having deleted all links to the page. At that time, Google was using the now infamous PR0 or -20 penalty. I got the -20 both times.
It's easy to do if you are juggling more than one task at a time, and I usually have a phone hanging off each ear while eating my lunch and trying to update a page at the same time.
Since then, I make darned sure I delete the page first and then the links!