Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Google says duplicate but Google is wrong

 12:57 am on Jan 5, 2008 (gmt 0)

Has anyone found Google not indexing pages which seem like duplicates but are not?

I have a page about widgets which leads to loads of pages about subwidgets. The widgets page was indexed and ranked OK until I introduced two pages about widget finishing.

Now the top widget page seems to have gone out of the index (a search for specific text on the page yields not found). The two widget finishing pages are also not in the index.

The pages are in no way duplicates, though inevitably lots of words are similar. They all feature keywords - something like red widget - in either the meta title or h3 tags.

Every other page seems to be indexed OK (except for the sub sub widgets product pages, but that's an old story...).

Is Google just being careless? Any ideas how I could put this right?


Robert Charlton

 3:31 am on Jan 5, 2008 (gmt 0)

frances - You don't say specifically what indication Google is giving you that these pages are duplicates, but your title is very precise: "Google says duplicate but Google is wrong".

Your message also suggests that the pages are similar enough to each other that you think Google might be incorrectly looking at them as dupes, and that they've all disappeared from the index.

How similar are these pages? Are the titles the same, or are they different? What does appear in the index when you search for the text strings from the widgets page?

Did the widget finishing pages ever appear in the index? Are there text strings unique to the finishing pages, and not also on the widgets page, which you could search for to see if they've gotten indexed?

Google doesn't penalize for duplication, btw... it filters the duplicates out of the serps, but they do remain in the index. I'd think if Google regarded them as duplicates, at least one of the three pages would show up in the serps and the other two would drop out. Have you tried a site:domain search to see if they show up?

To see if a duplicate filter has been applied, do your search and then append &filter=0 to the Google url in your address bar, and then click Go and see if your pages appear. The text string should remove the duplication filter... and if the duplication is the problem, your pages should show up.
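
For anyone who would rather build that URL than edit it by hand, here is a minimal sketch in Python. It only does the string manipulation described above (setting filter=0 on an existing search URL). The function name and the example query are made up for illustration, and nothing here checks how Google responds to the parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter_disabled(search_url: str) -> str:
    """Return search_url with filter=0 set in its query string.

    Hypothetical helper: any existing "filter" parameter is replaced,
    all other parameters are kept in their original order.
    """
    parts = urlsplit(search_url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "filter"
    ]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Example search for an exact phrase (the phrase is a placeholder).
url = "https://www.google.com/search?q=%22big+red+widget%22"
print(with_filter_disabled(url))
# -> https://www.google.com/search?q=%22big+red+widget%22&filter=0
```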

You might also check your widgets page in Copyscape, and look at the Copyscape caches of pages it indicates are duping your content.


 4:17 am on Jan 5, 2008 (gmt 0)

Thank you for taking the trouble to look into this, and sorry for not being more precise - or even accurate. When I said "Google says", Google actually says nothing. I just assumed duplication because I couldn't think of any other reason.

To answer the questions I should have stated:

The pages are about similar subjects, and the meta tags are all different (something like "big red widget" appears in two titles, though not in the same place). The text on each page is unique - I can't see any identical strings except for "big red widget" and "really red", both of which are moderately competitive search terms.

When I search for specific phrases in Google, I get: your search for "bla bla bla" did not match any documents.

Adding &filter=0 to the search query makes no difference, though loads of pages from the site are returned (the phrase is in the top menu).

The URL doesn't show in a site: search or in Webmaster Tools.

I don't know if the widget finishing pages ever appeared in the serps. Other pages I introduced at the same time are there.

So from what you say, it looks like it's not duplicate content.

But I can't believe it's a supplemental results issue - the site has 7 top-level menu categories. The widgets page is one of them, so it has 100s of internal links and some (not many) external ones as well (ezine articles is one).

It also seems unlikely that something is wrong with the site architecture/backend, because all the other top-level pages are OK and the widgets page is generated identically, so I can't see how it can be that.

One other thing - I don't know if it is relevant - in the Google sitelinks for this site, the other pages in the top menu get their own sitelink. Widgets are listed in the sitelinks, but the link is to the home page. The home page emphasises widgets no more than the other categories.

Isn't it possible that Google has just got confused and will sort it out in time? The site only has a PR of 4, so maybe it's just not that bothered? Or are there any other possibilities?


 9:11 am on Jan 5, 2008 (gmt 0)

Similar topics or keywords do not a "duplicate" make. Substantial amounts of identical text result in a duplicate page. The problem is something else, possibly insufficient links to the page or the page simply dropping out temporarily for no particular reason (especially if it's fairly new).
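
To put a rough number on "substantial amounts of identical text", one common approach is to compare overlapping word sequences (shingles) between two pages. The sketch below is a generic illustration, not a description of how Google detects duplicates. The function names, the shingle size of 4 and the sample sentences are all assumptions made for the example.

```python
import re


def shingles(text: str, size: int = 4) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat it as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 4) -> float:
    """Jaccard similarity of the two shingle sets (0.0 = nothing shared)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Two made-up pages that share a keyword phrase but no longer runs of text.
page_a = "our big red widget ships with a matte finish"
page_b = "polishing a big red widget takes two hours"
print(jaccard(page_a, page_b))  # shared keywords alone give 0.0 here
print(jaccard(page_a, page_a))  # identical text gives 1.0
```

Pages that merely share a phrase like "big red widget" score at or near zero with this measure, which matches the point above: similar keywords are not the same thing as duplicated text.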


 10:31 am on Jan 5, 2008 (gmt 0)

are you using sitemap.xml?
if so is this url in there and when did google last retrieve your sitemap?
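
A quick way to answer the first half of that question is to parse the sitemap and look the url up. This is a minimal sketch that assumes a standard sitemaps.org urlset file. The function name and the sample urls are hypothetical, and it does not tell you when Google last fetched the file (that has to come from Webmaster Tools).

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_entries(xml_text: str) -> dict:
    """Map each <loc> in a urlset sitemap to its <lastmod> (or None)."""
    root = ET.fromstring(xml_text.strip())
    entries = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:
            entries[loc] = lastmod.strip() if lastmod else None
    return entries


# Made-up sitemap content for illustration.
sample = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/widgets</loc><lastmod>2008-01-03</lastmod></url>
  <url><loc>http://www.example.com/gadgets</loc></url>
</urlset>
"""
entries = sitemap_entries(sample)
print("http://www.example.com/widgets" in entries)   # is the url listed?
print(entries.get("http://www.example.com/widgets"))  # its lastmod, if any
```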

have you checked your response status chain and your response headers to make sure everything looks correct?
(no 302's, correct doc type, etc)
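
One way to run that check is to follow the redirects by hand and record each hop. The sketch below uses only the Python standard library. The function names are hypothetical, "correct doc type" is interpreted here as the Content-Type header (an assumption), and the rules in chain_problems are just one reasonable reading of "looks correct".

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECTS = (301, 302, 303, 307, 308)


def fetch_chain(url: str, max_hops: int = 5) -> list:
    """Issue HEAD requests, following redirects manually.

    Returns a list of (url, status, content_type) tuples, one per hop.
    """
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.netloc, timeout=10)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("HEAD", path)
            resp = conn.getresponse()
            status = resp.status
            location = resp.getheader("Location")
            hops.append((url, status, resp.getheader("Content-Type")))
        finally:
            conn.close()
        if status not in REDIRECTS or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return hops


def chain_problems(hops: list) -> list:
    """Flag non-permanent redirects, a non-200 final status, or non-HTML."""
    if not hops:
        return ["no response recorded"]
    problems = []
    for url, status, _ in hops[:-1]:
        if status not in (301, 308):
            problems.append(f"{url}: non-permanent redirect ({status})")
    final_url, final_status, ctype = hops[-1]
    if final_status != 200:
        problems.append(f"{final_url}: final status {final_status}")
    elif not (ctype or "").lower().startswith("text/html"):
        problems.append(f"{final_url}: unexpected Content-Type {ctype!r}")
    return problems


# Hand-written hops standing in for a real fetch_chain() result.
example = [
    ("http://example.com/widgets", 302, None),
    ("http://www.example.com/widgets", 200, "text/html; charset=utf-8"),
]
print(chain_problems(example))
```

Against a live page you would call chain_problems(fetch_chain(url)) instead of passing hand-written hops.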

have you validated your html [validator.w3.org]?

what mechanism are you using for navigation?
(anchor tags or flash/java/javascript/etc?)


 1:05 pm on Jan 5, 2008 (gmt 0)

The page has c. 200 internal links and a few external ones, and it is old. It is in sitemap.xml, and Google retrieved that 2 days ago. Response headers are fine. The html validates perfectly. Navigation is all html links.

It's in Yahoo and Live Search for its search terms. It still has toolbar PR but is nowhere else in Google.

I just can't think what it could be...


 11:06 pm on Jan 5, 2008 (gmt 0)

What do you get for these two searches?

site:domain.com -inurl:www



 1:17 am on Jan 6, 2008 (gmt 0)

site:domain.com -inurl:www returns

Your search - site:domain.com -inurl:www - did not match any documents.

The setup is a bit weird and the company has been slow to let me change it - but I guess that can't be the problem either.

Any other suggestions?


 5:05 am on Jan 6, 2008 (gmt 0)

There are so many possibilities, but for starters, are all the meta descriptions and meta keywords different on each page? That can be a problem area.

Robert Charlton

 6:24 am on Jan 6, 2008 (gmt 0)

Meta keywords are simply not going to affect how a site ranks or whether it's indexed.

There's some disagreement about identical meta descriptions with regard to supplemental status... and I don't want to pull us into that debate again, except to offer my opinion that identical meta descriptions would not cause pages to drop from the index.


 6:24 pm on Jan 6, 2008 (gmt 0)

Meta descriptions and keywords are all different...


 11:40 pm on Jan 6, 2008 (gmt 0)

Thanks for the info on meta information Robert. That makes sense. But could repeated metas suppress the pages and in some cases put them in the supplemental category?

Sometimes it's hard to tell what is speculation and what is fact.


 12:13 am on Jan 7, 2008 (gmt 0)

Repeated meta descriptions have historically been a very real culprit for sending borderline urls to the Supplemental index - and differentiating the meta descriptions has been a quick remedy. Historically the supplemental urls were only indexed by "top level" factors, and that included title elements and meta descriptions most prominently.

Notice lately that the "snippets team" has become more visible, with blog posts and mentions, added information to GWT and so on? If you want a good listing, give the snippets team something solid to work with. The meta description is not part of the ranking algorithm, but it does get your url into the horse race. At least that's how I see it.
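
If you want to check a site for repeated meta descriptions rather than eyeball them, something along these lines would do it. It is a minimal sketch using Python's standard html.parser. The class and function names and the sample pages are made up for illustration, and fetching the pages is left out.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "description":
            self.description = (attr.get("content") or "").strip()


def meta_description(html: str):
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


def duplicate_descriptions(pages: dict) -> dict:
    """Group urls by normalised description; keep groups with 2+ urls."""
    groups = defaultdict(list)
    for url, html in pages.items():
        desc = meta_description(html)
        if desc:
            # Ignore case and whitespace differences when comparing.
            groups[" ".join(desc.lower().split())].append(url)
    return {desc: urls for desc, urls in groups.items() if len(urls) > 1}


# Made-up pages: url -> html source.
pages = {
    "/widgets": '<head><meta name="description" content="Big red widgets"></head>',
    "/finishing": '<head><meta name="Description" content="big  red widgets" /></head>',
    "/other": '<head><meta name="description" content="Blue things"></head>',
}
print(duplicate_descriptions(pages))
```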


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved