| 3:31 am on Jan 5, 2008 (gmt 0)|
frances - You don't say specifically what indication Google is giving you that these pages are duplicates, but your title is very precise: "Google says duplicate but Google is wrong".
Your message also suggests that the pages are similar enough to each other that you think Google might be incorrectly looking at them as dupes, and that they've all disappeared from the index.
How similar are these pages? Are the titles the same, or are they different? What does appear in the index when you search for the text strings from the widgets page?
Did the widget finishing pages ever appear in the index? Are there text strings on the finishing pages that are unique to them and not also on the widgets page, which you could search for to see if they've gotten indexed?
Google doesn't penalize for duplication, btw... it filters the duplicates out of the serps, but they do remain in the index. I'd think if Google regarded them as duplicates, at least one of the three pages would show up in the serps and the other two would drop out. Have you tried a site:domain search to see if they show up?
To see if a duplicate filter has been applied, do your search and then append &filter=0 to the Google url in your address bar, and then click Go and see if your pages appear. The text string should remove the duplication filter... and if the duplication is the problem, your pages should show up.
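If you'd rather not edit the address bar by hand, here is a minimal Python sketch of that step. The function name with_filter_off and the example search URL are made up for illustration; the only thing taken from the advice above is the filter=0 parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter_off(search_url: str) -> str:
    """Return search_url with filter=0 set in its query string."""
    parts = urlsplit(search_url)
    # Keep every existing parameter except an earlier filter value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "filter"
    ]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search URL, used only as an example.
print(with_filter_off("http://www.google.com/search?q=widgets"))
# prints http://www.google.com/search?q=widgets&filter=0
```

Paste the resulting URL into the browser and compare the results with the unfiltered search.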
You might also check your widgets page in Copyscape, and look at the Copyscape caches of pages it indicates are duping your content.
| 4:17 am on Jan 5, 2008 (gmt 0)|
Thank you for taking the trouble to look into this and sorry for not being more precise - or even accurate. When I said "Google says" - Google says nothing. I just assumed it because I couldn't think of any other reason.
To answer the questions I should have stated:
The pages are about similar subjects, the meta tags are all different (though something like big red widget appears in two titles, not in the same place), the text on each page is unique - can't see any identical strings except for "big red widget" and "really red" - both of which are moderately competitive search terms.
When I search for specific phrases in google I get your search for "bla bla bla" did not match any documents.
Adding &filter=0 to the search query makes no difference - though loads of pages from the site are returned (the phrase is in the top menu).
The url doesn't show in a site search or in webmaster tools.
I don't know if the widget finishing pages ever appeared in the serps. Other pages I introduced at the same time are there.
So from what you say it looks like it's not duplicate content.
But I can't believe it's a supplemental results issue - the site has 7 top level menu categories. The widgets page is one of them so it has 100s of internal links and some (not many) external ones as well (ezine articles is one).
It also seems unlikely that something is wrong with the site architecture/backend, because all the other top level pages are ok and the widgets page is generated identically, so I can't see how it can be that.
One other thing - don't know if it is relevant - in the Google sitelinks for this site, the other pages in the top menu get their own sitelink. Widgets are listed in the sitelinks, but the link is to the home page. The home page emphasises widgets no more than the other categories.
Isn't it possible that Google has just got confused and will sort it out in time? The site only has a pr of 4 so maybe it's just not that bothered? Or are there any other possibilities?
| 9:11 am on Jan 5, 2008 (gmt 0)|
Similar topics or keywords do not a "duplicate" make. Substantial amounts of identical text result in a duplicate page. The problem is something else, possibly insufficient links to the page or the page simply dropping out temporarily for no particular reason (especially if it's fairly new).
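To put a rough number on "substantial amounts of identical text", one common approach is to compare overlapping word sequences (shingles) between two pages. The Python sketch below is only an illustration of that idea: the function names, the 4-word shingle size and the sample sentences are all assumptions, and nothing here is claimed to be how Google's filter actually works.

```python
import re


def shingles(text: str, size: int = 4) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(text_a: str, text_b: str, size: int = 4) -> float:
    """Jaccard similarity of the two shingle sets (0.0 = nothing shared)."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Made-up sample text: both share the phrase "big red widget" and nothing else.
page_a = "our big red widget ships in two days"
page_b = "finishing a big red widget takes patience"
print(overlap(page_a, page_b))  # prints 0.0
print(overlap(page_a, page_a))  # prints 1.0
```

Two pages that share only a short keyword phrase score at or near zero on a measure like this, which fits the point that similar topics alone do not make a duplicate.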
| 10:31 am on Jan 5, 2008 (gmt 0)|
are you using sitemap.xml?
if so is this url in there and when did google last retrieve your sitemap?
have you checked your response status chain and your response headers to make sure everything looks correct?
(no 302's, correct doc type, etc)
have you validated your html [validator.w3.org]?
what mechanism are you using for navigation?
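The sitemap and status checks can be scripted. Below is a minimal Python sketch, assuming you already have the sitemap text and a list of the status codes seen while following the url (from whatever header checker you use). The function names, the example.com urls and the sample data are all hypothetical; the namespace is the one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> set:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def find_302s(chain: list) -> list:
    """Return the (status, url) hops in a response chain that are 302s."""
    return [(status, url) for status, url in chain if status == 302]


# Made-up sitemap and response chain for illustration.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/widgets.html</loc></url>
</urlset>"""
sample_chain = [
    (302, "http://example.com/widgets.html"),
    (200, "http://www.example.com/widgets.html"),
]

print("http://www.example.com/widgets.html" in sitemap_urls(sample_sitemap))  # prints True
print(find_302s(sample_chain))  # prints [(302, 'http://example.com/widgets.html')]
```

An empty list from find_302s and a True from the sitemap check would rule out those two questions; they say nothing about doc type or html validity, which need their own checks.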
| 1:05 pm on Jan 5, 2008 (gmt 0)|
The page has c. 200 internal links and a few external ones, and it is old. It is in sitemap.xml and google retrieved that 2 days ago. Response headers fine. html validates perfectly. Navigation all html links.
It's in Yahoo and Live Search for its search terms. It still has toolbar pr but is nowhere else in Google.
I just can't think what it could be...
| 11:06 pm on Jan 5, 2008 (gmt 0)|
What do you get for these two searches?
| 1:17 am on Jan 6, 2008 (gmt 0)|
site:domain.com -inurl:www returns
Your search - site:domain.com -inurl:www - did not match any documents.
The setup is a bit weird and the company have been slow to let me change it - but I guess that can't be the problem either.
Any other suggestions?
| 5:05 am on Jan 6, 2008 (gmt 0)|
There are so many possibilities, but for starters, are all the meta descriptions and meta keywords different on each page? That can be a problem area.
| 6:24 am on Jan 6, 2008 (gmt 0)|
Meta keywords are simply not going to affect how a site ranks or whether it's indexed.
There's some disagreement about identical meta descriptions with regard to supplemental status... and I don't want to pull us into that debate again, except to offer my opinion that identical meta descriptions would not cause pages to drop from the index.
| 6:24 pm on Jan 6, 2008 (gmt 0)|
Meta descriptions and keywords all different...
| 11:40 pm on Jan 6, 2008 (gmt 0)|
Thanks for the info on meta information Robert. That makes sense. But could repeated metas suppress the pages and in some cases put them in the supplemental category?
Sometimes it's hard to tell what is speculation and what is fact.
| 12:13 am on Jan 7, 2008 (gmt 0)|
Repeated meta descriptions have historically been a very real culprit for sending borderline urls to the Supplemental index - and differentiating the meta descriptions has been a quick remedy. Historically the supplemental urls were only indexed by "top level" factors, and that included title elements and meta descriptions most prominently.
Notice lately that the "snippets team" has become more visible, with blog posts and mentions, added information to GWT and so on? If you want a good listing, give the snippets team something solid to work with. The meta description is not part of the ranking algorithm, but it does get your url into the horse race. At least that's how I see it.