| 4:21 am on Dec 5, 2008 (gmt 0)|
Yes, that is weird - especially the different snippets thing. And the canonical issue getting past the duplicate filtering is another anomaly. I'm glad this is polluting a lot of searches.
| 6:43 am on Dec 5, 2008 (gmt 0)|
Perhaps they showed www.example.com/ as the 1st result and the same page as example.com/ as the indent because both URLs (Google sees them as separate pages) rank very strongly for the search term. What if they did devalue the content of one of the canonicals due to the duplicate content filter, but after it was flagged as a duplicate it happened to receive many more backlinks from more authoritative, relevant sites than the one considered the original? The duplicate canonical may still rank as well as or higher than the original due to the number and strength of its backlinks and the anchor text in those backlinks. The content, title tags, headers, etc. on the page are the same. You would think the two URLs for the same page would rank differently from one another only because of the number and strength of the backlinks each has, the anchor text used in those backlinks, and one URL being considered a duplicate while the other URL is considered the original version of the content.
Did one snippet use the page's meta description and did the other have a snippet pieced together by Google using sentence fragments from the content on the page that contained the words in the search phrase? If so perhaps the reason they don't show the exact same snippet is that they consider it a bad user experience to duplicate the same snippet for a main listing and the corresponding indent. They view the two URLs as separate pages. They don't differentiate between 2 URLs for the same page and 2 URLs for 2 different pages. They probably do that anytime a main URL has the same meta description as the indent. This probably happens a lot since many sites have pages with duplicate meta descriptions. Some use the same meta description for all pages on their site.
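The "two URLs, one page" situation above can be illustrated with a small sketch. This is a toy illustration of URL-level duplicate grouping, not Google's actual canonicalization; the function name and normalization rules are assumptions for the example:

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url: str) -> str:
    """Normalize a URL so www and non-www variants compare equal.

    Hypothetical rules for illustration only: lowercase the URL,
    strip a leading "www.", default the path to "/", drop fragments.
    """
    parts = urlsplit(url.lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com and example.com as one page
    path = parts.path or "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))

# The two URLs discussed in the thread collapse to the same key,
# so a filter keyed this way would see them as one page:
print(canonical_url("http://www.example.com/"))
print(canonical_url("http://example.com"))
```

If Google's duplicate filter keyed on something like this, only one version would show; the anomaly in this thread is that both versions appear, which is why the off-page-signals explanation above is plausible.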
Just a guess... but sounds semi-logical to me.
[edited by: ZydoSEO at 7:03 am (utc) on Dec. 5, 2008]
| 10:24 am on Dec 5, 2008 (gmt 0)|
Yes, deliberately not using the same snippet could be a reason. Today it's even stranger, as the first result, which is the www version, also gets an indented listing, and it's a different page to the indent for the non-www. So the questions here are how the duplicate sites are ending up at 1-4, and whether the different snippets and indented pages are determined by the other sites in the results - that would be interesting too.
| 10:26 am on Dec 5, 2008 (gmt 0)|
|And the canonical issue getting past the duplicate filtering is another anomaly. I'm glad this is polluting a lot of searches. |
Sorry if I'm having a really dumb day, Tedster, but are you saying it's good that the canonical filter isn't working?
| 11:48 am on Dec 5, 2008 (gmt 0)|
I was once in a situation where someone had mistyped my URL in anchor text. They had put swww instead of www. www.example.com/ ranked at #1 with sitelinks and with a second page directly underneath. Over on the 2nd page of results, swww.example.com/ (which resolved with a 200) came in at #11 with an indented listing beneath, but it was a different page to the one shown at #2 on the first page.
Suggests to me that anchor text is involved here, and perhaps the content of the linking pages determines what Google deems relevant for the snippet.
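The usual fix for a stray hostname like swww resolving with a 200 is a server-side 301 to the canonical host, so only one version ever serves content. A minimal sketch of that logic, with the canonical hostname and function name being hypothetical placeholders, not anything from the thread:

```python
CANONICAL_HOST = "www.example.com"  # hypothetical canonical hostname

def redirect_target(host: str, path: str):
    """Return a 301 Location for non-canonical hosts, or None if already canonical.

    Any other hostname pointed at the server (example.com, a mistyped
    subdomain like swww.example.com) gets sent to the one canonical
    version instead of serving duplicate content with a 200.
    """
    if host.lower() == CANONICAL_HOST:
        return None
    return "http://" + CANONICAL_HOST + path

print(redirect_target("swww.example.com", "/page.html"))
print(redirect_target("www.example.com", "/page.html"))
```

In practice this would be done in the web server config (e.g. a rewrite rule) rather than application code, but the decision is the same: one host answers with 200, everything else answers with 301.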
I guess another possibility here is that something deceptive is going on and Google is seeing something different. Or is it possible a page can change dramatically, and Google holds two snapshots of each canonical version taken at different times, and this is how the site slips past the duplicate content filter until the next reindex? Do let us know if any of these pages disappear. I'm also curious to know what part of the page Google is pulling the different snippets from on each.
[edited by: ChicagoFan67 at 12:05 pm (utc) on Dec. 5, 2008]
| 2:52 pm on Dec 5, 2008 (gmt 0)|
My guess is the canonical filter is actually working. For the version of the URL considered a duplicate, Google probably devalues (not to zero but by some percentage) the weight normally carried by the on-page content in the ranking algorithm. But there are many other factors that are used to determine rank. There are off-page factors that weigh heavily in the ranking algorithm like the strength of inbound links, the link text used in those links, etc.
To my knowledge there is not a Google 'penalty' for duplicate content per se, where they ban the duplicate from the index or force it back to position 950. If you search for obscure text from a page you know has been duplicated on multiple pages of your site or on other sites, you will frequently see your content show up under multiple URLs (yours and theirs) on page one of the SERPs.
But who knows... Maybe I'm smokin' crack! ;)
| 4:07 pm on Dec 5, 2008 (gmt 0)|
I think I agree.
Was it a search that had few results to offer? That might account for the canonical result getting through; a simple lack of competing sites. When a 'canonical sinner' has many links, it is possible for both versions to have PageRank.
Not sure about the snippet thing, unless anchor text had something to do with it?
| 11:21 pm on Dec 5, 2008 (gmt 0)|
I just checked and the non-www has now gone.
| 12:03 am on Dec 6, 2008 (gmt 0)|
|Not sure about the snippet thing, unless anchor text had something to do with it? |
Good possibility. I was also thinking about this, and how this SERPs anomaly might give us a clue to how Google tags urls for retrieval in the SERPs.
We've seen cases where exact text does not bring up the URL even though it's clearly available for other searches. And clearly the snippets are related to the query terms. In this case, the lower ranking non-www version had a different tagging (lower PR?) and a different snippet associated with it. Backlink anchor text is a good guess for the reason there.
I hope we get to spot a few more anomalies over time - they always offer some infrastructure hints, but it takes more than one case to get the big "Aha!" moment.
| 12:15 am on Dec 6, 2008 (gmt 0)|
Two days ago, I pointed Matt Cutts to a dodgy business that was recently featured on a UK TV consumer protection show as a scam. Both of their .com domains and both of their .net domains (that is, both the singular and the plural word in the hostname, so four sites in all) were listed in the top 20 results.
[edited by: g1smd at 1:08 am (utc) on Dec. 6, 2008]
| 12:20 am on Dec 6, 2008 (gmt 0)|
Now I'm wondering if there's a connection with the Halloween gremlins Google struggled with, and the "ghost data-set". We were discussing this in the November SERPs Changes thread [webmasterworld.com].
| 9:16 am on Dec 6, 2008 (gmt 0)|
OK, I'm pretty sure the site had PR, but now it's showing as greyed out for both versions despite being four years old.