g1smd - 7:53 pm on Apr 25, 2010 (gmt 0)
There's more! This concerns another site, one that has been online since last Autumn.
The pages were originally using non-www, and all were internally linking to non-www. At that time there was no canonical www->non-www 301 redirect in place.
At the end of the year, the internal links were all changed from non-www to www, and a canonical non-www->www 301 redirect was added at the same time.
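For anyone wanting to see the redirect rule spelled out, here is a minimal sketch of the non-www->www canonical logic in Python. It is not the site's actual server config (that isn't shown in this thread); the function name and the two host constants are my own placeholders, and it ignores ports and https handling.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: www is the canonical host.
CANONICAL_HOST = "www.example.com"
BARE_HOST = "example.com"


def canonical_redirect(url):
    """Return (301, target_url) if the URL is on the bare host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == BARE_HOST:
        # Keep scheme, path and query; swap only the hostname.
        target = urlunsplit(
            (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, "")
        )
        return 301, target
    # Already on the canonical host (or some other host): no redirect.
    return None


print(canonical_redirect("http://example.com/page?x=1"))
print(canonical_redirect("http://www.example.com/page"))
```

The point is simply that every non-www request maps to exactly one www URL with a single 301 hop, and www requests are served directly.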
This was done because, although a large number of the non-www pages of the site were indexed:
- non-www root was not indexed,
- www root was indexed,
- all external incoming links pointed to the www version anyway.
Now, four months later, the situation shows even more odd results:
site:example.com - 420~ results (all URLs listed are www)
site:www.example.com - 420~ www results
site:example.com -inurl:www - 850~900 non-www results (most without cache link).
WMT reports show Google pulling both non-www and www URLs every day (but mostly www). Internal link reports show a large number of www->www links, with the number growing quite fast, and a much smaller number of non-www->non-www links, with the number shrinking slowly.
The number of www->www internal links now listed by Google is much higher than the highest number ever listed for non-www->non-www internal links.
The interesting point is that while the "-inurl:www" site search returns ~900 results, the non-www WMT report lists fewer than 400 internal links concerning 350 non-www URLs. So, Google 'knows' that the non-www URLs don't link out to anywhere (because they are now redirects) and knows that the URLs redirect, yet has three times more non-www URLs showing in a site: search than www URLs which do return content.
The other point is that the www site: search lists fewer than a quarter of the number of URLs listed in the "internal links" WMT report.
So, I'll guess that WMT results look only at the 'main' results and not the stuff in Supplemental (whatever that means these days), so numbers in the site: search can be higher than WMT in that case; and that not everything listed in WMT will always appear in a site: search, so the numbers can be lower than WMT in that case.
The disposition of the URLs in question is of key importance. For a very dynamic site with ever-changing content both factors might come into play. So, even if the numbers 'look right' there might still be 'issues'.
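One rough way to look at URL disposition rather than raw counts is to tally each listed URL by hostname type and by what it returns. The sketch below assumes you have already collected (URL, HTTP status) pairs yourself; the function names and the sample records are made up for illustration and are not data from the site discussed above.

```python
from collections import Counter
from urllib.parse import urlsplit


def disposition(url, status):
    """Label a URL as www/non-www plus content/redirect/other."""
    host = (urlsplit(url).hostname or "").lower()
    kind = "www" if host.startswith("www.") else "non-www"
    if status == 200:
        state = "content"
    elif status in (301, 302, 307, 308):
        state = "redirect"
    else:
        state = "other"
    return f"{kind} {state}"


def summarize(records):
    """Count dispositions over an iterable of (url, status) pairs."""
    return Counter(disposition(url, status) for url, status in records)


# Hypothetical sample records, not real crawl data.
sample = [
    ("http://www.example.com/", 200),
    ("http://www.example.com/a", 200),
    ("http://example.com/a", 301),
    ("http://example.com/old", 404),
]
print(summarize(sample))
```

If most of the non-www URLs in a site: listing fall under "non-www redirect", the headline count is telling you about stale index entries rather than about pages that still serve content.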