|Webmaster Tools - unrealistically high "Ever crawled" number|
I am exploring Webmaster Tools a bit. Under Index Status / Advanced I noticed a very high number for 'ever crawled'.
Site size approx 60 pages
approx 47 pages indexed
not selected 68
blocked by robots 0
ever crawled 590,000 +
crawl error approx 8 per day
The site was built in FrontPage (I know, I'm migrating my sites to Joomla). The site had AdSense and other affiliates, and had a forum at one time.
My questions are
1. Am I correct in assuming that the 592,000+ 'ever crawled' figure could mean that this many pages were crawled at some point in the past?
2. Is this a problem? If so, where else should I look?
I'd really appreciate any thoughts...
Crawled seems to mean "requested" - so that would include even 404 responses. Unfortunately, clicking on the "learn more" link results in a Page Not Found message, so we can't really tell, officially at least.
I can tell you that almost every site I check also has impossible numbers here unless 404 responses are part of the picture. However, your number is a lot more impossible than the ones that I see.
- 592,000+ pages have been crawled over the whole history of your site.
- Maybe the problem is that you don't have a clue where they came from. How many URLs does the site:example.com command return?
It appears to be every URL crawled whether valid or not. It includes all status codes returned.
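If the number really does count every requested URL regardless of status code, one place to look is your own raw access logs. Below is a rough sketch, not an official method: it assumes the common Apache/Nginx "combined" log format, and the function name `googlebot_urls_by_status` and the sample lines are made up for illustration. It only matches on the user-agent string, which can be faked, so treat the counts as a ballpark.

```python
import re
from collections import defaultdict

# Assumption: logs are in the "combined" format, e.g.
# IP - - [date] "GET /path HTTP/1.1" 200 1234 "referer" "user agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def googlebot_urls_by_status(lines):
    """Group the distinct paths requested by Googlebot by HTTP status code."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.search(line.strip())
        # Only the user-agent string is checked; it is not verified by DNS.
        if match and "Googlebot" in match.group("agent"):
            seen[match.group("status")].add(match.group("path"))
    return dict(seen)


# Hypothetical sample lines, for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    f'66.249.66.1 - - [10/Oct/2012:13:55:36 +0000] "GET /index.htm HTTP/1.1" 200 2326 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Oct/2012:13:56:01 +0000] "GET /forum/viewtopic.php?t=1 HTTP/1.1" 404 512 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [11/Oct/2012:09:12:44 +0000] "GET /forum/viewtopic.php?t=1 HTTP/1.1" 404 512 "-" "{GOOGLEBOT}"',
    '203.0.113.7 - - [11/Oct/2012:09:13:02 +0000] "GET /index.htm HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

for status, paths in sorted(googlebot_urls_by_status(sample).items()):
    print(status, len(paths), "distinct URLs")
```

On a real site you would pass an open log file instead of `sample`. A large pile of distinct 404 URLs (old forum threads, for instance) would go some way towards explaining a huge "ever crawled" figure on a 60-page site.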
I see a huge 'ever crawled' number on a site that previously had 'infinite duplicate content', but now that site shows just a few thousand URLs indexed. Interestingly that figure is approx 10x the number that appears for the site: search.
"ever crawled" data is cumulative over the life of the site which is why the numbers are so high.
Another possible gotcha with the data in this graph: the "not selected" number appears to include pages that used to exist but which now 301 redirect to another page.
I know this because one of my sites with 40,000 pages had 2 pages about each topic. I combined them into one page per topic and 301 redirected the old URLs, so that the site now has 20,000 pages. As Googlebot found these redirects over the course of a few months, the "indexed" line fell from 40,000 to 20,000. However, the "not selected" line grew from 1,000 to 20,000 over the same period: a mirror image of the "indexed" line.
I wouldn't expect 301 redirects to be considered pages that were "not selected" for the index. I would expect "not selected" to be actual pages.
|clicking on the "learn more" link results in a Page Not Found message |
Which one? The 'learn more' link from the question mark next to "total indexed" on Index Status, or in the grey text under the graph (same link) currently takes me to
|The Basic tab displays the following data: |
* Ever crawled: The cumulative total of URLs on your site that Google has ever crawled. Not all crawled URLs get indexed, and Google may discover some URLs by other means such as inbound links from other sites. This number should increase over time as new pages are added to your site.
* Total indexed: The total number of URLs currently in Google's index. These URLs are available to appear in search results, along with other URLs Google may discover by other means. This number will change over time, as new pages are added and indexed, and old pages are removed. The number of indexed URLs is almost always significantly smaller than the number of crawled URLs, because it does not include URLs that have been identified as duplicates or non-canonical, or less useful, or that contain a meta noindex tag.
Apparently I don't rate an "ever crawled" because all I get is "total indexed" ::sob::
Hm. Wonder where those 29 pages went? (Difference between highest point on graph, and current number.) Maybe they got stolen by That Other Search Engine; their Total Indexed has been going back up.
"More information" for "not selected" takes you in turn to
"Less useful" is an infuriatingly fuzzy phrase isn't it?
Mine is 40 million. I need to take a closer look and get my head around the figures.
@lucy24: "Ever crawled" is under the "Advanced" tab. Not sure if you have that or not, but it should be right above the Total Indexed graph in the Index Status section of WMT.
Aha. Just when I'd got used to the idea that, in googlespeak, "advanced" means "show percentage change". Wrong page.
Does every single one of the "learn more" links lead to the same page?