

Webmaster Tools - unrealistically high "Ever crawled" number

     
3:14 am on July 25, 2012 (gmt 0)

Preferred Member from US 

10+ Year Member

joined:May 6, 2004
posts: 650
votes: 0


I am exploring Webmaster Tools a bit. Under Index Status > Advanced I noticed a very high number for 'ever crawled'.

IOW,

Site size approx 60 pages
approx 47 pages indexed
not selected 68
blocked by robots 0
ever crawled 590,000 +
crawl error approx 8 per day


The site was built in FrontPage (I know, I'm migrating my sites to Joomla). The site had AdSense and other affiliate programs, and had a forum at one time.

My questions are

1. Am I correct in assuming that the 592,000+ 'ever crawled' could mean that that number of pages was crawled at some point in the past?
2. Is this a problem? If so, where else should I look?


I'd really appreciate any thoughts...

thanks

chris
4:18 am on July 25, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Crawled seems to mean "requested" - so that would include even 404 responses. Unfortunately, clicking on the "learn more" link results in a Page Not Found message, so we can't really tell, officially at least.

I can tell you that almost every site I check also has impossible numbers here unless 404 responses are part of the picture. However, your number is a lot more impossible than the ones that I see.
4:20 am on July 25, 2012 (gmt 0)

New User

10+ Year Member

joined:Mar 31, 2005
posts: 34
votes: 0


  1. 592,000+ pages were crawled over the whole history of your site.
  2. Maybe the problem is that you don't have a clue about where they came from. How many URLs does the site:example.com command return?
7:00 am on July 25, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


It appears to be every URL crawled whether valid or not. It includes all status codes returned.

I see a huge 'ever crawled' number on a site that previously had 'infinite duplicate content', but now that site shows just a few thousand URLs indexed. Interestingly that figure is approx 10x the number that appears for the site: search.
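
If you have raw server access logs, one way to see where a number like this comes from is to count the distinct URLs Googlebot has requested, grouped by the status code returned. This is only a rough sketch: it assumes Apache-style Combined Log Format, the function name and sample lines are made up for illustration, and it trusts the user-agent string without verifying that the requests really came from Google. It also only covers whatever period your logs span, so it cannot reproduce the lifetime 'ever crawled' figure.

```python
import re
from collections import defaultdict

# Assumed layout (Combined Log Format):
# host ident user [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def distinct_googlebot_urls(lines):
    """Return {status_code: set_of_paths} for requests whose UA mentions Googlebot."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            seen[match.group("status")].add(match.group("path"))
    return dict(seen)


# Hypothetical sample data, not taken from any real site.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    f'66.249.66.1 - - [25/Jul/2012:03:14:00 +0000] "GET /index.htm HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [25/Jul/2012:03:20:00 +0000] "GET /index.htm HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [25/Jul/2012:03:21:00 +0000] "GET /forum/view?id=1&sid=abc HTTP/1.1" 404 312 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [25/Jul/2012:03:22:00 +0000] "GET /forum/view?id=1&sid=def HTTP/1.1" 404 312 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.9 - - [25/Jul/2012:03:23:00 +0000] "GET /about.htm HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

if __name__ == "__main__":
    for status, paths in sorted(distinct_googlebot_urls(sample).items()):
        print(status, len(paths), "distinct URLs")
```

In the sample, the two forum URLs differ only by a session ID, which is the kind of pattern that can inflate a crawl count on a small site. To run it on a real log, pass an open file object instead of `sample`.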
7:26 am on July 25, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Jan 30, 2011
posts:53
votes: 0


"Ever crawled" data is cumulative over the life of the site, which is why the numbers are so high.
9:55 am on July 25, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 7, 2003
posts: 750
votes: 0


Another possible gotcha with the data in this graph: the "not selected" number appears to include pages that used to exist but which now 301 redirect to another page.

I know this because one of my sites with 40,000 pages had 2 pages about each topic. I combined each pair into one page per topic and 301 redirected the old URLs, so the site now has 20,000 pages. As Googlebot found these redirects over the course of a few months, the "indexed" line fell from 40,000 to 20,000. However, the "not selected" line grew from 1,000 to 20,000 over the same period, a mirror image of the "indexed" line.

I wouldn't expect 301 redirects to be considered pages that were "not selected" for the index. I would expect "not selected" to be actual pages.
7:31 pm on July 25, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13268
votes: 363


clicking on the "learn more" link results in a Page Not Found message

Which one? The 'learn more' link from the question mark next to "total indexed" on Index Status, or in the grey text under the graph (same link) currently takes me to

[support.google.com...]
(emphasis mine)

The Basic tab displays the following data:

* Ever crawled: The cumulative total of URLs on your site that Google has ever crawled. Not all crawled URLs get indexed, and Google may discover some URLs by other means such as inbound links from other sites. This number should increase over time as new pages are added to your site.
* Total indexed: The total number of URLs currently in Google's index. These URLs are available to appear in search results, along with other URLs Google may discover by other means. This number will change over time, as new pages are added and indexed, and old pages are removed. The number of indexed URLs is almost always significantly smaller than the number of crawled URLs, because it does not include URLs that have been identified as duplicates or non-canonical, or less useful, or that contain a meta noindex tag.


Apparently I don't rate an "ever crawled" because all I get is "total indexed" ::sob::

Hm. Wonder where those 29 pages went? (Difference between highest point on graph, and current number.) Maybe they got stolen by That Other Search Engine; their Total Indexed has been going back up.

"More information" for "not selected" takes you in turn to

[support.google.com...]

"Less useful" is an infuriatingly fuzzy phrase, isn't it?
8:54 pm on July 25, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member

joined:May 9, 2007
posts:876
votes: 0


Mine is 40 million. I need to take a closer look and get my head around the figures.
9:33 pm on July 25, 2012 (gmt 0)

New User

5+ Year Member

joined:Aug 18, 2011
posts:26
votes: 0


@lucy24: "Ever crawled" is under the "Advanced" tab. Not sure if you have that or not, but it should be right above the Total Indexed graph in the Index Status section of WMT.
12:39 am on July 26, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13268
votes: 363


Aha. Just when I'd got used to the idea that, in googlespeak, "advanced" means "show percentage change". Wrong page.

Does every single one of the "learn more" links lead to the same page?
 
