Are we sure that the Deep Crawl is limited to the 216s?

I've been lurking around this forum for close to a year now and, like everyone else here, have learned a great deal. Among the many things learned is that the IP addresses of the Fresh Bots start with 64, while the IP addresses of the Deep Crawl crawlers start with 216. This has been gospel here and I've witnessed many a newbie being dressed down for crying out that the Deep Crawl had started when, in fact, it was the Fresh Bots. In every case, the initial claim was met with a chorus of, "Are you SURE it's the 216s?"

Until today, I lived by the gospel of the 216s. Now, however, I'd like to formally call it into question based on three things I have observed at my site this morning. First, though, a bit of background.

My site is of the bibliographic variety and has millions of dynamically generated pages. It has a PR of 7 and, on average, gets about 100,000 pages read during the Deep Crawl and 50,000 pages read each month by the Fresh Bots. There was a time when I'd check the IP addresses to tell the two types of reads apart but, over time, it became clear that the domain names for the 216s took the form of crawl##.googlebot.com, while the domain names of the 64s took the form of crawler##.googlebot.com. Thus, I stopped paying attention to the IP addresses and took the domain names as a reliable indicator of what was happening.

With that said, let me now share some interesting observations from this morning. It began when I noticed my site being hammered early this morning. A quick investigation revealed that it was googlebots of the crawl## variety, which led me to conclude that the Deep Crawl had begun. After reporting this here and having my claim questioned, I looked deeper and discovered that the IP addresses were of the 64.* variety, which could mean only one of two things: Google had changed its naming convention, or Google was now using the 64s to assist with the Deep Crawl.
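For anyone who wants to run the same cross-check on their own logs, here is a minimal Python sketch of the comparison described above. It encodes only the forum folklore from this thread (64.* means Fresh Bot, 216.* means Deep Crawl, crawl## versus crawler## hostnames), none of which is documented by Google. The function names and the sample IP addresses are made up for illustration.

```python
import re

# Folklore from this thread, NOT anything Google has documented.
IP_PREFIX_LABELS = {"216": "deep", "64": "fresh"}
HOST_PATTERNS = [
    (re.compile(r"^crawl\d+\.googlebot\.com$"), "deep"),
    (re.compile(r"^crawler\d+\.googlebot\.com$"), "fresh"),
]


def label_by_ip(ip):
    """Label a request by the first octet of its IP address."""
    first_octet = ip.split(".", 1)[0]
    return IP_PREFIX_LABELS.get(first_octet, "unknown")


def label_by_host(hostname):
    """Label a request by the shape of its reverse-DNS hostname."""
    name = hostname.lower()
    for pattern, label in HOST_PATTERNS:
        if pattern.match(name):
            return label
    return "unknown"


def compare(ip, hostname):
    """Return (label from IP, label from hostname, whether they agree)."""
    by_ip = label_by_ip(ip)
    by_host = label_by_host(hostname)
    return by_ip, by_host, by_ip == by_host


# Hypothetical log entry: a crawl## hostname arriving from a 64.* address.
print(compare("64.0.0.1", "crawl12.googlebot.com"))
```

A result where the two labels disagree is exactly the situation from this morning: the hostname says Deep Crawl while the IP address says Fresh Bot.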

I am inclined to choose the latter of these two possibilities for three reasons. The first is that the timing is right for a Deep Crawl. The second has to do with the intensity of the crawl: whereas Fresh Bots have traditionally maxed out at 1,000 pages/hour at my site, the current crawl is running at 3,000+ pages/hour. The third reason has to do with the length of the crawl: the Fresh Bots have almost always left after an hour, whereas the current crawl of my site has been going on for several hours now.
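The intensity and length tests can be sketched in Python as well. This is only an illustration: it assumes the request timestamps have already been pulled out of the server log, the function names are hypothetical, and the default thresholds (1,000 pages/hour and one hour) are simply the Fresh Bot figures I have seen at my own site, not general constants.

```python
from collections import Counter
from datetime import datetime, timedelta


def pages_per_hour(timestamps):
    """Count requests in each clock hour."""
    buckets = Counter()
    for ts in timestamps:
        buckets[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return buckets


def crawl_signals(timestamps, fresh_max_per_hour=1000,
                  fresh_max_duration=timedelta(hours=1)):
    """Return (exceeds_rate, exceeds_duration) for one crawl's timestamps.

    Both thresholds are assumptions taken from one site's history.
    """
    if not timestamps:
        return False, False
    busiest_hour = max(pages_per_hour(timestamps).values())
    duration = max(timestamps) - min(timestamps)
    return busiest_hour > fresh_max_per_hour, duration > fresh_max_duration


# Made-up example: a handful of hits spread over three hours.
start = datetime(2003, 1, 1, 6, 0, 0)
hits = [start + timedelta(minutes=30 * i) for i in range(7)]
print(crawl_signals(hits))
```

Neither signal proves anything by itself; the point is only that a crawl that is both faster and longer than any Fresh Bot visit seen before deserves a second look.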

In light of the foregoing, I am willing to risk being a heretic by saying that the gospel of the 216s may be false. That said, I will now sit back and wait for all of you to provide me with 101 reasons why I am a fool to say this. Can't wait!
