Evolution of the new Mozilla Googlebot

Last month, I started an experiment. The goal was to learn the crawling patterns of the new Mozilla Googlebot (herein referred to as "MoziG"). The site I chose to monitor (herein referred to as "target") is the same one I have been mentioning in my most recent posts (concerning indexed page counts). Target has approx. 3000 pages, which are a mix of static & dynamic urls.

Today (just over four weeks later), here is what I have found along the way:

03/17

Started experiment early today. MoziG (IP 66.249.72.201) visited target, looking for old pages (gone for almost a year). Late afternoon, it started crawling new category index page links.

03/18

No activity

03/19

Huge activity today. MoziG crawling more new category index page links. Noticing a new trend. MoziG crawling blocks of static url pages, then dynamic url pages (random # of pages in each block). Pages are being crawled by url length (shortest url to longest url).

03/20

Huge activity today. MoziG crawled only dynamic urls until 11:00 PM. Activity stopped for exactly 45 minutes. MoziG then started crawling only static urls (again, based on url length).

03/21

Same activity pattern as yesterday (including an exact 45 minute break). But then back to dynamic urls for 5 minutes, then instantly switching to static urls (with no delay).

03/22

Total of 10 pages crawled, ending at 1:30 pm. Exactly 1 hour later, MoziG (IP 66.249.66.168) crawls 1 page (robots.txt). 3 hours later, MoziG (IP 66.249.72.200) tries to crawl 1 old page, then no more activity.

03/23 thru 03/26

Minimal activity. 10 pages or fewer crawled daily.

03/27

MoziG (IP 66.249.72.200) appears, grabs robots.txt. Exactly 12 hours later, MoziG (IP 66.249.71.40) appears, crawls robots.txt and main index. Exactly 1 hour later, MoziG (IP 66.249.72.200) returns, grabs robots.txt. No more activity.

03/28 thru 04/04

Non-stop crawling. MoziG (IP 66.249.72.200) crawling both dynamic & static urls, based on url length (shortest to longest).

04/05 thru 04/07

MoziG (IP 66.249.72.200) crawling dynamic urls only.

04/08 thru 04/11

MoziG (IP 66.249.72.200) crawling only static urls until mid day 04/08. After that, very slow crawl of random pages.

04/12

MoziG (IP 66.249.72.200) crawling only pages with shortest & longest urls (meaning the very shortest & the very longest. No "in the middle").

04/13 thru 04/14

MoziG (IP 66.249.72.200) crawling random pages VERY slowly, until late 04/14. MoziG (IP 66.249.66.229) starting to appear.
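For anyone who wants to repeat this on their own site, here is a rough Python sketch of how the "blocks of static, then dynamic" pattern could be pulled out of raw access logs. It rests on a few assumptions that are not part of my setup notes above: the server writes the Apache/NCSA "combined" log format, MoziG is any user agent that starts with "Mozilla/" and contains "Googlebot", and a "dynamic" url is simply one with a query string. The function names are made up for illustration.

```python
import re
from itertools import groupby

# Assumption: Apache/NCSA "combined" log format. Adjust for other servers.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)

def mozig_hits(lines):
    """Return (ip, timestamp_text, url) for hits whose user agent looks like MoziG."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        # Assumption: MoziG = user agent starting with "Mozilla/" that names Googlebot.
        if m and m["ua"].startswith("Mozilla/") and "Googlebot" in m["ua"]:
            hits.append((m["ip"], m["ts"], m["url"]))
    return hits

def url_kind(url):
    """Assumption: a url counts as dynamic when it carries a query string."""
    return "dynamic" if "?" in url else "static"

def crawl_blocks(hits):
    """Collapse consecutive hits of the same kind into (kind, count) blocks."""
    return [
        (kind, len(list(group)))
        for kind, group in groupby(hits, key=lambda hit: url_kind(hit[2]))
    ]
```

Feeding one day of log lines through `crawl_blocks(mozig_hits(lines))` gives a list such as static block, dynamic block, static block with the page count of each, which is the pattern I was eyeballing by hand.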

Conclusions

1. MoziG has appeared from 5 different IPs in the last month.
2. Each IP shift (less the really quick visits by 66.249.66.168 & 66.249.71.40) brings an "upgraded" mode of crawling (e.g. now has the ability to crawl static & dynamic urls without delay between).
3. As opposed to the old Googlebots, there is no real "freshbot" & "deepbot" with MoziG. Any one of them can be either.
4. The exact time differences mentioned for 03/22 and 03/27 I believe are important. When I say "exact", I mean exact +/- 5 seconds at the end (logs show this). Ideas anyone? Lab testing maybe?
5. The single thing that ALL IP versions of MoziG had in common is that they all CRAWLED PAGES IN ORDER FROM SHORTEST URL TO LONGEST (in character length; doesn't matter if static or dynamic).
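If you want to test points 4 and 5 against your own logs, the checks can be sketched in a few lines of Python. This is only an illustration under my own assumptions: timestamps are in the Apache-style `dd/Mon/yyyy:hh:mm:ss zone` form, a "round" gap means a multiple of 15 minutes (which covers the 45 minute, 1 hour and 12 hour gaps I saw), and the tolerance is the 5 seconds from point 4. The function names are hypothetical.

```python
from datetime import datetime

def parse_ts(text):
    """Parse an Apache-style timestamp, e.g. '22/Mar/2005:13:30:00 -0500' (assumed format)."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")

def is_shortest_to_longest(urls):
    """True if each url is at least as long (in characters) as the one crawled before it."""
    return all(len(a) <= len(b) for a, b in zip(urls, urls[1:]))

def gaps_in_seconds(timestamps):
    """Seconds between consecutive visits, in the order given."""
    return [(b - a).total_seconds() for a, b in zip(timestamps, timestamps[1:])]

def is_round_gap(seconds, unit=900, tolerance=5):
    """True if the gap is within `tolerance` seconds of a non-zero multiple of `unit`.

    Assumption: unit=900 (15 minutes) and tolerance=5 are my choices, not anything
    Google has published.
    """
    nearest = round(seconds / unit) * unit
    return nearest > 0 and abs(seconds - nearest) <= tolerance
```

Run `is_shortest_to_longest` on the urls of one crawl block at a time (in crawl order), and `is_round_gap` on each value from `gaps_in_seconds`. If the length ordering holds block after block on other sites too, that would back up point 5.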

Please comment.
