Forum Moderators: Robert Charlton & goodroi
This kind of activity has preceded a SERPs Update in the recent past [webmasterworld.com].
The G Mozilla-bot (the one that mostly carries out this kind of behaviour) is restricted in speed on my site, so I find it difficult to make an accurate appraisal. Any other confirmations/denials?
Up until today:
Googlebot 2.1 = 3,043 pages
Mozilla Googlebot = 2,559 pages
Despite all this spidering, I continue to lose pages in BD's index. From a high of 1,020 pages I'm now down to 312 (with about 100 of those being supplemental) - an all-time low for the last two years.
Ever the optimist, I think this is a good sign:
- I'm hoping for a big refresh
- A massive jump in pages indexed in the BD centers
- A huge jump in positions
- So much traffic that my host hits me with bandwidth penalties
Now that March is here I've got access to the full stats for February. In what follows, understand that the figures actually only relate to 2 weeks, since the Google Mozilla-bot began to bang away at this site on 16 Feb.
The Google M-Bot has gone from a typical 50 hits/month to 1,000/day (200¦301¦304¦404) (with a further 40,000 attempted page-requests stopped by the site's Unruly-Bot Blocking routines). Everything else is normal, although it should be noted that, because the Adsense-bot often shared the same IP as the Mozzie-bot, it also got blocked.
AWStats:
Typical month:
Adsense-bot crawls are ~25,000/month.
Google-bot crawls are ~1,000/month.
Google Mozzie-bot crawls are ~60/month.
503 Server-Busy hits are 50-1,000/month (varies enormously).
February (Mozzie-bot active from 16 Feb):
Adsense-bot crawls are 18,565.
Google-bot crawls are 1,141.
Google Mozzie-bot crawls are 11,511.
503 Server-Busy hits are 111,919.
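For anyone wanting to pull comparable per-bot figures straight from raw logs rather than AWStats, here is a minimal sketch. It assumes the server writes Combined Log Format; the labels and the function names `classify` and `tally` are my own for illustration and are not part of AWStats or of the poster's setup.

```python
import re
from collections import Counter

# Combined Log Format tail: "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"')


def classify(ua):
    """Map a User-Agent string to a bot label (labels are illustrative)."""
    if "Mediapartners-Google" in ua:
        return "adsense-bot"
    if "Googlebot" in ua:
        # The newer crawler announces itself with a Mozilla/ prefix;
        # the older one starts directly with "Googlebot/2.1".
        return "mozilla-googlebot" if ua.startswith("Mozilla/") else "googlebot-2.1"
    return "other"


def tally(lines):
    """Count (bot label, HTTP status) pairs over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m:  # silently skip lines that are not in the assumed format
            counts[(classify(m["ua"]), int(m["status"]))] += 1
    return counts
```

Calling `tally(open("access.log"))` (the file name is an assumption) gives a counter keyed by bot and status, so the 200/301/304/404 hits and the 503 blocks can be totalled separately for each bot.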
Supplemental club: Big Daddy coming - Part 1 [webmasterworld.com] (msg#23 + others):
mozilla bot is crawling big time
Added very much later:
This page on Matt Cutts' blog [mattcutts.com] gives the means to distinguish between a Big Daddy Data-Centre and a normal Data-Centre (search for [sf giants] - "giants.mlb.com" as first result means you have hit BigDaddy; this info is going to be important in the next post).
I am going to declare that the new Google-algorithm is both in-place and active. G have snuck it in behind everyone's backs whilst our attention was elsewhere.
Now, my site is a hyphenated domain name, on both .com and .co.uk:
Site position on a keyword1 search (BigDaddy/not-BigDaddy):
.com: site position: 99/31 (BD/not-BD)
.co.uk: site position: 14/661 (BD/not-BD)
The update is already in place. It is simply a question of how many of the G-IPs are featuring it at any one time, and that is increasing on a daily basis.
G Mozilla bot watch:
This wretched bot continues to attempt to rape my site. It is crawling as hard as the site bot-prevention routines will allow (1,000 pages per IP per day) and has racked up 52,535 503 Server-busy responses in 7 days. This behaviour has continued unabated for 21 days.
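The actual bot-prevention routines are not shown in this thread, so the following is only a rough sketch of the idea as described: each IP gets a daily page allowance (1,000 is the figure quoted above) and anything beyond it gets a 503. The class name `DailyIpLimiter` and the in-memory, single-process counter are my assumptions; a real site would need shared storage.

```python
from collections import defaultdict
from datetime import date

DAILY_LIMIT = 1000  # pages per IP per day, the figure quoted in the post


class DailyIpLimiter:
    """Hypothetical per-IP daily limiter; not the poster's actual routine."""

    def __init__(self, limit=DAILY_LIMIT):
        self.limit = limit
        self.day = None
        self.hits = defaultdict(int)

    def status_for(self, ip, today=None):
        """Return 200 if the request may be served, 503 if over the limit."""
        today = today or date.today()
        if today != self.day:
            # First request of a new day: forget yesterday's counts.
            self.day = today
            self.hits.clear()
        self.hits[ip] += 1
        return 200 if self.hits[ip] <= self.limit else 503
```

Note that blocked requests still count as attempts here, which matches the pattern reported above: the bot keeps requesting and simply collects 503s for the rest of the day.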
I have been receiving around 500-1,000 page views a day from it, but this is the most intense crawling I have ever seen from any Google bot on my sites.
anywho - goodmorning, good day or goodnight (whichever is apt for your neck of the woods) - i am off to bed ;-)
G Mozilla bot watch:
13,053 visits in 13 days (restricted by the site Unruly-bot prevention routines [webmasterworld.com] to 1,000/day, and thus 30,000 visits in the last month, which is slightly up on the 60/month of the Autumn + Winter period). 503 Server Busy responses are about 250,000 for the same period. Phew!
Algo Update watch:
The McDar tool is currently offline. Visitor numbers are up on my site yet again compared to last week (about 5% last week, another 5% yesterday making 10% total).
Indeed - one of the major problems caused by a homepage canonical problem is crawling depth (GG has confirmed this himself - although probably 2-3 years ago - can't find the post at the moment)
lammert - did those crawled pages make it into the index? - my site which has canonical problems also gets crawled by this Mozilla Googlebot thing - but only a very very few pages make the index.
my site ... also gets crawled by this Mozilla Googlebot thing - but only a very very few pages make the index.
Mozilla bot has crawled these supplementals now multiple times, they are still supplemental
It is, unfortunately, dense and not that easy to follow. In addition, it concerns URL-only SERPs rather than Supplemental. If I write down the headlines to those results, you may get the point:
I'll keep you updated.
G Mozilla bot watch:
30,000 (200 + 304) pages taken from my site in 35 days (adding 301 + 404 hits gives 21,634 for March, 11,511 for 15 days in Feb = 33,000 total).
The site Unruly-bot Prevention Routines [webmasterworld.com] limit each IP to 1,000/day (503 Server Busy responses are 111,919 + 138,011 = 250,000 for the same period above) and, therefore, the M-Bot has essentially now been pounding my site continuously on a day-in day-out basis for 35 days.
Other threads also indicate heavy crawls: (some examples): msg #:181 [webmasterworld.com] and msg #:195 [webmasterworld.com].
Algo Update watch:
I spotted a rise in visitor numbers on Monday Mar 6, although other Google-watch threads have settled on Wed March 8 (remember! you read it here first!). For my site this rise has continued slow-but-sure on a day-by-day basis. The increase is not large (yesterday was about 11% up on "normal" visitor numbers), though most welcome after 2 years of freefall.
Relative numbers referred to my site from Google and Yahoo! are interesting:
Visitor numbers rose yesterday yet again, marking 18 straight days of an upward graph. Amazing. The increase appears to be solely from increased Google-referrals (the March G:Y! ratio is now 16.7 : 1).
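A G:Y! ratio like the one quoted can be worked out from the referrer field of the logs. This is a minimal sketch only; the helper names `engine_of` and `referral_ratio` are made up for illustration, and matching on a "google" or "yahoo" label in the hostname is a simplifying assumption, not how AWStats classifies search engines.

```python
from collections import Counter
from urllib.parse import urlparse


def engine_of(referrer):
    """Return 'google', 'yahoo' or None for a referrer URL (crude match)."""
    host = (urlparse(referrer).hostname or "").lower()
    labels = host.split(".")
    if "google" in labels:
        return "google"
    if "yahoo" in labels:
        return "yahoo"
    return None


def referral_ratio(referrers):
    """Google referrals divided by Yahoo! referrals, or None if no Yahoo! hits."""
    counts = Counter(e for e in map(engine_of, referrers) if e)
    if counts["yahoo"] == 0:
        return None
    return counts["google"] / counts["yahoo"]
```

A result of 16.7 from `referral_ratio` would correspond to the 16.7 : 1 figure above.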
The most sensible interpretation of this seems to be that:
(in addition)
This bot continues to rain down on my site (38 days) and, since it looks like this will become 40 days and nights of being drenched by its continuous attention, I am going to name this update the:
Noah Update
Others (particularly Brett) can make their own decision about that, but that is my personal name for it from now on.
In fact, one of the reasons for making this posting is to make note of the most marvellous coinage by seochristine of the word "searchquake" [webmasterworld.com] (msg#34), something which deserves to enter the lexicon of Google-isms. At the same link is a further report of a site reappearing on 8 March after tanking for 6 months.
PS
The attention from the Google Mozilla-bot stopped on Sunday. Exactly at 40 days! You could not make this up, could you?
RIP G-Bot:
The old, much-loved (?) G_Bot (identified by the UA "Googlebot/2.1 (+http://www.google.com/bot.html)") appears to have drowned [webmasterworld.com] (msg#18). It was last spotted swimming around my site on Sunday, March 26, although others have quoted March 28 [webmasterworld.com] as the last date. I would have spotted this sooner if some Bat-ty prat on a German-Uni IP had not been using a forged G_Bot UA whilst browsing my site. At least it will be easier to spot such deception in the future.
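On spotting forged G_Bot UAs: one commonly recommended check (not something described in this thread) is a reverse DNS lookup on the visiting IP followed by a forward lookup on the returned hostname. A minimal sketch follows; the function name `is_real_googlebot` and the injectable `reverse`/`forward` resolvers are my own additions so it can be tried without network access.

```python
import socket

# Hostname suffixes genuine Google crawlers are expected to resolve to.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=None, forward=None):
    """Reverse-then-forward DNS check for an IP claiming to be Googlebot.

    `reverse` maps an IP to a hostname and `forward` maps a hostname to a
    list of IPs; both default to real DNS lookups via the socket module.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)
        if not host.lower().endswith(GOOGLE_SUFFIXES):
            return False  # e.g. a university hostname behind a forged UA
        # The hostname must resolve back to the same IP to be trusted.
        return ip in forward(host)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
```

A visitor sending the Googlebot UA from an IP that fails this check can be treated as an ordinary (and possibly unruly) client.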
The King is dead; long live the King:
As noted above, the M_Bot stopped on my site on the same day as the G_Bot. In fact, it turned out to be only a pause - it was immediately back, full-tilt, and has not stopped (although the swim-rate compared to March is a little less - perhaps). Watch out for this bot! Whereas the old bot was slow and congenial, this new bot is vigorous and indifferent - it will take pages at a rate which may have your server (and bandwidth) groaning. Thankfully, it does accept compressed pages.
The new Princes, BTW, appear to be a phalanx of Nokias: I have spotted:
The Google Noah Update has been rolled-back:
This one is more contentious [webmasterworld.com], and I cannot fully understand what has happened.
As noted above, other threads have settled on December 27, March 8 and March 28 as significant moments in this rolling-update. The first date was insignificant for my site, but March 6 saw 2 keyphrases (actually single words) suddenly achieve vast importance in the G-SERPs for my site, and visitor numbers rose significantly all through March. On the third date that all stopped and, once again, in April they do not feature at all:
(from March AWStats results):