Many have written recently in this forum about getting dropped from the Google index, about their page links turning into URL-only listings, and so on. In some cases this has been accompanied by a sudden halt in Googlebot visits. It has spawned multi-page threads about hijacking and a myriad of other conjectured causes.
It was recently brought to my attention by a WebmasterWorld member that a post on another website pointed the finger at another perfectly logical cause. I'm writing today because I strongly feel OUR site, which experienced the symptoms described above, was among this group, and that others here may be as well. Hopefully this post will also help others who spot similar circumstances in the future to react more quickly.
The post from another forum, and my subsequent e-mails with its poster, ALLEGES that sometime from around the end of February until late last week, one or more border routers at Valueweb (and possibly their Affinity affiliates, as I will justify shortly) began FILTERING Googlebot's accesses to their clients' servers. The post also alleges that early notifications (in the form of calls and e-mails) about the problem were ignored by customer support and never referred to higher tech-support levels. Finally, the post and my subsequent e-mail correspondence with the source allege that only after conversations were demanded with upper management was the 'screw-up' discovered, ADMITTED, and corrected.
If this is true, and I suspect it may be, then the sudden disappearance of very popular, highly spidered pages from the Google index, and the resulting loss of traffic and income that we and apparently others have seen, may have been directly caused by this alleged error.
I am now stating my OWN FACTS, which I have OBSERVED in our own access logs. Our dedicated server is hosted by Skynetweb, an Affinity partner (of which Valueweb is also a partner; I'm not sure of the exact relationship, but we were assigned a Valueweb user ID shortly after their merger). Our half-dozen most popular domains combined typically see at least a hundred Googlebot visits on an average day. On Feb 28 around 15:15 we saw our last visit from the 66.249.xx.xx series Googlebot and Mediabots on any of our servers. They did not start again until 3/9 at 15:07. During the intervening time we DID see occasional visits from the 70.180.xx.xx and 68.236.xx.xx Googlebots and the 145.94.xx.xx image bots. This leads me to believe that SOMETHING began blocking visits from the most active Googlebots in the 66.249.xx.xx IP range on Feb 28 (possibly automatically, because of excessive accesses, although any halfway-competent technician SHOULD have put this range in the 'always allowed' bin from the moment the router was set up) and continued until March 9 around 15:00, which just 'HAPPENS' to be the date and approximate time of the conversation with upper management at Valueweb.
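For anyone who wants to check their own logs the same way, here is a minimal Python sketch. The log excerpt is invented for illustration (real use would read your server's access log file), and the 66.249.0.0/16 range is simply my reading of the 66.249.xx.xx series described above:

```python
import ipaddress
from datetime import datetime

# Hypothetical Apache combined-log excerpt; substitute your real access log.
SAMPLE_LOG = """\
66.249.65.10 - - [28/Feb/2005:15:14:02 -0500] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1"
70.180.12.34 - - [02/Mar/2005:09:10:11 -0500] "GET / HTTP/1.1" 200 2048 "-" "Googlebot/2.1"
66.249.65.11 - - [09/Mar/2005:15:07:30 -0500] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1"
"""

# The 66.249.xx.xx range the post identifies as the most active Googlebots.
GOOGLEBOT_NET = ipaddress.ip_network("66.249.0.0/16")

def last_googlebot_visit(log_text):
    """Return the timestamp of the newest hit from the watched IP range."""
    latest = None
    for line in log_text.splitlines():
        ip_field = line.split(" ", 1)[0]
        try:
            ip = ipaddress.ip_address(ip_field)
        except ValueError:
            continue  # malformed line; skip it
        if ip not in GOOGLEBOT_NET:
            continue
        # The timestamp sits between '[' and ']' in combined-log format.
        stamp = line.split("[", 1)[1].split("]", 1)[0].split(" ")[0]
        visited = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
        if latest is None or visited > latest:
            latest = visited
    return latest

print(last_googlebot_visit(SAMPLE_LOG))
```

Run against a real log daily, a gap of more than a day between the result and "now" is exactly the warning sign described above.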
Our affected servers are located in the 216.xx.xx.xx IP range, and the problem was also reported widely in the 207.xx.xx.xx block, which Valueweb reportedly controls and hosts their OWN servers on. If others with this confirmed problem submit their IP ranges, we can perhaps map the extent of the influence for everyone.
In our situation it appears the lack of access caused Google to either report partial indexing results (URLs only) or drop MANY pages altogether. Pages NOT spidered as frequently were unaffected, because Googlebot did not attempt to spider them during the blocked period. Another interesting side effect: pages linked from offsite that are 301-redirected on our site showed up in the index as URL-only listings, apparently because the bot couldn't access them to discover the 301. Although access was blocked beginning late Monday the 28th, the symptoms did not become apparent en masse until the following Monday (the 7th), when Google apparently updates its database based on the bot visits of the prior week. Re-submitting the pages via Google's submission page before the 9th (besides the fact that it was broken for most of that time) also had little or no effect, because the bot still could not visit the submitted pages on these servers.
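To see what a crawler first sees when it hits one of your 301'd pages, you can check the raw status code without following the redirect. Here's a Python sketch; the tiny local stand-in server and the `/old-page` URL are invented so the example is self-contained, and in real use you'd point `crawler_status()` at your own moved URLs:

```python
import http.client
import http.server
import threading
from urllib.parse import urlsplit

def crawler_status(url):
    """Return the raw HTTP status at `url`. http.client never follows
    redirects, so a 301 is reported as 301, as a crawler first sees it."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=5)
    conn.request("GET", parts.path or "/")
    status = conn.getresponse().status
    conn.close()
    return status

# Stand-in server that 301-redirects everything, purely for demonstration.
class RedirectHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", "/new-page")
        self.end_headers()
    def log_message(self, *_):
        pass  # keep the demo quiet

server = http.server.HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status = crawler_status(f"http://127.0.0.1:{server.server_address[1]}/old-page")
print(status)
server.shutdown()
```

If the request times out or is refused while other machines can fetch the same URL, that points at exactly the kind of selective filtering alleged above.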
Submission is now working again, and whether because of our re-submissions or because it would have visited anyway, the bot appears to be visiting frequently once more and this Monday began restoring some of the lost pages and descriptions (at least ours). Hopefully most of the rest will follow NEXT Monday.
Lessons to be learned:
1. Watch for Googlebot visits regularly. If they stop for a significant period (>1 day), suspect something is wrong.
2. If rule 1 applies and others ARE still seeing visits, DEMAND that your ISP investigate IMMEDIATELY, and follow up until Googlebot visits resume.
3. If they refuse to investigate... time to find a NEW ISP.
4. When setting up an IP access-blocking filter, be sure to always allow Googlebot through!
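Lesson 4 can be made concrete with an allow-before-block check, the same 'always allowed' bin the router technicians should have configured. A minimal Python sketch; all the CIDR ranges here are illustrative examples except the Googlebot range, which is my reading of the 66.249.xx.xx series discussed above:

```python
import ipaddress

# Allow list is checked FIRST, so Googlebot can never be swept up by a
# broad (e.g. automatic rate-limit) block. Ranges are illustrative.
ALWAYS_ALLOW = [ipaddress.ip_network("66.249.0.0/16")]  # Googlebot range
BLOCKED = [ipaddress.ip_network("66.0.0.0/8")]          # hypothetical over-broad block

def is_blocked(ip_str):
    """True if ip_str would be filtered; allow entries always win."""
    ip = ipaddress.ip_address(ip_str)
    if any(ip in net for net in ALWAYS_ALLOW):
        return False
    return any(ip in net for net in BLOCKED)

print(is_blocked("66.249.65.10"))  # Googlebot: passes the filter
print(is_blocked("66.1.2.3"))      # caught by the broad block
```

The ordering is the whole point: without the allow-list check running first, the over-broad block would have filtered the Googlebot range too, which is precisely the failure mode described in this thread.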