Google stopped crawling many sites Jun 15 AM

   
3:34 pm on Jun 15, 2010 (gmt 0)



Hey,

is anyone else seeing crawling issues that started this morning between about 4:00 and 4:30 (Central European Time)?

I checked the crawling of big sites in different verticals, all mainly in the European market. On all of these sites, crawling by Google decreased by around 98%.

Thank you
7:28 pm on Jun 15, 2010 (gmt 0)

5+ Year Member



My information site (forum) is hosted in the U.S., and since this morning (June 15th) my Google crawling rate has dropped to about 1% of the usual volume. I have never seen this in two and a half years of running this site. The traffic volume doesn't seem to be affected though.
7:39 pm on Jun 15, 2010 (gmt 0)



hi @scottsonline, can you confirm that you see it, too?

i see it on multiple domains in different verticals, hosted on different servers and different networks, with different codebases.

it started between 4:00 and 4:30 european time. before this total drop (up to minus 98%) there was a period of medium activity (but not that unusual).

a good first step would be to confirm multiple sightings of the same phenomenon, so that we can exclude single-site penalties.
11:08 pm on Jun 15, 2010 (gmt 0)

5+ Year Member



I'm seeing this also; yesterday GB crawled approx 280,000 pages, so far today 1,600 pages.
9:00 am on Jun 16, 2010 (gmt 0)



ok, hi

please check your logfiles for googlebot requests for MBISetup.exe (or more generally any .exe files)

66.249.66.84 - - [16/Jun/2010:07:15:42 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 301 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

and what you return for such requests (which status code?).
also look for other "malware check" requests from googlebot and what HTTP status you return for them.

this is a possible lead as there is a major time overlap.
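
For anyone who wants to run this check, here is a minimal Python sketch of the log scan described above. It assumes an Apache-style access log like the lines quoted in this thread; the function names and the regular expression are illustrative and not taken from any tool mentioned here. It only looks for the string "Googlebot" in the line, so it does not prove the request really came from Google.

```python
import re
from collections import Counter

# Assumed layout: ip ... [timestamp] "METHOD path HTTP/x.y" status ...
# The quotes around the request are optional, since some logs omit them.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) .*?\[(?P<ts>[^\]]+)\]\s+"?(?P<method>[A-Z]+) '
    r'(?P<path>\S+) HTTP/[\d.]+"?\s+(?P<status>\d{3})\b'
)


def exe_requests_from_googlebot(lines):
    """Return (timestamp, ip, path, status) for .exe requests in lines mentioning Googlebot."""
    hits = []
    for line in lines:
        # Plain substring check anywhere in the line; it cannot detect a spoofed user agent.
        if "Googlebot" not in line:
            continue
        m = LINE_RE.match(line)
        if not m:
            continue
        # Ignore any query string when checking the file extension.
        path = m.group("path").split("?", 1)[0]
        if path.lower().endswith(".exe"):
            hits.append((m.group("ts"), m.group("ip"), m.group("path"), int(m.group("status"))))
    return hits


def status_summary(hits):
    """Count how often each HTTP status was returned for the matched requests."""
    return Counter(status for _, _, _, status in hits)
```

Pass it an open log file (any iterable of lines works) and look at which status codes your server answered with for those requests.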
9:35 am on Jun 16, 2010 (gmt 0)



we found something strange, and it's our first lead.

what we found is:

66.249.66.88 - - [16/Jun/2010:08:04:04 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 301 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.244 - - [16/Jun/2010:08:04:05 +0200] "GET /en/MBISetup.exe HTTP/1.1" 301 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.88 - - [16/Jun/2010:08:04:05 +0200] "GET /en/mbisetup.exe HTTP/1.1" 404 1236 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

to explain: the /ticker/MBISetup.exe request is the original one; the others follow from redirects issued by our "URL canonicalization" logic.

on the other domains:

a .com domain
66.249.66.4 - - [16/Jun/2010:03:25:26 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 404 4258 "ref=-" "ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

on a .co.uk domain
66.249.66.129 - - [16/Jun/2010:03:59:08 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 404 3881 "ref=-" "ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

similar on other domains .fr, .at, .de, .....

the thing is, these requests (which are still ongoing, at intervals of 10 minutes or more on some domains and about once a minute on others) began shortly before google stopped crawling these sites, and intensified at the exact moment when googlebot stopped crawling completely. (the crawling stop is still ongoing.)

we verified that these IPs are actual googlebot IPs.
there are no "malware alerts" in google webmaster tools visible on any of these domains.
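
The post does not say how the IPs were verified. One commonly documented approach is a reverse DNS lookup followed by a forward lookup of the returned hostname. A minimal Python sketch, assuming that approach; the function name `is_verified_googlebot` and the injectable `reverse_lookup` / `forward_lookup` parameters are made up for illustration:

```python
import socket

# Hostname suffixes assumed to identify Google crawlers in reverse DNS.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve ip, check the hostname suffix, then forward-resolve and compare.

    The two lookup callables default to the standard socket functions and can be
    replaced with fakes when testing without network access.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
    except OSError:
        # No PTR record or resolver failure: treat as not verified.
        return False
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The hostname must resolve back to the same address.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

The forward lookup matters because anyone who controls the reverse DNS for their own address range can make it return a Google-looking hostname.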

we believe that these requests
"GET /ticker/MBISetup.exe HTTP/1.1"
are googlebot checking for malware (which is not on our site), but something went wrong and googlebot sees our site as malware - without flagging it?

this is very strange behavior and we see it over multiple domains (different servers, different datacenters, different codebase, different companies).

it would be great if somebody could check their logfiles for similar requests - or has an explanation for what is going on.
9:36 am on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member



3PM June 15th, Google crawling spiked and then stopped completely and traffic from Google was down substantially (95%+) except from Google images which remained the same. Many top pages were simply gone from the index. All returned to normal at 11pm June 15th.

No malware warnings, a complete scan of the source code shows nothing out of the norm, all file last updated dates are accurate.

very weird.

Perhaps we're all asking the wrong question? "Why is traffic down?" should be replaced with "Where is the traffic going?", since Google isn't seeing fewer searches. Knowing where all the traffic that sites, even WW, have lost is going would be eye-opening, I have no doubt.
9:49 am on Jun 16, 2010 (gmt 0)



I got 50 googlebot visits, down from over 1000 normally
10:00 am on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As I don't have access to raw logs on the current site I look after, I don't know when G crawls or does not. It is surprising how much less stressed my life has become.

When I make changes, I note when rankings change and then check the cache to confirm G has the latest page.

Much less stressful!
10:08 am on Jun 16, 2010 (gmt 0)



@Mark_A i would agree if it were a single-domain issue, but as it affects multiple domains it is something to investigate - and as we are talking about such a major change in the wrong direction - something to worry about.
12:37 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Same here, daily average for the last 7 days was 20,000 gbot hits. Yesterday was 150 total gbot hits. Site is a US based site so I don't think this is a European only event.
12:39 pm on Jun 16, 2010 (gmt 0)



To add: My site is not European
1:54 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



seoisabusiness I am not seeing any malware related queries from gbot. I just grepped out our logs for multiple sites. I am seeing almost a complete drop in gbot activity for many sites on many ips. Some connected, some standalone and some under different whois.

Traffic is stable, no drops. Just a seizing up almost completely of gbot. Hopefully this is just a technical crawling glitch on Google's part.
1:55 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Walkman, anything strange in the gbot queries that you are still getting?
2:15 pm on Jun 16, 2010 (gmt 0)



i confirmed one more domain which saw a similar drop in googlebot activity (from about 15,000 requests per hour to now near zero) in the same timespan, no malware related queries, though. someone suggested that googlebot is making out with bingbot - i doubt this theory, but as a matter of fact it is currently as good as any other theory....
2:35 pm on Jun 16, 2010 (gmt 0)

5+ Year Member



Same here everyone... we used to get over 15000 google bot hits and now it's down to around 1000 a day. We are a tourism site and our server is in the US.
2:46 pm on Jun 16, 2010 (gmt 0)

10+ Year Member



Anyone know what the no-text62... means after "GET /city_widget_list/no-text6268299191049925497 HTTP/1.1" 404 256 "-" "Googlebot-Image/1.0"

The crawling from Gbot has been weird. It gets one random page at a time, then goes away and comes back an hour or two later, gets another and then goes away... and repeats like that all day.
3:00 pm on Jun 16, 2010 (gmt 0)



gatito,

My site is similar.

Details here: [webmasterworld.com...]

Since June 08, 2010 (New York time zone, -4)

Hosted in the US, multi-language content.
3:12 pm on Jun 16, 2010 (gmt 0)



found a similar but slightly different occurrence of a malware request before crawling stopped on one of the domains: instead of /ticker/MBISetup.exe, /s/downloads/Atlantis_Patch.exe was requested.

66.249.65.78 - - [15/Jun/2010:03:11:08 +0200] GET /s/downloads/Atlantis_Patch.exe HTTP/1.1 301 356 ref=- ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
66.249.65.110 - - [15/Jun/2010:03:11:08 +0200] GET /s/downloads/atlantis_patch.exe HTTP/1.1 410 11805 ref=- ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

google requested /s/downloads/Atlantis_Patch.exe, the server - trained to lowercase all URLs - returned an HTTP 301 to the lowercase URL, then the HTTP 410 followed. the crawling drop followed shortly afterwards. whether it's related or not, i don't know - but it's very suspicious
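
To make the sequence in the two log lines easier to follow, here is a small Python sketch of a lowercase-canonicalizing server answering a request for a file that does not exist. It is a simplified model under assumptions, not the poster's actual server code; `respond`, `follow`, `known_paths` and `gone_status` are hypothetical names.

```python
def respond(path, known_paths, gone_status=404):
    """Model one response: redirect mixed-case paths, then serve or reject.

    Returns (status, location); location is None unless the status is 301.
    """
    lowered = path.lower()
    if path != lowered:
        # Canonicalization runs first, so even a nonexistent file gets a 301.
        return 301, lowered
    if path in known_paths:
        return 200, None
    return gone_status, None


def follow(path, known_paths, gone_status=404, max_hops=5):
    """Follow redirects like a crawler would and record (path, status) per hop."""
    chain = []
    for _ in range(max_hops):
        status, location = respond(path, known_paths, gone_status)
        chain.append((path, status))
        if location is None:
            break
        path = location
    return chain
```

In this model, `follow("/s/downloads/Atlantis_Patch.exe", set(), gone_status=410)` yields a 301 for the mixed-case URL followed by a 410 for the lowercase one, which matches the pair of log lines above. Whether a crawler treats "301, then error" differently from an immediate error is not something this thread establishes.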
4:13 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



seoisabusiness, we have the exact same thing. Dropped from hundreds to thousands of gbot requests per hour to literally 1-5 requests per hour.

If anyone from Google is reading this thread, it would be nice if they could let us know whether this is technical in nature and just a temporary hiccup in crawling.

Server and sites come back perfect for DNS, other bots crawling yadayada.

Hmm.
4:24 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What does your webmaster tools crawl rate say? Or is it too early?

I suddenly see a lot of "redirect" errors on pages that do work... very strange!
4:35 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Too early for WMT, Pontifex.

This has to be some form of technical problem on Google's end. The biggest site we have that this has happened to is pr7 with half a million backlinks. 50 total gbot visits today so far, down from the usual 1000-10000 by this point in the day.

Seeing some others have the same problem gives me both relief and anxiety:)
4:41 pm on Jun 16, 2010 (gmt 0)



Hi drall

Did you grep your logfiles for .exe googlebot requests - if you could confirm them then I think we have found a smoking gun
4:43 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Checking right now, brb. 1-2 gig daily raw logs take a few to grep heh.
4:48 pm on Jun 16, 2010 (gmt 0)

5+ Year Member



I guess the lower threshold for the crawl rate is defined as "crawled pages/day", not "crawled pages/hour"!
4:52 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



nothing seoisabusiness, not one instance of that for the last 3 days.
4:58 pm on Jun 16, 2010 (gmt 0)



Hmmm... time for the next theory, anyone ?
5:08 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just don't see anything wrong, it's like the switch got flipped early morning on the 15th around 1am and gbot just ran out of gas. Mediapartners bot and image bot are chugging along like normal.
6:02 pm on Jun 16, 2010 (gmt 0)

5+ Year Member



Also remember that ALL the Google spiders crawl content for the "googlebot", so please check AdsBot (the other major Google bot) if you are buying AdWords, and then check Mediapartners if you are using AdSense on the pages. There are several Googlebots and other bot names used by Google.

These bots' crawls are then used by googlebot to index content, so googlebot may not be crawling your site as much because AdsBot and the other Google bots have already done the job.

So check the tool you are using to make sure it's monitoring all the Google crawls, as they are all the same and often should be treated the same within your reporting tools.
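
If your reporting tool does not break this out, a rough Python sketch like the following can count hits per day for each Google crawler straight from the raw log. It assumes Apache-style timestamps such as `[16/Jun/2010:07:15:42 +0200]` and matches on the user-agent tokens `Googlebot`, `Googlebot-Image`, `Mediapartners-Google` and `AdsBot-Google`; the function names `daily_counts` and `percent_drop` are illustrative.

```python
import re
from collections import defaultdict

# Order matters: "Googlebot-Image" must be tested before the shorter "Googlebot".
GOOGLE_TOKENS = ("Googlebot-Image", "Mediapartners-Google", "AdsBot-Google", "Googlebot")

# Captures the day part of a timestamp like [16/Jun/2010:07:15:42 +0200].
DAY_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def daily_counts(lines):
    """Return {day: {token: hits}} for log lines mentioning a Google crawler token."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        day = DAY_RE.search(line)
        if not day:
            continue
        for token in GOOGLE_TOKENS:
            if token in line:
                counts[day.group(1)][token] += 1
                break  # count each line once, under the first matching token
    return counts


def percent_drop(before, after):
    """Percentage decrease from before to after; before must be non-zero."""
    return 100.0 * (before - after) / before
```

For the figures quoted earlier in the thread (about 280,000 pages one day versus 1,600 so far the next), `percent_drop(280000, 1600)` comes out at roughly 99.4. Comparing the per-token counts across days shows whether only Googlebot went quiet or the other Google crawlers did too.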
6:13 pm on Jun 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I started a thread here the other day called "Where's the Caffeine?". If anything google is less fresh than before when I look at the cache dates. I have a pagerank 6 site with millions of pageviews per month with the homepage showing a 10 day old cache date, and even worse on internal pages.

However, my traffic from google is up since the Mayday update, especially in the last week or so. I am not making a lot of sense out of this, but do hope google can get back on track with the freshness. We put a lot of work into updating the site each week.