Forum Moderators: Robert Charlton & aakk9999 & andy langton & goodroi

Google stopped crawling many sites Jun 15 AM

     
3:34 pm on Jun 15, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:1
votes: 0


Hey,

does anyone have crawling issues starting this morning at around 4:00 to 4:30 (Central European Time Zone)?

I checked the crawling of big sites in different verticals, all mainly in the European market. On all sites, crawling by Google decreased by around 98%.

Thank you
7:28 pm on June 15, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Feb 24, 2008
posts:93
votes: 0


My information site (forum) is hosted in the U.S., and since this morning (June 15th) my Google crawling rate has dropped to about 1% of the usual volume. I have never seen this in two and a half years of running this site. The traffic volume doesn't seem to be affected though.
7:39 pm on June 15, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


hi @scottsonline, can you confirm that you see it, too?

i see it on multiple domains in different verticals, hosted on different domains, different networks, with different code bases.

it started between 4:00 and 4:30 european time zone. before this total drop (up to minus 98%) there was a period of medium activity (but not that unusual).

a first cool step would be that we confirm multiple sightings of the same phenomenon so that we can exclude single-site penalties.
11:08 pm on June 15, 2010 (gmt 0)

New User

5+ Year Member

joined:Feb 20, 2009
posts:4
votes: 0


I'm seeing this also; yesterday GB crawled approx 280,000 pages, so far today 1,600 pages.
9:00 am on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


ok, high

please check your logfiles for googlebot requests for MBISetup.exe (or more generally .exe stuff)

66.249.66.84 - - [16/Jun/2010:07:15:42 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 301 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

and what you return to such requests (what status code?).
also look for other "malware googlebot check requests" from google and what HTTP status you return for them.

this is a possible lead as there is a major time overlap.
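
For anyone who wants to run this check without eyeballing the logs, here is a minimal sketch in Python. It assumes the Apache/nginx "combined" log format seen in the sample line above; the function name `googlebot_exe_requests` is made up for illustration, and it only matches on the user-agent string, so it does not prove the requests really came from Google.

```python
import re

# Apache/nginx "combined" log format (assumed; adjust if your format differs).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_exe_requests(lines):
    """Yield (time, ip, path, status) for Googlebot requests ending in .exe."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # line is not in the assumed format
        if "Googlebot" not in m.group("agent"):
            continue  # user-agent match only; says nothing about the real sender
        # Drop any query string and compare case-insensitively.
        path = m.group("path").split("?", 1)[0]
        if path.lower().endswith(".exe"):
            yield (m.group("time"), m.group("ip"),
                   m.group("path"), int(m.group("status")))


# Demo on the log line quoted in this post.
sample = [
    '66.249.66.84 - - [16/Jun/2010:07:15:42 +0200] '
    '"GET /ticker/MBISetup.exe HTTP/1.1" 301 99 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
for hit in googlebot_exe_requests(sample):
    print(hit)
```

To scan a real log, pass an open file object instead of `sample`; the function reads line by line, so large daily logs do not need to fit in memory.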
9:35 am on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


we found something strange, it's our first lead

what we now found is

66.249.66.88 - - [16/Jun/2010:08:04:04 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 301 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.244 - - [16/Jun/2010:08:04:05 +0200] "GET /en/MBISetup.exe HTTP/1.1" 301 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.88 - - [16/Jun/2010:08:04:05 +0200] "GET /en/mbisetup.exe HTTP/1.1" 404 1236 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

for explanation: the /ticker/MBISetup.exe request is the original one, the others are follow-up requests to the redirects issued by our "URL canonicalization" logic.

on the other domains:

a .com domain
66.249.66.4 - - [16/Jun/2010:03:25:26 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 404 4258 "ref=-" "ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

on a .co.uk domain
66.249.66.129 - - [16/Jun/2010:03:59:08 +0200] "GET /ticker/MBISetup.exe HTTP/1.1" 404 3881 "ref=-" "ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

similar on other domains .fr, .at, .de, .....

the thing is, these requests (which recur every 10 minutes or more on some domains, every minute on some others) began shortly before google stopped crawling these sites, and intensified at the exact moment when googlebot stopped crawling completely. (the crawling stop is still ongoing.)

we verified that these IPs are actual googlebot IPs.
there are no "malware alerts" in google webmaster tools visible on any of these domains.
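
One common way to verify a Googlebot IP is a reverse DNS lookup followed by a forward lookup of the returned hostname. The sketch below illustrates that check; it is not necessarily how the IPs in this thread were verified. The function name `is_googlebot_ip` and the injectable lookup parameters are assumptions added so the logic can be tried without network access.

```python
import socket

# Hostname suffixes accepted for Google crawlers (assumption for this sketch).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_googlebot_ip(ip,
                    reverse_lookup=socket.gethostbyaddr,
                    forward_lookup=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google host that resolves back to ip."""
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        hostname = reverse_lookup(ip)[0].rstrip(".").lower()
    except OSError:
        return False  # no PTR record or lookup failure
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False  # a user-agent string alone can be faked; the PTR must match
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    # The forward lookup must point back at the original IP.
    return ip in addresses
```

Calling `is_googlebot_ip("66.249.66.84")` with the default arguments performs live DNS queries, so the result depends on the resolver you are using.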

we believe that these requests
"GET /ticker/MBISetup.exe HTTP/1.1"
are googlebot checking for malware (which is not on our site) but something went wrong and googlebot sees our site as malware - without flagging it?

this is very strange behavior and we see it over multiple domains (different servers, different datacenters, different codebase, different companies).

would be great if somebody checks their logfiles for similar requests - or has an explanation what is going on.
9:36 am on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member

joined:Apr 14, 2010
posts:3169
votes: 0


3PM June 15th, Google crawling spiked and then stopped completely and traffic from Google was down substantially (95%+) except from Google images which remained the same. Many top pages were simply gone from the index. All returned to normal at 11pm June 15th.

No malware warnings, a complete scan of the source code shows nothing out of the norm, all file last updated dates are accurate.

very weird.

Perhaps we're all asking the wrong questions? "Why is traffic down?" should be replaced with "Where is traffic going?", since Google isn't experiencing fewer searches. Knowing where all the traffic that sites (even WW) lost is going would be eye-opening, I have no doubt.
9:49 am on June 16, 2010 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


I got 50 googlebot visits, down from over 1000 normally
10:00 am on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 15, 2001
posts:1436
votes: 0


As I don't have access to raw logs on the current site I look after, I don't know when G crawls or does not. It is surprising how much less stressed my life has become.

When I make changes, I note when rankings change and then check the cache to confirm G has the latest page.

Much less stressful!
10:08 am on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


@Mark_A i would agree if it were a single-domain issue, but as it's multifold and multidomain it is something to investigate - and as we are talking about such a major change in the wrong direction - something to worry about.
12:37 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


Same here, daily average for the last 7 days was 20,000 gbot hits. Yesterday was 150 total gbot hits. Site is a US based site so I don't think this is a European only event.
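
To put numbers like these side by side, a rough sketch that counts main-crawler Googlebot hits per day from an access log and computes the percentage drop. It assumes a combined-format log with timestamps such as `[16/Jun/2010:07:15:42 +0200]`; the function names are illustrative only, and days are counted in the server's local time as written in the log.

```python
import re
from collections import Counter
from datetime import datetime

# Captures the date part of a timestamp like [16/Jun/2010:07:15:42 +0200].
DATE_IN_LOG = re.compile(r'\[(\d{2}/\w{3}/\d{4}):')


def googlebot_hits_per_day(lines):
    """Count hits per calendar day for user agents containing 'Googlebot/'."""
    counts = Counter()
    for line in lines:
        # 'Googlebot/' matches the main crawler but not 'Googlebot-Image/1.0'.
        if "Googlebot/" not in line:
            continue
        m = DATE_IN_LOG.search(line)
        if m:
            # %b expects English month abbreviations under the default C locale.
            day = datetime.strptime(m.group(1), "%d/%b/%Y").date()
            counts[day] += 1
    return counts


def percent_drop(baseline, current):
    """Percentage decrease from baseline to current."""
    return 100.0 * (baseline - current) / baseline


# Figures from this post: 20,000 hits on an average day, 150 yesterday.
print(percent_drop(20000, 150))
```

With the figures above the drop works out to 99.25%, which is in line with the roughly 98-99% drops reported earlier in the thread.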
12:39 pm on June 16, 2010 (gmt 0)

Senior Member

joined:Dec 29, 2003
posts:5428
votes: 0


To add: My site is not European
1:54 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


seoisabusiness I am not seeing any malware related queries from gbot. I just grepped out our logs for multiple sites. I am seeing almost a complete drop in gbot activity for many sites on many ips. Some connected, some standalone and some under different whois.

Traffic is stable, no drops. Just an almost complete seizing up of gbot. Hopefully this is just a technical crawling glitch on Google's part.
1:55 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


Walkman, anything strange in the gbot queries that you are still getting?
2:15 pm on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


i confirmed one more domain which saw a similar drop in googlebot activity (from about 15,000 requests per hour to now near zero) in the same timespan, no malware related queries, though. someone suggested that googlebot is making out with bingbot - i doubt this theory, but as a matter of fact it is currently as good as any other theory....
2:35 pm on June 16, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 29, 2010
posts:104
votes: 0


Same here everyone... we used to get over 15000 google bot hits and now it's down to around 1000 a day. We are a tourism site and our server is in the US.
2:46 pm on June 16, 2010 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 1, 2003
posts:630
votes: 0


Anyone know what the no-text62... means after "GET /city_widget_list/no-text6268299191049925497 HTTP/1.1" 404 256 "-" "Googlebot-Image/1.0"

The crawling from Gbot has been weird. It gets one random page at a time, then goes away and comes back an hour or two later, gets another and then goes away... and repeats like that all day.
3:00 pm on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 14, 2010
posts:12
votes: 0


gatito,

My site is similar.

Detail in here [webmasterworld.com...]

From June 08, 2010 (New York time zone, UTC-4)

Hosted in the US, multi-language content.
3:12 pm on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


found a similar but slightly different malware-style request on one of the domains before crawling stopped: instead of /ticker/MBISetup.exe, /s/downloads/Atlantis_Patch.exe was requested.

66.249.65.78 - - [15/Jun/2010:03:11:08 +0200] GET /s/downloads/Atlantis_Patch.exe HTTP/1.1 301 356 ref=- ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
66.249.65.110 - - [15/Jun/2010:03:11:08 +0200] GET /s/downloads/atlantis_patch.exe HTTP/1.1 410 11805 ref=- ua=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

google requested /s/downloads/Atlantis_Patch.exe; the server, which is set up to lowercase all URLs, returned an HTTP 301 to the lowercase URL, and then an HTTP 410 followed (as the second log line shows). the crawling drop followed shortly afterwards. whether it's related or not, i don't know - but it's very suspicious
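
To make the redirect chain easier to follow, here is a toy model in Python of a server that lowercases URLs before checking whether the file exists. It is purely hypothetical (the functions `respond` and `follow` and the path sets are invented for illustration, not taken from the actual server), but it shows why a probe for a mixed-case file that does not exist is answered with a 301 first and an error status only on the second request.

```python
def respond(path, live_paths, gone_paths=frozenset()):
    """Return (status, location) under a 'lowercase every URL' policy."""
    lowered = path.lower()
    if path != lowered:
        # Canonicalization happens before any existence check,
        # so even a nonexistent file gets a redirect first.
        return 301, lowered
    if path in live_paths:
        return 200, None
    if path in gone_paths:
        return 410, None
    return 404, None


def follow(path, live_paths, gone_paths=frozenset(), max_hops=5):
    """Follow redirects like a crawler would and return [(path, status), ...]."""
    chain = []
    for _ in range(max_hops):
        status, location = respond(path, live_paths, gone_paths)
        chain.append((path, status))
        if status != 301:
            break
        path = location
    return chain


# Hypothetical setup mirroring the two log lines above.
chain = follow(
    "/s/downloads/Atlantis_Patch.exe",
    live_paths=set(),
    gone_paths={"/s/downloads/atlantis_patch.exe"},
)
for hop in chain:
    print(hop)
```

In this model the crawler never sees an error on the URL it originally asked for, only on the redirect target. Whether that matters to whatever check Google was running is unknown.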
4:13 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


seoisabusiness, we have the exact same thing. Dropped from hundreds to thousands of gbot requests per hour to literally 1-5 requests per hour.

Would be nice if anyone from Google reading this thread could let us know whether this is technical in nature and just a temporary hiccup in crawling.

Server and sites come back perfect for DNS, other bots crawling yadayada.

Hmm.
4:24 pm on June 16, 2010 (gmt 0)

Senior Member from DE 

WebmasterWorld Senior Member 10+ Year Member

joined:May 25, 2002
posts:926
votes: 0


What does your webmaster tools crawl rate say? Or is it too early?

I suddenly see a lot of "redirect" errors on pages that do work... very strange!
4:35 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


Too early for WMT, Pontifex.

This has to be some form of technical problem on Google's end. The biggest site we have that this has happened to is PR7 with half a million backlinks. 50 total gbot visits today so far, down from the usual 1000-10000 by this point in the day.

Seeing some others have the same problem gives me both relief and anxiety:)
4:41 pm on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


Hi drall

Did you grep your logfiles for .exe googlebot requests? If you could confirm them, then I think we have found a smoking gun.
4:43 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


Checking right now, brb. 1-2 gig daily raw logs take a few to grep heh.
4:48 pm on June 16, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 2, 2010
posts:41
votes: 0


I guess the lower threshold value for the crawl rate is defined as "crawled pages/day", not "crawled pages/hour"!
4:52 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


nothing seoisabusiness, not one instance of that for the last 3 days.
4:58 pm on June 16, 2010 (gmt 0)

New User

5+ Year Member

joined:June 15, 2010
posts:10
votes: 0


Hmmm... time for the next theory, anyone ?
5:08 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 16, 2004
posts:854
votes: 0


I just don't see anything wrong, it's like the switch got flipped early morning on the 15th around 1am and gbot just ran out of gas. Mediapartners bot and image bot are chugging along like normal.
6:02 pm on June 16, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 16, 2009
posts:51
votes: 0


Also remember that ALL the Google spiders crawl content for the "googlebot", so please check AdsBot if you are buying AdWords (it is the other major Googlebot) and then check Mediapartners if you are using AdSense on the pages. There are several googlebots and other bot names used by Google.

These bots' crawls are then used by googlebot to index content, so googlebot may not be crawling your site as much because AdsBot and the other Google bots have already done the job.

So check the tool you are using to make sure it's monitoring all the Google crawls, as they are all the same and often should be treated the same within your reporting tools.
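
If your reporting tool only looks for "Googlebot", a quick way to see the other Google crawlers separately is to classify log entries by user-agent token. A minimal sketch follows. The tokens are the user-agent substrings these crawlers send; the labels and the function name `classify_google_crawler` are just illustrative.

```python
from collections import Counter

# (user-agent token, label). More specific tokens come first because
# "Googlebot-Image" also contains the substring "Googlebot".
GOOGLE_CRAWLERS = (
    ("Googlebot-Image", "Googlebot-Image"),
    ("Mediapartners-Google", "Mediapartners (AdSense)"),
    ("AdsBot-Google", "AdsBot (AdWords)"),
    ("Googlebot", "Googlebot"),
)


def classify_google_crawler(user_agent):
    """Return a label for known Google crawler user agents, else None."""
    for token, label in GOOGLE_CRAWLERS:
        if token in user_agent:
            return label
    return None


# Tally a few example user-agent strings.
agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Googlebot-Image/1.0",
    "Mediapartners-Google",
]
tally = Counter(classify_google_crawler(ua) for ua in agents)
print(tally)
```

Keeping the crawlers in separate buckets shows whether only the main Googlebot slowed down while the others carried on as usual.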
6:13 pm on June 16, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 28, 2001
posts:1380
votes: 0


I started a thread here the other day called "Where's the Caffeine?". If anything google is less fresh than before when I look at the cache dates. I have a pagerank 6 site with millions of pageviews per month with the homepage showing a 10 day old cache date, and even worse on internal pages.

However, my traffic from google is up since the Mayday update, especially in the last week or so. I am not making a lot of sense out of this, but do hope google can get back on track with the freshness. We put a lot of work into updating the site each week.
This 57 message thread spans 2 pages: 57