

Google SEO News and Discussion Forum

This 57 message thread spans 2 pages (this is page 2).
Google stopped crawling many sites Jun 15 AM
gatito




msg:4152974
 3:34 pm on Jun 15, 2010 (gmt 0)

Hey,

does anyone have crawling issues starting this morning at around 4:00 to 4:30 (Central European Time Zone)?

I checked the crawling of big sites in different verticals, all mainly in the European market. On all sites, crawling by Google decreased by around 98%.

Thank you

 

drall




msg:4153772
 7:19 pm on Jun 16, 2010 (gmt 0)

Seems to be pretty widespread

[twitter.com...]

If you scroll way back to yesterday morning, I can see the first confirmations of this starting in the exact same timeframe as ours.

Several threads are ongoing over at Google's support forum as well.

Plex hasn't commented yet as far as I can see.

[edited by: drall at 7:26 pm (utc) on Jun 16, 2010]

Lame_Wolf




msg:4153773
 7:25 pm on Jun 16, 2010 (gmt 0)

Personally, I am not bothered. All my pages were cached ages ago. There have been no changes to the site. If they ain't crawling, then I am saving BW :)

drall




msg:4153791
 8:18 pm on Jun 16, 2010 (gmt 0)

OK, now I am getting really weirded out. I thought, hmm, maybe they are ramping up mediabot as the primary crawler, as was mentioned. Since we have AdSense on the sites, this would make sense.

I go and grep out all of the mediapartners-google lines in my log and see that a few of the tens of thousands are on obviously different IP addies.

I trace them back and they lead to Microsoft Corp?

Microsoft is spoofing Mediapartners-Google? The crazy drop in indexing started less than 1 hour later. I see how your friend came up with Bingbot-Googlebot going out on a date. This is really getting strange now.

These IP addies are showing in my logfiles labeled as Mediapartners-Google

65.55.218.36
65.55.215.168

Less than 1 hour later almost all regular GBOT activity dies.
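
For anyone wanting to repeat the log check described above, here is a minimal Python sketch, not the exact commands used. It assumes the common Apache/nginx "combined" log format (client IP first, user agent as the last quoted field). The function names, the sample lines, and the "66.249." prefix used as the expected Google range are illustrative assumptions, so substitute whatever ranges you normally see in your own logs.

```python
import re
from collections import Counter

# Assumption: "combined" log format, i.e. the client IP is the first field
# and the user agent is the last double-quoted field on the line.
LINE_RE = re.compile(r'^(\S+) .*"([^"]*)"\s*$')

# Hypothetical sample lines; only 65.55.218.36 comes from the post above.
SAMPLE = [
    '66.249.71.10 - - [15/Jun/2010:03:10:00 +0200] "GET / HTTP/1.1" 200 512 "-" "Mediapartners-Google"',
    '65.55.218.36 - - [15/Jun/2010:03:12:00 +0200] "GET / HTTP/1.1" 200 512 "-" "Mediapartners-Google"',
    '10.0.0.5 - - [15/Jun/2010:03:13:00 +0200] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]


def ips_for_agent(lines, token="mediapartners-google"):
    """Count requests per client IP whose user agent contains `token`."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and token in m.group(2).lower():
            counts[m.group(1)] += 1
    return counts


def outside_prefixes(counts, expected_prefixes):
    """Return the IPs that match none of the expected prefixes (plain string match)."""
    return sorted(
        ip for ip in counts
        if not any(ip.startswith(p) for p in expected_prefixes)
    )


if __name__ == "__main__":
    counts = ips_for_agent(SAMPLE)
    # "66.249." is an assumed prefix for illustration only.
    print(outside_prefixes(counts, ["66.249."]))
```

The prefix match only flags candidates. A user agent string proves nothing by itself, so any IP it surfaces still needs a whois or reverse DNS lookup before drawing conclusions.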

Sgt_Kickaxe




msg:4153809
 9:23 pm on Jun 16, 2010 (gmt 0)

Don't be worried drall, just observe. Observation makes good science of things.

More observations from my site
~ crawling is non existent right now save for the index page.
~ google cache of the index page is outdated by a week.
~ My top content is still ranked well and getting search visitors, less important content is getting none and rankings are gone for those.
~ The site isn't banned or de-indexed.

Eliminating what makes no sense leaves what must be going on even if that too makes no sense.

Right now I think a good majority of websites are getting "cherry picked" in that their best content is returning in search like always but how many more pages are being ranked well depends on site SIZE. Auto generated content doesn't count towards site size right now.
It seems to be percentage based, if size is deemed on topic, related, quality enough your site may be entitled to x% more top ranked pages. Sites like WW that are full of such content get more longtail traffic but not as much as before because the % is smaller than before. Crappy sites or small sites get their "BASE" quota and nothing more, they've lost the most longtail traffic.

What bothers me about this, if it's true, is that mashup sites also get their quota and it comes at the expense of the sites they scrape.

Gone are the days of auto-generated top rankings and going with 500 thin sites vs one good one, a good thing.

Speculations - nothing more. Incoming links overpower anything, work on those (carefully, lower quality pages could start replacing your best stuff if a quota is in play), link out freely and consider doing it WITHOUT nofollow tags since those are Google only. Relying solely on Google, well, look at the mess that's created.

drall




msg:4153834
 10:18 pm on Jun 16, 2010 (gmt 0)

Oh, I'm not worried, Google put me on Xanax years ago :)

Seeing a small sputtering to life in the last hour. I think all the posts over at Google's forums and here got someone to look into it.

I still don't understand how mediabot is coming from a Microsoft IP.

J_RaD




msg:4153835
 10:23 pm on Jun 16, 2010 (gmt 0)

my crawls are down 50%

dstiles




msg:4153837
 10:24 pm on Jun 16, 2010 (gmt 0)

How about this explanation?

Google security has banned the use of Windows.

No one told them that googlebot is run from Windows machines.

Google bot-herders discovered bots weren't working and asked MS to cover for them.

:) :) :)

Well, it's as daft as a lot of things google is doing nowadays.

1EightT




msg:4153841
 10:29 pm on Jun 16, 2010 (gmt 0)

This Caffeine rollout has been interesting. I heard from many customers that crawling had come to almost a complete stop yesterday (and was down quite a bit today for most of them). I, on the other hand, had several hours where my large site was getting 100K hits or more per hour from gbot. This is by far a new record for hits/hour from them, so I'm still up in the air as to what is going on.

manof




msg:4153863
 11:09 pm on Jun 16, 2010 (gmt 0)

Me too.

I've posted here: [webmasterworld.com...]

My site was launched 4 months ago, and in the last 15 days Googlebot had been crawling 300,000-500,000 pages every day.

But since 08/06/2010 Googlebot usually crawls only my homepage, every 10-20 minutes:

"GET / HTTP/1.1" 200

Occasionally, Googlebot repeatedly crawls 3-4 other pages, such as:

............
"GET / HTTP/1.1" 200
"GET /my_*_*_*_mori HTTP/1.1" 200
"GET / HTTP/1.1" 200
............
............
............
"GET / HTTP/1.1" 200
"GET /my_*_*_*_mori HTTP/1.1" 200
"GET / HTTP/1.1" 200
............
............
............
"GET / HTTP/1.1" 200
"GET /my_*_*_*_mori HTTP/1.1" 200
"GET / HTTP/1.1" 200
............


Googlebot has also been continuously requesting some pages that obviously do not belong to my site:

"GET /include/setup.exe HTTP/1.1" 404
"GET /author/*_klayiv_*/pisma_*/download.*.prc.zip HTTP/1.1" 404
"GET /usenext/1213093/*+*+5.0.15+*+Full+Version.exe.html HTTP/1.1" 404
"GET /download/projects/vcpp/*_screen.zip HTTP/1.1" 404

Then I changed my site's IP. Googlebot fetched the robots.txt, but it continues to crawl only my site's home page:

"GET /robots.txt HTTP/1.1" 200
"GET / HTTP/1.1" 200
............
............
............


I've checked Google Webmaster Tools and did not find any abnormality.

Google indexing has dropped somewhat, the number of pages indexed in the last 24 hours is zero, and my site's traffic has dropped by 1/3.

When I search for my site name in Google, it is first.

When I search "site:example.com" in Google, the domain root, example.com, is not the first result.

Marvin Hlavac




msg:4153910
 1:08 am on Jun 17, 2010 (gmt 0)

After about 40 hours of Google crawling at about 2% of the usual volume on my hobby site, I suddenly see the crawl rate returning to its normal levels. Hopefully it will stay.

Is anyone else seeing the same?

Jeffhyde




msg:4153916
 1:33 am on Jun 17, 2010 (gmt 0)

Before any of these Google changes, my site was ranking #1 for the majority of our major keywords for months. We slowly moved to #12, and now we have been at #16 on our major keyword for the past two or three weeks.

So I made some changes to the content, and I have noticed this lag in Googlebot activity as well. Our main landing page has not been cached since Jun 02. We have been waiting for 14 days but there have been no updates. I have no clue what is going on.

nethead




msg:4153921
 1:47 am on Jun 17, 2010 (gmt 0)

Hey guys, we have been seeing pages drop on two of our sites for the last 60 days. We are down to 20,000 pages in Google, from 300k 3 months ago. Has anyone else seen this happen recently?

tedster




msg:4153932
 2:34 am on Jun 17, 2010 (gmt 0)

Yes, I've been seeing numbers like that on some sites. But when I dig into the data it doesn't hold water. URLs that are not included as indexed anymore are still getting search traffic! I think it's a data bug and not a reality.

Brett_Tabke




msg:4153960
 3:52 am on Jun 17, 2010 (gmt 0)

We normally run 50-75k page views a day from gbot here. Two days ago it was 5k. Today 35k.
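
Daily figures like these can be pulled from raw access logs with a short script. The following is only a sketch under stated assumptions: it expects "combined"-style timestamps such as [15/Jun/2010:04:12:01 +0200], it counts any line containing "googlebot" (so Googlebot-Image and Googlebot-Mobile are included, while Mediapartners-Google is not), and the function names are made up for illustration. It does not check whether the requests really came from Google.

```python
import re
from collections import Counter

# Assumption: timestamps look like [15/Jun/2010:04:12:01 +0200].
DAY_RE = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}):')


def googlebot_hits_per_day(lines):
    """Count log lines per day that mention Googlebot (case-insensitive)."""
    counts = Counter()
    for line in lines:
        if "googlebot" not in line.lower():
            continue
        m = DAY_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def percent_change(baseline, current):
    """Signed percentage change from baseline to current."""
    return (current - baseline) * 100.0 / baseline


if __name__ == "__main__":
    # Using the figures quoted above: 5k against the 50k lower bound.
    print(percent_change(50000, 5000))  # -90.0
```

Against the 75k upper bound the same 5k day works out to a drop of roughly 93%.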

Sgt_Kickaxe




msg:4153961
 3:53 am on Jun 17, 2010 (gmt 0)

The google cache version of one of my index pages was 7 days old this morning, now it's dated May 1st. I'm going in the wrong direction here, lol. Google's making quite a mess of things.

I agree with that last post tedster but I will caution that it depends on time of day and day of week. Midnight to 3 am I'm 9999 with no traffic for some of my best keywords and then top of page 1 for the next few hours with associated traffic... My 3rd party tracking graphs show a strong ON/OFF pattern. My top keywords however, steady as a brick (for now).

edit: Brett, we posted at the same time, those numbers are eye opening! I have to tell you my faith in what Google is doing has never been lower, I'm not just saying that either, a mashup site is now one of my toughest competitors and it's not even their content!

ayalon




msg:4154079
 8:32 am on Jun 17, 2010 (gmt 0)

It looks like everything is back to normal. At least for all my sites, I see regular Googlebot traffic again from today...

seoisabusiness




msg:4154085
 8:47 am on Jun 17, 2010 (gmt 0)

hi first of all

it seems like GOOGLEBOT IS BACK - please confirm. today, 17th June 2010, at 01:00am (+/- 1 hour for the different domains) CET (Central European Time), googlebot showed up again.

it is crawling at the same rate as before the outage, verified over 12 domains.

the only strange thing is that the IP of some (only some) of these requests is 109.169.26.7, which does not verify as googlebot with this method [google.com...] - these could be scrapers.

also, we found out that the remaining one percent of googlebot requests during the downtime were in fact either googlebot mobile, googlebot image, the googlebot that fetches the sitemap.xml, or googlebot requesting the home page. normal googlebot activity was actually 0.

but hey, googlebot is now back - i like.

this was the first webmaster world thread i actually posted in actively - nice experience, but
- i don't think it makes sense to mix up crawling with indexing with caching with search-referred traffic with ranking with caffeine discussions - these are all interesting aspects of the google economy, but they are measured differently (or not at all, like caffeine) and so should be discussed separately. just my 2 c.

does anybody else see 109.169.26.7 requests (a few hundred per hour) with googlebot headers?
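
The link to the verification method is truncated above. Assuming it refers to the reverse-then-forward DNS check that Google documents for Googlebot, a minimal Python sketch could look like the one below. The function name and the injectable `reverse`/`forward` parameters are hypothetical (they exist only so the logic can be exercised without live DNS), and the forward lookup used here is IPv4 only.

```python
import socket


def is_verified_googlebot(ip, reverse=None, forward=None):
    """Return True only if `ip` passes a reverse-then-forward DNS check.

    `reverse` and `forward` are optional hooks (an illustrative assumption)
    that default to live lookups via the standard socket module.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])

    # Step 1: reverse lookup of the IP to a host name.
    try:
        host = reverse(ip)
    except OSError:
        return False

    # Step 2: the host name must sit under googlebot.com or google.com.
    name = host.lower().rstrip(".")
    if not name.endswith((".googlebot.com", ".google.com")):
        return False

    # Step 3: forward lookup of that name must return the original IP.
    try:
        return ip in forward(host)
    except OSError:
        return False
```

Calling is_verified_googlebot("109.169.26.7") with the defaults performs live DNS lookups, so the result depends on what the resolver returns at the time. An IP that fails either step should be treated as an impostor regardless of its user agent header.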

Sgt_Kickaxe




msg:4154147
 11:33 am on Jun 17, 2010 (gmt 0)

Back to normal - since 12:01 am until now anyway, the old average for that time period is back.

drall




msg:4154165
 12:40 pm on Jun 17, 2010 (gmt 0)

Back to normal it seems since 1am.

hugh




msg:4154169
 1:09 pm on Jun 17, 2010 (gmt 0)

Isn't a drop in individual crawl rates expected with their new emphasis on lots more crawling in parallel?

fabulousyarn




msg:4154186
 1:58 pm on Jun 17, 2010 (gmt 0)

I'm seeing the same thing in the cache aging - it's been 14 days for us, and we work hard at new content, as well!

judy

nethead




msg:4154190
 2:07 pm on Jun 17, 2010 (gmt 0)

seems like googlebot is back...had about 5000 hits in the last 8 hours.

manof




msg:4154568
 4:33 am on Jun 18, 2010 (gmt 0)

My site has not returned to normal.

nethead




msg:4154599
 6:26 am on Jun 18, 2010 (gmt 0)

We had a good start but ours seems to be slower yet again. On our main site we had about 5000 hits at night, but after 8am (PST) we only had 3000, which is pretty low. Seems like it's going away or working part time... Yahoo had over 15000, and our norm used to be around 15000 from Google while Yahoo was only around 8000. Seems like Yahoo is picking up where Google is slacking :)

Mark_A




msg:4154662
 9:28 am on Jun 18, 2010 (gmt 0)

I don't have raw logs to confirm details, but we put changes online on Monday and this morning (Friday) google's cache for our pages has changed and the resultant rankings have also changed.

internetheaven




msg:4155198
 10:22 am on Jun 19, 2010 (gmt 0)

Flat zero crawl on many UK sites from the beginning of June.

Many new sites still not indexed and new pages on old sites (PR4-6's with thousands of backlinks each) not being indexed either.

Of course, the new twitter accounts for all those new sites have been indexed ... and the MFAs we created have been indexed fine too ... Caffeine seems to be faster indexing of web noise.

g1smd




msg:4155261
 2:45 pm on Jun 19, 2010 (gmt 0)

"The google cache version of one of my index pages was 7 days old this morning, now it's dated May 1st."

I have seen that happen many times over the last few years. Google reverts to older data for a short while just before the latest crawl data appears. It is likely you'll show a brand new, less-than-24-hours-old cache date again in just a few days' time.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved