Around the end of August this year, my site went down, and a person at the hosting company told me the server was down because of a foreign spider. He said he had blocked the spider and the server was restored to normal operation.
This got me thinking about the Google problem, so I asked if he had blocked Googlebot. To my astonishment, the hosting company was blocking Googlebot because it had caused server performance issues. This completely shocked me. I've never heard of a hosting company blocking a spider so important to most, if not all, of the 200 sites on this server. Their thinking is that it's best to have the server up and running... apparently at the expense of killing half (or more) of the traffic going to all of these sites.
My site was moved to a new server, and within days, it was back in Google's index. Almost a year of lost traffic and sales resulted because of the hosting company.
If you are having trouble getting into Google's index, or have fallen out for some unknown reason, you might want to check with your hosting company to see whether they are blocking spiders. I know there are at least 199 other web sites blocked from Google because of this hosting company.
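If you want a quick way to test for this yourself, here is a minimal sketch (my own, not anything the hosting company provided): it requests a page twice, once with a browser User-Agent and once with Googlebot's, and compares the status codes. A 403 on the Googlebot request alone points to User-Agent filtering; it will not catch IP-based blocks, and the URL is hypothetical.

from urllib.request import Request, urlopen
from urllib.error import HTTPError

URL = "http://www.example.com/index.cfm"  # hypothetical; use a page of your own

def status_for(user_agent):
    # Return the HTTP status code the server sends for this User-Agent.
    req = Request(URL, headers={"User-Agent": user_agent})
    try:
        with urlopen(req) as resp:
            return resp.getcode()
    except HTTPError as err:  # 4xx/5xx raise, but the code is still what we want
        return err.code

print("Browser UA:  ", status_for("Mozilla/5.0"))
print("Googlebot UA:", status_for("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))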
We knew the importance of Google, of course... so we rebooted the server whenever the load got too heavy... this cleared the millions of shopping carts... but did not kick Googlebot out.
It was a little surprising, since this is the first month we have ever had a problem.
JP
Although I absolutely agree that it's partly your own fault if you don't notice it, I would blame my host for blocking bots without letting me and others know about it.
Fortunately I have my own dedicated network.
>follow any of the links that add items to a cart
erm, yes that'd be silly if you'd allow this. ;)
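For anyone whose cart links are reachable, a couple of robots.txt lines will keep well-behaved spiders like Googlebot away from them. These paths are made up for illustration; substitute whatever your cart script actually uses:

User-agent: *
Disallow: /cart/
Disallow: /index.cfm?action=addtocart

That only helps against bots that honor robots.txt, of course, but Googlebot does.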
----------------------------------------
Address: crawl16.googlebot.com
Browser: Googlebot 2.1 (http://www.googlebot.com/bot.html)
Protocol: HTTP/1.0
GET 20.75k /index.cfm
Anyone reading this thread who can't access the raw log file on their server should start looking for a new hosting company now. If you were looking at the actual log file, you would have been able to see how the server responded to the request.
My guess is they were serving Google a 403 every time it requested a page.
2003-08-02 13:01:03 64.68.85.28 W3SVC69 80 GET /index.cfm - 200 21231 188 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
I just thought the WebTrends report was more legible.
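For anyone working from the raw log, here is a quick sketch (my own, nothing to do with WebTrends) that tallies the status codes served to Googlebot in a W3C-format IIS log. The field positions match the sample line above, where sc-status is the ninth field; adjust them to your server's #Fields header, and the file name is hypothetical.

from collections import Counter

statuses = Counter()
with open("ex030802.log") as log:          # hypothetical log file name
    for line in log:
        if line.startswith("#"):           # skip the W3C header lines
            continue
        fields = line.split()
        if any("Googlebot" in field for field in fields):
            statuses[fields[8]] += 1       # sc-status column in this layout

print(statuses)  # a pile of 403s means the server is turning Googlebot away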
I don't know exactly how they were blocking Googlebot, but they said they were. They moved the site to a different server (yes... same hosting company until after the end of the year... too busy now) and the site was back in Google's index.
I know nothing about server administration. Maybe someone here can explain how they could do this.
I mean, Googlebot is very active on our servers but does not cause any bandwidth issues, and if it did, I would rather cut down the number of users on the server in question than block our good friend! Now I have heard it all.. <G>
A junior tech at any hosting company without background info might be in a position to block any SE spider almost on a whim. Especially as there are plenty of spiders out there that need that sort of treatment.
So it is vitally important for webmasters to check their logs and make sure that spider visits are being responded to correctly.
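For instance, assuming the box runs Apache, a handful of lines like these in httpd.conf or an .htaccess file would be all it took: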
# Flag Googlebot by User-Agent, then deny it along with its IP range
SetEnvIf User-Agent ^Googlebot keep_out
Order Deny,Allow
Deny from 64.68.85
Deny from env=keep_out
and he may have put rules like those in .htaccess files in subdirectories as well.
Get a Linux dedicated server with plenty of storage space for your log files.