
Forum Moderators: open


Knocked out of Google by hosting company

7:26 pm on Oct 14, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 24, 2001
posts:113
votes: 0


I had a site that dropped out of Google after the October 2002 update. I spent countless days... actually months... trying to determine the cause. I even emailed Google several times and always got the response that the site was not being penalized.

This year about the end of August, my site was down and I had a discussion with a person at the hosting company who indicated that the server was down because of a foreign spider. He indicated that he blocked the spider and the server was restored to normal operation.

This got me thinking about the Google problem so I asked if he had blocked Googlebot. To my astonishment, the hosting company was blocking Googlebot because it had caused server performance issues. This completely shocked me. I've never heard of a hosting company blocking a spider so important to most, if not all, of the 200 sites on this server. Their thinking is that it's best to have the server up and running... apparently at the expense of killing half (or more) of traffic going to all of these sites.

My site was moved to a new server, and within days, it was back in Google's index. Almost a year of lost traffic and sales resulted because of the hosting company.

If you are having troubles getting into Google's index or have fallen out for some unknown reason, you might want to check with your hosting company to see if they are blocking spiders. I know there are at least 199 other web sites that are blocked from Google because of this hosting company.

3:57 pm on Oct 15, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:May 19, 2003
posts:83
votes: 0


Man... people can be so stupid! There goes that hosting company's business.
4:31 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 8, 2002
posts:2013
votes: 0


>Almost a year of lost traffic and sales

Send them the bill - at least that's what I'd definitely do!

5:20 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member shak is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 28, 2002
posts:4154
votes: 0


learn from your mistakes and move on.

Shak

5:43 pm on Oct 15, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 30, 2002
posts:128
votes: 0


We have a stand-alone server... last month Googlebot brought our site to its knees... opening millions and millions of carts.

We knew the importance of Google, of course... so we rebooted the server whenever the load got too heavy... this cleared the millions of shopping carts... but did not kick Googlebot out.

It was a little surprising since this is the first month we have ever had a problem.

JP

WayneStPaul

5:55 pm on Oct 15, 2003 (gmt 0)

Inactive Member
Account Expired

 
 


Had a similar problem once upon a time. Now all of my e-commerce sites have a robots.txt that tells all spiders not to request any of the links that add items to a cart (and thus create carts). If you have a large site I highly recommend this.
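For reference, a minimal robots.txt along those lines might look like this (the paths below are hypothetical; substitute whatever URLs your own add-to-cart links use, and note that Disallow matches by URL prefix):

User-agent: *
Disallow: /cart/
Disallow: /index.cfm?action=addtocart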
5:59 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 10, 2000
posts:2151
votes: 0


logfiles, logfiles, logfiles - you didn't notice googlebot in your logfiles for nearly a year and the blame is to be solely directed at your host?
6:10 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 28, 2003
posts:925
votes: 0


Googlebot is such a precious species, I can't believe they kill it.
6:12 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 8, 2002
posts:2013
votes: 0


>logfiles, logfiles, logfiles

Although I absolutely agree that it's somehow also your own fault if you don't notice it, I would blame my host for blocking bots without letting me and others know about it.

Fortunately I have my own dedicated network.

>follow any of the links that add items to a cart

erm, yes, it'd be silly if you allowed this. ;)

6:47 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 22, 2002
posts:1807
votes: 1


I've had some hosting nightmares as well. Pay a little bit more and be sure the company "knows the deal". Also, always have a plan B to jump ship in case they start to sink. Don't ever expect a hosting company to be accountable for their actions either.
6:48 pm on Oct 15, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 24, 2001
posts:113
votes: 0


oilman, The page requests from Googlebot did show up in the logs every month.

----------------------------------------
Address: crawl16.googlebot.com
Browser: Googlebot 2.1 (http:/www.googlebot.com/bot.html)
Protocol: HTTP/1.0

GET 20.75k /index.cfm
----------------------------------------

6:53 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 8, 2002
posts:2013
votes: 0


>oilman, The page requests from Googlebot did show up in the logs every month.

Huh? Then it's not your hoster who caused the trouble. If the requests show in the logs, the pages HAVE BEEN fetched by googlebot.

6:53 pm on Oct 15, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 10, 2000
posts:2151
votes: 0


>>the page requests from Googlebot did show up in the logs every month

ok - so Google was crawling your site? or was it just requesting the index page and getting denied?

6:56 pm on Oct 15, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 24, 2001
posts:113
votes: 0


Apparently it was requesting the page and was then blocked.
7:25 pm on Oct 15, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 13, 2003
posts:630
votes: 0


>We have a stand-alone server... last month Googlebot brought our site to its knees... opening millions and millions of carts.

Then you're doing something wrong.
Just don't open a cart if it's a spider :)
(And don't create SID for it)
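A minimal sketch of that idea in Python (the bot list, function names, and return values are illustrative only, not from any real cart package; a real shop would hook a check like this in before creating the session):

```python
# Illustrative list of crawler User-Agent substrings; extend as needed.
KNOWN_BOTS = ("googlebot", "slurp", "scooter", "msnbot")

def is_spider(user_agent):
    """Return True if the User-Agent header looks like a search engine crawler."""
    ua = (user_agent or "").lower()
    return any(bot in ua for bot in KNOWN_BOTS)

def begin_request(user_agent):
    # Hypothetical request hook: only real visitors get a cart and session ID,
    # so crawlers browse statelessly and never pile up abandoned carts.
    if is_spider(user_agent):
        return {"cart": None, "sid": None}
    return {"cart": [], "sid": "new-session"}  # placeholder for real session setup
```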

7:48 pm on Oct 15, 2003 (gmt 0)

New User

10+ Year Member

joined:Aug 5, 2003
posts:10
votes: 0


Wouldn't it be an easy fix to get rid of the text links for adding items to the basket and replace them with regular form buttons?
12:37 am on Oct 16, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 26, 2000
posts:2176
votes: 0


----------------------------------------
Address: crawl16.googlebot.com
Browser: Googlebot 2.1 (http:/www.googlebot.com/bot.html)
Protocol: HTTP/1.0

GET 20.75k /index.cfm

Anyone reading this thread who can't access the raw log file on their server should start looking for a new hosting company now. If you were looking at the actual log file, you would have been able to see how the server responded to the request.

My guess is they were serving Google a 403 every time it requested a page.

2:21 am on Oct 16, 2003 (gmt 0)

Full Member

10+ Year Member

joined:May 22, 2003
posts:278
votes: 0


OUCH! That's horrible stuff. That's why I love my host... they inform us about everything... even if they are about to reboot.
2:30 am on Oct 16, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Sept 24, 2001
posts:113
votes: 0


Here is the raw log file data:

2003-08-02 13:01:03 64.68.85.28 W3SVC69 80 GET /index.cfm - 200 21231 188 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -

I just thought the WebTrends report was more legible.

I don't exactly know how they were blocking Googlebot, but they said they were. They moved the site to a different server (yes... same hosting company until after the end of the year... too busy now) and the site was back in Google's index.

I know nothing about server administration. Maybe someone here can explain how they could do this.

2:41 am on Oct 16, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Feb 9, 2002
posts:426
votes: 0


Thanks for the story! WOW! As a hosting firm employee here in Florida I can assure you that this is not normal, nor right to do! Maybe this is a good thing for everyone to confirm with their host or prospective hosting firm - make sure they don't do stupid stuff!

I mean, Googlebot is very active on our servers but does not cause any bandwidth issues, and if it were to, I would rather cut down the users on the server in question than block our good friend! Now I have heard it all.. <G>

3:53 am on Oct 16, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Aug 11, 2003
posts:495
votes: 0


That log entry shows a 200 not a 403 -- he pulled at least that page okay.
6:07 am on Oct 16, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 17, 2003
posts:687
votes: 0


Look forward, don't look back!

You have paid a great price to learn this lesson, so work smarter for the next year to earn back double what you have lost!

12:35 pm on Oct 16, 2003 (gmt 0)

Moderator from GB 

WebmasterWorld Administrator ianturner is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 19, 2001
posts:3607
votes: 37


The interesting thing is that as a hosting company I would think that I was perfectly justified in disallowing any IPs that I thought were abusing the network (I'm not saying I would do it to Googlebot, Slurp, Scooter or any other major SE spider - but that's only because I know about these things).

A junior tech at any hosting company without background info might be in a position to block any SE spider almost on a whim. Especially as there are plenty of spiders out there that do need that sort of treatment.

So it is vitally important for webmasters to ensure that they are checking their logs to see that visits are being responded to correctly.
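As a concrete sketch of that kind of check, here is a small Python snippet that tallies Googlebot hits by status code. The field layout is assumed from the IIS W3C log line quoted earlier in the thread (status code in the 9th field); the 403 sample line is invented for illustration:

```python
from collections import Counter

# Sample IIS W3C extended log lines. The first is the real line quoted above;
# the 403 line is invented for illustration.
LOG_LINES = [
    "2003-08-02 13:01:03 64.68.85.28 W3SVC69 80 GET /index.cfm - 200 21231 188 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -",
    "2003-08-02 13:01:05 64.68.85.28 W3SVC69 80 GET /products.cfm - 403 1409 5 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -",
]

def googlebot_status_counts(lines):
    """Count HTTP status codes on requests whose line mentions Googlebot."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > 8 and "Googlebot" in line:
            counts[fields[8]] += 1  # status code is the 9th whitespace-separated field
    return counts

print(googlebot_status_counts(LOG_LINES))  # a steady run of 403s means the bot is being turned away
```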

1:17 pm on Oct 16, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Feb 9, 2002
posts:426
votes: 0


I am just curious whether that hosting company has Googlebot blocked from the server that THEIR website is on. Something tells me they don't.. :)
4:24 pm on Oct 16, 2003 (gmt 0)

Full Member

10+ Year Member

joined:July 11, 2003
posts:291
votes: 0


If you actually see a 200 response code I don't think they blocked it. If it is a UNIX box, your provider may have configured something like this in Apache (a blocked request would show up as a 403, not a 200):

SetEnvIf User-Agent ^Googlebot keep_out
Order allow,deny
Allow from all
Deny from 64.68.85
Deny from env=keep_out

(Apache matches partial IPs like 64.68.85 as a prefix; the wildcard form 64.68.85.* is not valid.) He may also have applied it only to certain subdirectories.

Get a Linux dedicated server with plenty of storage space for a log file.