dnbjason

msg:161884 | 3:57 pm on Oct 15, 2003 (gmt 0) |
Man... people can be so stupid! There goes that hosting company's business.
|
Yidaki

msg:161885 | 4:31 pm on Oct 15, 2003 (gmt 0) |
>Almost a year of lost traffic and sales
Send them the bill - that's definitely what I'd do, at least!
|
Shak

msg:161886 | 5:20 pm on Oct 15, 2003 (gmt 0) |
learn from your mistakes and move on. Shak
|
jpavery

msg:161887 | 5:43 pm on Oct 15, 2003 (gmt 0) |
We have a stand-alone server... last month Googlebot brought our site to its knees... opening millions and millions of carts. We knew the importance of Google, of course... so we rebooted the server whenever the load got too heavy... this cleared the millions of shopping carts... but did not kick Googlebot out. It was a little surprising since this is the first month we have ever had a problem.
JP
|
WayneStPaul

msg:161888 | 5:55 pm on Oct 15, 2003 (gmt 0) |
Had a similar problem once upon a time. Now all of my e-commerce sites have a robots.txt that tells all spiders not to follow any of the links that add items to a cart (and thus create carts). If you have a large site, I highly recommend this.
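A minimal sketch of how such a robots.txt could be checked before deploying it, using Python's standard `urllib.robotparser`. The paths `/cart/` and `/addtocart.cfm` and the `www.example.com` host are made up for illustration; the thread does not say which URLs actually create carts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the cart paths below are assumptions,
# not the real URLs of any site discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /addtocart.cfm
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/index.cfm",
        "http://www.example.com/addtocart.cfm?item=42",
    ):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

This only tells you what a well-behaved spider is asked to do; it does not stop a crawler that ignores robots.txt.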
|
oilman

msg:161889 | 5:59 pm on Oct 15, 2003 (gmt 0) |
logfiles, logfiles, logfiles - you didn't notice googlebot in your logfiles for nearly a year and the blame is to be solely directed at your host?
|
dirkz

msg:161890 | 6:10 pm on Oct 15, 2003 (gmt 0) |
Googlebot is such a precious species, I can't believe they kill it.
|
Yidaki

msg:161891 | 6:12 pm on Oct 15, 2003 (gmt 0) |
>logfiles, logfiles, logfiles
Although I absolutely agree that it's partly your own fault if you don't notice it, I would blame my host for blocking bots without letting me and others know about it. Fortunately I have my own dedicated network.
>follow any of the links that add items to a cart
Erm, yes, it'd be silly to allow this. ;)
|
stuntdubl

msg:161892 | 6:47 pm on Oct 15, 2003 (gmt 0) |
I've had some hosting nightmares as well. Pay a little bit more and be sure the company "knows the deal". Also, always have a plan B to jump ship in case they start to sink. Don't ever expect a hosting company to be accountable for their actions either.
|
hutchins13

msg:161893 | 6:48 pm on Oct 15, 2003 (gmt 0) |
oilman,
The page requests from Googlebot did show up in the logs every month.
----------------------------------------
Address: crawl16.googlebot.com
Browser: Googlebot 2.1 (http:/www.googlebot.com/bot.html)
Protocol: HTTP/1.0
GET 20.75k /index.cfm
----------------------------------------
|
Yidaki

msg:161894 | 6:53 pm on Oct 15, 2003 (gmt 0) |
>oilman, The page requests from Googlebot did show up in the logs every month.
Huh? Then it's not your host who caused the trouble. If the requests show in the logs, the pages HAVE BEEN fetched by Googlebot.
|
oilman

msg:161895 | 6:53 pm on Oct 15, 2003 (gmt 0) |
>>the page requests from Googlebot did show up in the logs every month
OK - so Google was crawling your site? Or was it just requesting the index page and getting denied?
|
hutchins13

msg:161896 | 6:56 pm on Oct 15, 2003 (gmt 0) |
Apparently it was requesting the page and was then blocked.
|
plasma

msg:161897 | 7:25 pm on Oct 15, 2003 (gmt 0) |
>We have a stand-alone server... last month Googlebot brought our site to its knees... opening millions and millions of carts.
Then you're doing something wrong. Just don't open a cart if it's a spider :) (And don't create a SID for it)
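One way this could look, as a rough sketch: check the User-Agent header before creating a session ID or cart. The token list and the function names `is_spider` and `session_id_for` are hypothetical, and matching on User-Agent is only a heuristic since the header can be spoofed or missing.

```python
import uuid
from typing import Optional

# Illustrative, non-exhaustive list of substrings found in the
# User-Agent headers of the spiders named in this thread.
KNOWN_SPIDER_TOKENS = ("googlebot", "slurp", "scooter")


def is_spider(user_agent: Optional[str]) -> bool:
    """Guess from the User-Agent header whether the visitor is a spider."""
    if not user_agent:
        return False
    lowered = user_agent.lower()
    return any(token in lowered for token in KNOWN_SPIDER_TOKENS)


def session_id_for(user_agent: Optional[str]) -> Optional[str]:
    """Return a fresh session ID for ordinary visitors, None for spiders.

    Callers would skip cart creation whenever this returns None.
    """
    if is_spider(user_agent):
        return None
    return uuid.uuid4().hex
```

With no session ID handed out, the spider has nothing to carry from page to page, so each request it makes does not open another cart.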
|
hot_tubs

msg:161898 | 7:48 pm on Oct 15, 2003 (gmt 0) |
Wouldn't it be an easy fix to get rid of the text links for adding items to the basket and replace them with regular buttons?
|
WebGuerrilla

msg:161899 | 12:37 am on Oct 16, 2003 (gmt 0) |
>Address: crawl16.googlebot.com Browser: Googlebot 2.1 (http:/www.googlebot.com/bot.html) Protocol: HTTP/1.0 GET 20.75k /index.cfm
Anyone reading this thread who can't access the raw log file on their server should start looking for a new hosting company now. If you were looking at the actual log file, you would have been able to see how the server responded to the request. My guess is they were serving Google a 403 every time it requested a page.
|
seofreak

msg:161900 | 2:21 am on Oct 16, 2003 (gmt 0) |
OUCH! That's horrible stuff. That's why I love my host... they inform us about everything... even if they are about to reboot.
|
hutchins13

msg:161901 | 2:30 am on Oct 16, 2003 (gmt 0) |
Here is the raw log file data:
2003-08-02 13:01:03 64.68.85.28 W3SVC69 80 GET /index.cfm - 200 21231 188 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -
I just thought the WebTrends report was more legible. I don't know exactly how they were blocking Googlebot, but they said they were. They moved the site to a different server (yes... same hosting company until after the end of the year... too busy now) and the site was back in Google's index. I know nothing about server administration. Maybe someone here can explain how they could do this.
|
jady

msg:161902 | 2:41 am on Oct 16, 2003 (gmt 0) |
Thanks for the story! WOW! As a hosting firm employee here in Florida, I can assure you that this is neither normal nor right! Maybe this is a good thing for everyone to confirm with their host or prospective hosting firm - make sure they don't do stupid stuff! I mean, Googlebot is very active on our servers but does not cause any bandwidth issues, and if it did, I would rather cut down the users on the server in question than block our good friend! Now I have heard it all.. <G>
|
BlueSky

msg:161903 | 3:53 am on Oct 16, 2003 (gmt 0) |
That log entry shows a 200 not a 403 -- he pulled at least that page okay.
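To make that reading concrete, here is a small sketch that splits the raw log line quoted earlier in the thread into named fields. The field order is an assumption based on a common W3C extended log layout; the real order is set by the `#Fields:` header of the server's own log file, which was not posted.

```python
from typing import Dict

# Assumed field order. The actual order is defined by the "#Fields:"
# header line in the server's log file, which the thread does not show.
FIELDS = (
    "date", "time", "c_ip", "s_sitename", "s_port", "cs_method",
    "cs_uri_stem", "cs_uri_query", "sc_status", "sc_bytes", "cs_bytes",
    "cs_user_agent", "cs_referer",
)

# The raw log line as posted in the thread.
LOG_LINE = (
    "2003-08-02 13:01:03 64.68.85.28 W3SVC69 80 GET /index.cfm - 200 "
    "21231 188 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -"
)


def parse_log_line(line: str) -> Dict[str, str]:
    """Split one space-delimited log line into a dict keyed by FIELDS."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def was_denied(entry: Dict[str, str]) -> bool:
    """True if the server answered this request with 403 Forbidden."""
    return entry["sc_status"] == "403"


if __name__ == "__main__":
    entry = parse_log_line(LOG_LINE)
    print(entry["sc_status"], entry["cs_uri_stem"], was_denied(entry))
```

Under that assumed layout the ninth field is the response status, which is where the 200 comes from.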
|
AthlonInside

msg:161904 | 6:07 am on Oct 16, 2003 (gmt 0) |
Look forward, don't look back! You have paid a great price to learn this lesson, so work smarter over the next year to earn back double what you have lost!
|
IanTurner

msg:161905 | 12:35 pm on Oct 16, 2003 (gmt 0) |
The interesting thing is that as a hosting company I would think that I was perfectly justified in disallowing any IPs that I thought were abusing the network (I'm not saying I would do it to Googlebot, Slurp, Scooter or any other major SE spider - but that's only because I know about these things).
A junior tech at any hosting company without background info might be in a position to block any SE spider almost on a whim. Especially as there are plenty of spiders out there that need that sort of treatment.
So it is vitally important for webmasters to check their logs and make sure that the visits are being responded to correctly.
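A routine check along those lines could be as simple as tallying the response codes served to a given spider. This is only a sketch: the column positions assume the same space-delimited layout as the log line posted earlier in the thread, and the second sample line (the 403) is invented purely to show what a blocked request would look like.

```python
from collections import Counter
from typing import Iterable


def crawler_status_counts(
    lines: Iterable[str],
    crawler_token: str = "Googlebot",
    status_index: int = 8,   # assumed position of the status column
    agent_index: int = 11,   # assumed position of the User-Agent column
) -> Counter:
    """Count response status codes for requests whose User-Agent
    contains crawler_token. Blank lines, '#' header lines and lines
    that are too short are skipped."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) <= max(status_index, agent_index):
            continue
        if crawler_token.lower() in parts[agent_index].lower():
            counts[parts[status_index]] += 1
    return counts


SAMPLE_LINES = [
    # Real line from the thread.
    "2003-08-02 13:01:03 64.68.85.28 W3SVC69 80 GET /index.cfm - 200 "
    "21231 188 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -",
    # Made-up line showing a denied request, for illustration only.
    "2003-08-02 13:01:09 64.68.85.28 W3SVC69 80 GET /products.cfm - 403 "
    "512 190 Googlebot/2.1+(+http://www.googlebot.com/bot.html) -",
]

if __name__ == "__main__":
    print(crawler_status_counts(SAMPLE_LINES))
```

A month where the spider's requests are mostly 403s (or missing altogether) is the signal worth raising with the host.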
|
jady

msg:161906 | 1:17 pm on Oct 16, 2003 (gmt 0) |
I am just curious whether that hosting company blocks Googlebot on the server that THEIR website is on. Something is telling me that they don't.. :)
|
asinah

msg:161907 | 4:24 pm on Oct 16, 2003 (gmt 0) |
If you actually see a 200 response code, I don't think they blocked it. If it is a UNIX box, your provider may have programmed something like:
SetEnvIf User-Agent ^Googlebot keep_out
deny from 64.68.85.*
deny from env=keep_out
and he may have applied it to subdirectories of your site. Get a Linux dedicated server with a lot of storage space for a logfile.
|
|