
Forum Moderators: goodroi


Webmasterworld's Robots.txt

A couple of questions for Brett

     
2:13 pm on Dec 27, 2002 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 4, 2002
posts:1314
votes: 0


Brett, if you've got the time:

1. Why ban Googlebot-Image? Is it just to save time because this is a text-based site?

2. Why ban User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT) and other such browser identification strings? They aren't spidering, are they?

3. I can see why you disallow Forum 9 (Foo). But why Forums 19 (Community Center) and 29 (Commercial exchange)?

4. Are all the bots listed well-behaved or do you also have to ban some of them via rewrites?
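For anyone following along, rules of the kind asked about in points 1 and 3 can be sketched and checked with Python's standard `urllib.robotparser`. This is only an illustration: the `/forum9/`, `/forum19/` and `/forum29/` paths and the `SomeBot` name are assumptions, not copied from the actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules modelled on the questions above; the real file may differ.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /forum9/
Disallow: /forum19/
Disallow: /forum29/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The image bot is shut out of everything.
print(parser.can_fetch("Googlebot-Image", "/forum3/1.htm"))
# Any other compliant bot is kept out of the three listed forums only.
print(parser.can_fetch("SomeBot", "/forum9/1.htm"))
print(parser.can_fetch("SomeBot", "/forum3/1.htm"))
```

This only tells you what a bot that honours robots.txt would do; it does nothing about bots that ignore the file.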

Thanks!

3:10 pm on Dec 27, 2002 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38070
votes: 16


1- What possible use could an image bot be to us, other than using up bandwidth?

2- Some bots do support robots.txt and do use standard browser agent strings. Listing those strings catches a few.

3- Off-topic referrals and locally related discussions.

4- I spend one to two hours a day tracking down bad bots and banning them, their IP, or their entire ISP. It isn't bad when you have a few dozen or a few hundred pages, but when you get into 70-80k pages, bots can tear a system up in no time flat without sufficient tracking and response systems. It is a very serious problem that threatens the entire system on a daily basis. We are having a discussion about moving to a subscription-based system, and stopping rogue bots would be my #1 reason for doing so.
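As a rough idea of what a "tracking and response" system could look like, here is a minimal sliding-window sketch that counts requests per IP and bans an address once it exceeds a threshold. It is not a description of what WebmasterWorld actually runs; the `RequestTracker` class, its parameters and the example addresses are all hypothetical.

```python
from collections import defaultdict, deque


class RequestTracker:
    """Ban an IP once it makes more than max_requests within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.banned = set()

    def record(self, ip: str, now: float) -> bool:
        """Record one request; return True if it should be served."""
        if ip in self.banned:
            return False
        hits = self._hits[ip]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        if len(hits) > self.max_requests:
            self.banned.add(ip)
            del self._hits[ip]
            return False
        return True


# Example: at most 3 requests per 10 seconds (illustrative numbers only).
tracker = RequestTracker(max_requests=3, window_seconds=10.0)
fast = [tracker.record("192.0.2.1", t) for t in (0.0, 1.0, 2.0, 3.0)]
slow = [tracker.record("192.0.2.2", t) for t in (0.0, 20.0, 40.0, 60.0)]
print(fast, slow)
```

The timestamp is passed in explicitly so the logic is easy to test; in a live setup it would presumably come from something like `time.monotonic()`, and the ban would feed a firewall or rewrite rule rather than an in-memory set.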

4:25 pm on Dec 27, 2002 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 4, 2002
posts:1314
votes: 0


Thanks for the info, Brett. It must be endlessly frustrating!

Community Center has quite a few on-topic threads. Maybe they've started in the wrong place and a moderator will move them to a more relevant forum.

Is subscription-only the best way to stop bots? (I'll be brief here, as it is obviously being discussed at length elsewhere.) A simple mechanism to trip them up at the doorway might be a sign-on screen with human-readable instructions on how to sign on as a guest, such as typing in a displayed random string.
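To make the random-string idea concrete, here is a minimal sketch of generating a challenge and checking what the guest typed. The function names and the six-character length are assumptions for illustration; a real sign-on page would also need to store the expected string per session and show it in a form a bot cannot simply read out of the HTML.

```python
import hmac
import secrets
import string

# Characters used in the challenge (hypothetical choice).
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length: int = 6) -> str:
    """Return a random string to display on the guest sign-on screen."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(expected: str, typed: str) -> bool:
    """Compare the typed text to the challenge, ignoring case and outer spaces."""
    cleaned = typed.strip().upper()
    return hmac.compare_digest(expected.encode(), cleaned.encode())


challenge = new_challenge()
print(challenge)
print(check_response(challenge, challenge.lower()))
```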

I was looking for a complete bot ban list, so I'm pleased to see that the comments in your robots.txt let other people use it freely. (Not that I have the problems you do. I'm not that successful, and what you have is a problem of success, which should make you happy.)

5:00 pm on Dec 27, 2002 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38070
votes: 16


Well, the local category here does sometimes get personal stuff posted in it, and I noticed some of the referrals were way off our topic area.

Sure, you can use the robots.txt anywhere.

 
