top 10 list of useful bots to allow on your site?

instead of a trap how about only the good ones

         

amznVibe

3:09 pm on Jan 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So is there any decent approach for the hundreds of unknown bad robots that DO obey robots.txt, like this annoying person? [219.163.188.218...]

They will never trip a robots.txt trapdoor :( The only way I can think of is to keep an exclusive list of bots that should be allowed on the site, hide a link that is NOT protected by robots.txt, and evaluate who trips it.
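A rough sketch of that "hidden link plus allow list" idea, in Python rather than Perl. Everything named here is an assumption for illustration only: the trap path `/hidden-trap.html`, the `Hit` record, and the two user-agent tokens are placeholders, and the hits would really come from your own access log.

```python
from typing import Iterable, NamedTuple, Set


class Hit(NamedTuple):
    """One parsed access-log entry (hypothetical shape)."""
    ip: str
    user_agent: str
    path: str


# Hypothetical hidden link that is NOT listed in robots.txt.
TRAP_PATH = "/hidden-trap.html"

# Illustrative tokens only; fill in from your own list of wanted bots.
ALLOWED_AGENT_TOKENS = ("googlebot", "slurp")


def is_allowed(user_agent: str) -> bool:
    """True if the user-agent string contains one of the allowed tokens."""
    ua = user_agent.lower()
    return any(token in ua for token in ALLOWED_AGENT_TOKENS)


def trap_suspects(hits: Iterable[Hit]) -> Set[str]:
    """IPs that requested the trap link without being on the allow list."""
    return {
        hit.ip
        for hit in hits
        if hit.path == TRAP_PATH and not is_allowed(hit.user_agent)
    }
```

Feeding it a day of parsed log entries gives a set of IPs to review by hand before banning anything, since a curious human can follow a hidden link too.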

On that note, is there maybe a top 10 good bot list around here with IP + user-agent?
Would it be bad practice, or difficult, to allow only the following bots/spiders? (in no particular order)

Google
InfoSeek
Excite
Fast/AllTheWeb
Alta Vista
Lycos
Inktomi
WiseNut
Ask Jeeves/Teoma
Northern Light
Alexa
Gigablast
(okay, so it's a little bigger than 10 :) )
I know there are huge IP lists out there of all known robots/spiders, but maybe we can first make a concise list of the good bots we really SHOULD allow in. Then, just as we have a good trap.pl script around here, maybe we can build an "onlygoodbots.pl"?

pendanticist

3:28 pm on Jan 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



amznVibe,

A little bit of creative site searching pulled up this gem:

Spider Trap needs GOOD Spider list [webmasterworld.com].

Pendanticist.

amznVibe

3:43 pm on Jan 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wow, you are a searching master! Thanks for tying these threads together.
But I can also see that thread never got fully developed or moved beyond theoretical discussion.
We really DO need a spider trap forum eh? :)

Am I the only one who wishes Brett's site search was a little more powerful and flexible?
I often find myself using the Google Toolbar's "search site" instead :(

carfac

3:59 pm on Jan 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



amznVibe:

I would note that while I agree the bots on your list are "good", not all of them respect or obey robots.txt.

dave
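One way to see which visitors ignore robots.txt is to replay logged requests against the rules. The sketch below uses Python's standard `urllib.robotparser`; the sample robots.txt, the `/private/` path, and the bot names are made up for illustration, and the (user agent, path) pairs would come from your own logs.

```python
from typing import Iterable, List, Tuple
from urllib.robotparser import RobotFileParser

# Made-up rules for the example; substitute your site's real robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def violations(
    parser: RobotFileParser, hits: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return the (user_agent, path) pairs that robots.txt disallows."""
    return [(ua, path) for ua, path in hits if not parser.can_fetch(ua, path)]
```

Anything returned by `violations` requested a path it was told to stay out of, whatever its reputation.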

amznVibe

4:27 pm on Jan 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ah, if that's correct, then it's an even better reason to have a "good bots only" list/filter instead of a robots.txt spider trap...
I'm gonna take a stab at coding one that checks at least the user agent
(knowing that people can fake that in a heartbeat, but it will still slow many bad bots down)
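For what it's worth, a user-agent-only check along those lines might look like the Python sketch below. The token-to-engine table is an assumption: the substrings are recalled from memory, not verified, so check them against what actually shows up in your logs. As noted above, the user agent is trivially faked, so this only filters the lazy ones.

```python
from typing import Optional

# Assumed user-agent substrings for some engines on the list above.
# Unverified; confirm each against your own access logs before relying on it.
GOOD_BOT_TOKENS = {
    "googlebot": "Google",
    "slurp": "Inktomi",
    "scooter": "Alta Vista",
    "fast-webcrawler": "Fast/AllTheWeb",
    "ia_archiver": "Alexa",
    "teoma": "Ask Jeeves/Teoma",
    "gigabot": "Gigablast",
    "zyborg": "WiseNut",
}


def good_bot_name(user_agent: str) -> Optional[str]:
    """Return the engine name if the user agent matches a known token, else None."""
    ua = user_agent.lower()
    for token, engine in GOOD_BOT_TOKENS.items():
        if token in ua:
            return engine
    return None
```

Since a `None` result covers ordinary browsers as well as unknown bots, the result is better used as one input to a decision than as an automatic block.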