Here are the ones I allow without checking on them much:
Ask Jeeves
ExactSeek
FAST-WebCrawler
FAST FirstPage retriever
Fluffy the spider
Gigabot
Googlebot
Googlebot-Image
ia_archiver
Libby
Lycos_Spider
MARTINI
Mercator
MSNBOT
NutchOrg
Openfind data gatherer
polybot
Pompos
Robozilla
Scooter
Scrubby
Slurp
surfsafely
Teoma
Teradex Mapper
THUNDERSTONE
Vagabondo
Zealbot
Zyborg
Which are most important depends on your market and location, but Googlebot, FAST, Slurp (Inktomi), Scooter (AltaVista), and Ask Jeeves/Teoma are the must-haves for many.
But if you ask 100 webmasters, you'll get a hundred different answers about "which do you want?"
Jim
There are actually spiders you DON'T want crawling your site?
There are actually people you DON'T want in your home? ;)
ktd
I realize you're relatively new here.
If you read the link I provided in the second post of this thread, "A Close to Perfect htaccess", you will see that there are more than a few spiders that many webmasters do not want.
In addition this thread offers some explanations:
[webmasterworld.com...]
As does an earlier reply of mine to sanuk in another thread tonight.
Don
> Why wouldn't you want these spiders crawling your site? What harm do they do? Can they actually hurt your rankings?
Unless they place an extreme load on your server, no, they can't generally hurt your rankings. However, they might:
There are hundreds of other potential problems. Some of the banned user-agents are *very* nasty, either by intent or because they are very badly coded.
I allow Google, Slurp, and the few others listed above. Other than that, I want to know who they are and what they want before they come in.
However, it is an established tenet of this forum that you may do differently on your site if you wish. Your site may be in a completely different market segment than mine, so who am I to tell you what to do? However, the title of this thread was: Which spiders do you want... and I answered for me.
HTH,
Jim
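To illustrate the "allow a short list by name, look closely at everything else" policy described above, here is a minimal Python sketch. It is not anyone's actual setup from this thread: the function name and the choice of tokens are assumptions for illustration, and the tokens are just a handful of names taken from the list earlier in the thread. Keep in mind that a User-Agent string is supplied by the client, so a match only tells you what the visitor claims to be.

```python
# Hypothetical allow list: a few user-agent names from the list in this thread.
ALLOWED_AGENT_TOKENS = (
    "Googlebot",
    "Slurp",
    "Scooter",
    "Teoma",
    "Ask Jeeves",
    "FAST-WebCrawler",
)


def is_allowed_spider(user_agent: str) -> bool:
    """Return True if the User-Agent string contains an allowed token.

    Matching is a case-insensitive substring test, so "Googlebot-Image"
    also passes because it contains "Googlebot".
    """
    ua = user_agent.lower()
    return any(token.lower() in ua for token in ALLOWED_AGENT_TOKENS)


if __name__ == "__main__":
    for agent in ("Googlebot/2.1 (+http://www.google.com/bot.html)", "larbin_2.6.3"):
        verdict = "allow" if is_allowed_spider(agent) else "check before allowing"
        print(f"{agent}: {verdict}")
```

Anything that does not match is not automatically bad; it simply falls into the "I want to know who they are" pile.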
Those are the robots that are shown crawling my site in my site stats. Do you guys see anything I should be worried about?
larbin is one I absolutely won't allow. Try the WebmasterWorld site search, using those user-agent names as your search phrase, or just look through the back threads on this forum. Some of those others are "iffy" depending on what kind of site you have.
Some of us are permissive, and some have an almost zero-tolerance policy for 'bots which do not identify themselves properly, or do not fetch and obey robots.txt. There is also a nice free script posted here for detecting and automatically blocking rogue 'bots - try a search for "Ban malicious visitors Perl Script."
Jim
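One of the checks mentioned above, whether a 'bot fetches robots.txt at all, can be roughed out from your own request logs. The Python sketch below is an illustration only, not the Perl script referred to in the post. It assumes you have already reduced your log to (user_agent, path) pairs, and the function name is hypothetical. It cannot tell whether a 'bot obeys robots.txt, only whether it ever asked for it, and it will also flag ordinary browsers, so it is only meaningful for agents you already believe are robots.

```python
from collections import defaultdict


def agents_skipping_robots_txt(requests):
    """Return the user agents that never requested /robots.txt.

    `requests` is an iterable of (user_agent, path) pairs, assumed to have
    been extracted from a server log beforehand.
    """
    paths_by_agent = defaultdict(set)
    for user_agent, path in requests:
        paths_by_agent[user_agent].add(path)
    return sorted(
        agent
        for agent, paths in paths_by_agent.items()
        if "/robots.txt" not in paths
    )


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sample = [
        ("Googlebot/2.1", "/robots.txt"),
        ("Googlebot/2.1", "/index.html"),
        ("larbin_2.6.3", "/index.html"),
        ("larbin_2.6.3", "/private/page.html"),
    ]
    print(agents_skipping_robots_txt(sample))
```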