
Cloaking Forum

    
list of bots
stevelibby




msg:678332
 10:27 am on Mar 9, 2006 (gmt 0)

Hi,
where can I get a list of bot user agents?

 

volatilegx




msg:678333
 4:10 pm on Mar 9, 2006 (gmt 0)

Here's a pretty good one: [jafsoft.com...]

However, if you are cloaking, I recommend doing it by spider IP addresses instead of user agent, or maybe do it as a combination.
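
To make the "combination" idea concrete, here is a minimal sketch that checks a request's IP address first and its user agent second. Everything named in it is an assumption for illustration: the `SPIDER_NETWORKS` table, the bot names, and the `classify` function are hypothetical, and the address ranges are reserved documentation ranges, not real spider IPs. A real setup would need a maintained list of spider addresses.

```python
import ipaddress

# Hypothetical data: bot names are made up, and the networks are reserved
# documentation ranges standing in for a real, maintained spider IP list.
SPIDER_NETWORKS = {
    "examplebot": [ipaddress.ip_network("192.0.2.0/24")],
    "otherbot": [ipaddress.ip_network("198.51.100.0/24")],
}


def classify(remote_ip, user_agent):
    """Classify one request using its IP address and its user agent."""
    addr = ipaddress.ip_address(remote_ip)
    ua = user_agent.lower()

    # Which known spider (if any) owns the network this IP falls in?
    ip_bot = next(
        (name for name, nets in SPIDER_NETWORKS.items()
         if any(addr in net for net in nets)),
        None,
    )
    # Which known spider name (if any) appears in the user agent string?
    ua_bot = next((name for name in SPIDER_NETWORKS if name in ua), None)

    if ip_bot and ip_bot == ua_bot:
        return "spider"        # IP and user agent agree
    if ip_bot:
        return "spider-by-ip"  # known spider IP, unrecognised user agent
    if ua_bot:
        return "ua-only"       # claims to be a spider from an unknown IP
    return "visitor"
```

The IP check is treated as the stronger signal because a user agent string is trivial to fake, which is presumably why IP matching is recommended above.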

milanmk




msg:678334
 6:32 pm on Mar 9, 2006 (gmt 0)

The Web Robots Database:-

[robotstxt.org...]

Pfui




msg:678335
 10:25 pm on Mar 11, 2006 (gmt 0)

I'm always amazed how many people ask for a list of robots / crawlers / spiders, because a handy list simply does not exist, nor would one stay current for more than 24 hours.

Every single day I see new robots in my logs (...and when I research them, I see new sites with scraped copies of others' lists of robots...boo-hiss). On my sites, the unbridled proliferation of bots and crawlers from every country imaginable is both intriguing and irritating, because it's costly in terms of bandwidth used versus value returned. So I block all but a handful.
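
Spotting new robots in the logs, as described above, can be partly automated. The sketch below is not the poster's method; it assumes the Apache "combined" log format, where the user agent is the last double-quoted field on each line, and it does not handle user agents that contain escaped quotes. The function name `new_user_agents` and the idea of keeping a `known_agents` list are hypothetical.

```python
import re
from collections import Counter

# Assumption: lines use the Apache "combined" log format, in which the
# user agent is the final double-quoted field on the line.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def new_user_agents(log_lines, known_agents):
    """Count user agents in log_lines that are not in known_agents."""
    known = {agent.lower() for agent in known_agents}
    unseen = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if match and match.group(1).lower() not in known:
            unseen[match.group(1)] += 1
    return unseen
```

Calling `new_user_agents(open("access.log"), known)` with your own list of already-reviewed agents returns a counter of the strings still left to research, most frequent first via `.most_common()`.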

So anyway, here's a linked list of my most reliable 'robot research' sources. (Many of the URLs are the first of many pages of data, usually listed/linked alphabetically.) These are original sites compiling and offering their own sites' data, just as they've been doing for years. The webmasters (and their programs) are doing obsessively hard work, and terrifically good work, too. I tip my hat to every single one.

The Best:

PSYCHEDELIX.COM --
List of User-Agents (Spiders, Robots, Crawler, Browser) [psychedelix.com]
www.psychedelix.com/agents/index.shtml

The Best of the Rest:

SUMMARY.NET -- Known Robots [summary.net] (live site demo)
summary.net:7000/~demo/report/33

Stefan Helbing's Table of bad robots [helbing.nu]
www.helbing.nu/badrobots/index.en.php

KLOTH.NET -- List of Bad Bots [kloth.net] (alas, not as current as it once was)
www.kloth.net/internet/badbots.php

John A Fotheringham's Search engine robots that visit your web site [jafsoft.com]
www.jafsoft.com/searchengines/webbots.html

AWM-Webmaster.com -- Browser, Spider, Robots und Crawlers [awm-webmaster.com]
awm-webmaster.com/webmaster-infoarchiv/user-agents1.html

Note: You could go slightly goofy trying to combine the preceding pages' entries in .htaccess, etc., in order to block or otherwise handle the worst one by one. (I know. I tried:) There are simply too danged many of the evil spawn...
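
The combining step mentioned in the note can at least be done mechanically. The following is a rough sketch under stated assumptions, not a recommendation to block by name: the helper names `merge_agent_lists` and `htaccess_rules`, the `bad_bot` variable name, and the sample tokens in the usage note are all hypothetical. The generated lines use the `SetEnvIfNoCase` directive from Apache's mod_setenvif, and Python's `re.escape` is assumed to be close enough to Apache's regex escaping for simple tokens.

```python
import re


def merge_agent_lists(*lists):
    """Merge lists of user-agent tokens, dropping case-insensitive duplicates."""
    merged = {}
    for agents in lists:
        for agent in agents:
            token = agent.strip()
            if token:
                # Keep the first spelling seen for each lowercase key.
                merged.setdefault(token.lower(), token)
    return sorted(merged.values(), key=str.lower)


def htaccess_rules(agents, env_name="bad_bot"):
    """Build one SetEnvIfNoCase line per token, skipping tokens with quotes."""
    return [
        f'SetEnvIfNoCase User-Agent "{re.escape(agent)}" {env_name}'
        for agent in agents
        if '"' not in agent
    ]
```

For example, `htaccess_rules(merge_agent_lists(["ExampleScraper"], ["examplescraper", "SampleHarvester"]))` yields two lines, one per distinct token. These lines only tag matching requests with an environment variable; a separate access rule that acts on the variable is still needed, and its syntax depends on the Apache version.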
