
Forum Moderators: Ocean10000 & incrediBILL & keyplyr


Ultimate short list of banned bots

Wanna contribute?

     
1:29 pm on Jul 25, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 20, 2003
posts:1741
votes: 0


This is a short list of my recently banned bots. Some are very well-known threats, but most of these were added yesterday and weren't crawling my sites before, so perhaps you could take a look and see if they are visiting your website too, or if there are some "good robots" in this list that don't deserve to be there ;)

aipbot
BecomeBot
Cerberian Drtrs
COMBINE
ConveraCrawler
Custom Bot/Robot #20
e-collector
Faxobot/1.0
Faxobot
Fetch API Request
FAST Enterprise Crawler
Html Link Validator (www.lithopssoft.com)
iaea
ichiro
INDEXU Spider Link Checker
IRLbot
linkwalker
LinksManager
Microsoft URL Control
microsoft.url
NaverBot
Nutch
NutchOrg
OmniExplorer_Bot
Spam Bot
Test.Com
T-H-U-N-D-E-R-S-T-O-N-E
Twiceler
Xenu Link Sleuth
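
For anyone who wants to check their own logs against the list above, here is a minimal sketch in Python, assuming case-insensitive substring matching on the User-Agent header. The names BANNED_UA_SUBSTRINGS and is_banned are made up for illustration, and the sample User-Agent strings in the usage lines are hypothetical. Short entries such as COMBINE, iaea or Nutch can match unrelated agents, so treat this as a starting point rather than a finished filter.

```python
import re

# Entries copied from the list above. "Faxobot/1.0" is omitted because
# the shorter "Faxobot" substring already covers it.
BANNED_UA_SUBSTRINGS = [
    "aipbot", "BecomeBot", "Cerberian Drtrs", "COMBINE", "ConveraCrawler",
    "Custom Bot/Robot #20", "e-collector", "Faxobot", "Fetch API Request",
    "FAST Enterprise Crawler", "Html Link Validator (www.lithopssoft.com)",
    "iaea", "ichiro", "INDEXU Spider Link Checker", "IRLbot", "linkwalker",
    "LinksManager", "Microsoft URL Control", "microsoft.url", "NaverBot",
    "Nutch", "NutchOrg", "OmniExplorer_Bot", "Spam Bot", "Test.Com",
    "T-H-U-N-D-E-R-S-T-O-N-E", "Twiceler", "Xenu Link Sleuth",
]

# One alternation of escaped literals, so characters like "." "(" and "#"
# are matched literally rather than as regex syntax.
_BANNED_RE = re.compile(
    "|".join(re.escape(s) for s in BANNED_UA_SUBSTRINGS),
    re.IGNORECASE,
)


def is_banned(user_agent: str) -> bool:
    """Return True if the User-Agent contains any listed substring."""
    return bool(_BANNED_RE.search(user_agent or ""))


# Hypothetical User-Agent strings, for illustration only.
print(is_banned("Mozilla/5.0 (compatible; BecomeBot/3.0)"))
print(is_banned("Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/1.0"))
```

The same substrings could equally be fed to whatever blocking method you already use; the point here is only the matching logic.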

12:10 am on July 28, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5485
votes: 3


Utilizing balam's link and suggestion:

A search of the WebmasterWorld pages at Google with forum11 (this forum) returns the following:

[google.com...]

Should you want to look up a particular bot, UA or IP, just add a plus sign and the name after forum11 in the search box.

Some more Forum tools:

Valid Search Engine?
[webmasterworld.com...]

IIS and Global.asa
[w3schools.com...]

dbm Maps
[webmasterworld.com...]

Reduce harvests
[webmasterworld.com...] Msg#16

Throttle runaways
[webmasterworld.com...]

Block Methods (Scroll past opening Advertisements)
[diveintomark.org...]

Regular Expressions
[etext.lib.virginia.edu...]
[gnosis.cx...]

Close To Perfect I
[webmasterworld.com...]
Close To Perfect II
[webmasterworld.com...]
Close To Perfect III
[webmasterworld.com...]

Concise htaccess
[webmasterworld.com...]

robots.text on a diet
[webmasterworld.com...]

Search Tools
[webmasterworld.com...]

3:07 am on July 29, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 20, 2003
posts:1741
votes: 0


balam: you waste your time judging others' posts, and btw your post is not only boring but also useless to the purpose of the thread.

wilderness: thanks

Test.com is still unknown to me and probably others.

4:09 am on July 29, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5485
votes: 3


balam: you waste your time judging others' posts, and btw your post is not only boring but also useless to the purpose of the thread.

Actually!
I thought this part (below) was rather funny and, on a couple of passing thoughts, almost posted to remind others that it WAS tongue-in-cheek, just without the emoticon.

I'd like to point out that "Googlebot", "Yahoo! Slurp" & "msnbot" are the three worst spiders you could have visiting your site. Ban them now, before it's too late and the damage is done!

BTW silver,
ALL those links came from the Close to Perfect htaccess thread :)

This forum has been primarily htaccess (in addition to SESID) at least since I've been here. (My profile says 2001; however, I was previously registered under another screen name.)
I even used balam's Google link to do a search on "forum11+IIS" and there wasn't much. There was an IIS inquiry in the "Close to Perfect" thread that went unanswered.

Prior to this forum going down, it was a bundle of activity, and many of the one-time participants here have not returned. (balam was a regular at one time.)

The moderated forum (no pun intended, as the forum does exist) has less reach than the old forum. In the old forum, submissions were not delayed and a spider could be stopped in its tracks.
Nor was separating private from commercial IP ranges an issue in the old forum. Today, even though a private IP may be doing massive crawls, we are apparently limited by Charter, moderation and delay.
In all fairness though, the old forum was shut down because some posters were submitting their competitors as crawlers.

That Bret decided to bring this forum back is commendable.

My question is:

If one of the 29 UAs you submitted came from a private IP range,
how could the poster provide the IP range without violating forum rules and yet still share accurate information?

Don

3:23 pm on July 29, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 22, 2001
posts:2450
votes: 0


Guess it's my turn to weigh in here.

If one of the 29 UAs you submitted came from a private IP range,
how could the poster provide the IP range without violating forum rules and yet still share accurate information?

You can't. If an IP address appears to be that of a private individual, please don't post it. If you need to post it, please do not post the last group of numbers in the IP address.
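
Dropping the last group of numbers can be done mechanically before pasting a log line. The following is a rough sketch in Python, assuming IPv4 addresses only; the function name mask_last_octet and the "xxx" placeholder are my own choices for illustration, not a forum convention, and the sample address comes from the range reserved for documentation.

```python
import ipaddress


def mask_last_octet(ip: str) -> str:
    """Replace the final group of an IPv4 address with 'xxx'.

    Raises ValueError if the input is not a valid IPv4 address.
    """
    addr = ipaddress.IPv4Address(ip)  # validates the input
    octets = str(addr).split(".")
    return ".".join(octets[:3] + ["xxx"])


# 192.0.2.0/24 is reserved for documentation, so no real host is exposed.
print(mask_last_octet("192.0.2.45"))  # prints 192.0.2.xxx
```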

This forum was closed for liability reasons and was only reopened when I promised to enforce this strict guideline.

Remember, the primary purpose of the forum is to identify search engine spiders. Identifying other types of spiders (including building ban-lists) is a secondary function of this forum.

That being said, I have no problem with starting new threads listing "bannable bots", assuming those threads don't turn into flame-fests or some such. There are pre-existing threads with this information, but it takes a lot of digging to get to the meat of the info in them because they are so long. It would be nice to have one comprehensive thread of bad-bots with no extraneous posts in it, but I am probably dreaming.

6:10 pm on July 29, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:May 27, 2003
posts:503
votes: 0


I'm only here for a couple of minutes - to see who I'm peeving off now [webmasterworld.com] - but this thread fulfills most of volatilegx's dream: WebmasterWorld forum11 Updated and Collated Bot List [webmasterworld.com]
8:10 pm on July 29, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 22, 2001
posts:2450
votes: 0


Yeah, bull's list is even in the forum's library [webmasterworld.com], but it doesn't cull the 'good' bots from the 'bad'.

Of course, everybody's definitions of bad and good will be different. In my opinion, any bot attached to a public search engine (which doesn't cache or has opt-out caching) is good. Anything else is bad.

8:34 am on Aug 2, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 22, 2005
posts:373
votes: 0


it's virtually impossible to follow a thread that spans 25 pages

it isn't that bad!
it's an excellent thread, well worth spending a bit of time on
