Spiders that ignore or skip robots.txt

Can bad spiders be identified?

     
9:21 pm on Feb 2, 2004 (gmt 0)

New User

10+ Year Member

joined:Jan 28, 2004
posts:19
votes: 0


I realize this question borders on the Spider ID forum that was closed, although my question is a general one.
Is there a way in the robots.txt file to ID spiders that ignore it or even skip it?
It's one thing to be able to ID them when they go to robots.txt first, but what about the ones that skip it?

Any suggestions or direction on where to look for info would be helpful.

thanks....

9:26 pm on Feb 2, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 21, 2003
posts:2355
votes: 0


Most people set up a spider trap to catch bad bots:

[webmasterworld.com...]

9:38 pm on Feb 2, 2004 (gmt 0)

Preferred Member

10+ Year Member

joined:Sept 28, 2002
posts:505
votes: 0


Hi David,

One basic idea is to set up a new directory, /bottrap/,
put a hidden link to it on your main page (for example a 1x1 transparent GIF, or some other link the casual user won't see),
and write the following into your robots.txt:
User-agent: *
Disallow: /bottrap/
Then wait and watch who accesses /bottrap/, either by looking through your logs manually or by setting up a script, /bottrap/index.php, that sends you an automatic alert. Well-behaved spiders obey the Disallow rule and never follow the link, so anything that shows up there has either ignored robots.txt or skipped it.
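If you go the log route, a small script can do the sorting for you. What follows is only a rough Python sketch, not a finished tool: it assumes your server writes Common or Combined Log Format, that the client IP is the first field on each line, and that the trap lives at /bottrap/ as above. The function name classify_trap_visitors and the two labels are made up for illustration. For every client that hit the trap, it reports whether that client also requested /robots.txt somewhere in the same log (read it and ignored it) or never requested it at all (skipped it).

```python
import re
from collections import defaultdict

# Start of a Common/Combined Log Format line: client IP, two fields,
# the bracketed timestamp, then the quoted request ("METHOD path ...").
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*"')

TRAP_PREFIX = "/bottrap/"  # the directory disallowed in robots.txt


def classify_trap_visitors(lines):
    """Map each IP that requested the trap to a label.

    "ignored robots.txt": the IP also requested /robots.txt in this log.
    "skipped robots.txt": the IP never requested /robots.txt in this log.
    """
    paths_by_ip = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            ip, _method, path = match.groups()
            # Drop any query string so "/robots.txt?x=1" still counts.
            paths_by_ip[ip].add(path.split("?", 1)[0])

    result = {}
    for ip, paths in paths_by_ip.items():
        if any(p.startswith(TRAP_PREFIX) for p in paths):
            fetched = "/robots.txt" in paths
            result[ip] = "ignored robots.txt" if fetched else "skipped robots.txt"
    return result
```

You would feed it the lines of your access log, for example classify_trap_visitors(open("access.log")), where the file name is a placeholder for wherever your logs live. Keep in mind that it only sees the log you hand it, so a bot that fetched robots.txt before the log was rotated will look like it skipped the file, and several visitors behind one proxy will share an IP.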

Regards,
R.

2:05 pm on Feb 3, 2004 (gmt 0)

New User

10+ Year Member

joined:Jan 28, 2004
posts:19
votes: 0


Thank you both for the information.

regards....

 
