Forum Moderators: phranque

Help, this robot is unstoppable!

     
11:54 pm on Jun 4, 2004 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 17, 2003
posts:124
votes: 0


Greetings. I received lots of hits from these robots:

Unknown robot (identified by 'spider')
Unknown robot (identified by 'crawl')

and I can't seem to stop them. I tried using my htaccess:

RewriteCond %{HTTP_USER_AGENT} ^spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^crawl [OR]
RewriteCond %{HTTP_USER_AGENT} ^robot.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$
RewriteRule .\.([gG][iI][fF]¦[jJ][pP][gG])$ - [F]

but it was no good. The robots strike every day. Please advise, thank you.

11:59 pm on June 4, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Your stats program is not giving you the actual user-agent name. Look at your raw logs and get the user-agent and IP addresses. Otherwise, you won't be able to block it.
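For reference, the user-agent only shows up in the raw log if the server writes it there. A minimal sketch of the usual "combined" format follows, assuming you can edit the main server config (these directives are not allowed in .htaccess) and that the log path shown, which is only an example, suits your setup. Many hosts already log this way, in which case nothing needs changing.

```apache
# Server config or virtual host only; not valid in .htaccess.
# %h is the client IP or hostname; the last quoted field is the full user-agent string.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
# Example path (an assumption); use whatever location your server already uses.
CustomLog logs/access_log combined
```

With that format, the user-agent is the last quoted string on each log line and the IP address is the first field.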

BTW, that RewriteRule is seriously bad. Try:


RewriteRule \.(gif¦jpe?g¦png)$ - [F,NC]

(Change the broken pipe "¦" characters to solid pipe characters before use.)
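Put together with user-agent conditions, the image rule might look like the sketch below. It assumes mod_rewrite is available in .htaccess; "ExampleBot" is a made-up placeholder, not a real robot, to be replaced with whatever name the raw logs show. Note that a pattern starting with ^ only matches when the user-agent string begins with that text, so the conditions here are left unanchored.

```apache
RewriteEngine On
# No leading ^, so the name may appear anywhere in the user-agent string.
# [NC] ignores case; [OR] chains this condition to the next one.
RewriteCond %{HTTP_USER_AGENT} WebCopier [NC,OR]
# Hypothetical placeholder name; the last condition carries no [OR].
RewriteCond %{HTTP_USER_AGENT} ExampleBot [NC]
# Return 403 Forbidden for .gif, .jpg, .jpeg and .png requests from those agents.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

Because of [NC] on the rule, the [gG][iI][fF] style character classes are not needed, and jpe?g covers both jpg and jpeg.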

Jim

12:10 am on June 5, 2004 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 17, 2003
posts:124
votes: 0


Hi, thanks for the reply. How do I find the real user-agent name in the raw log in that case? Which search term should I go after?

Also, for the RewriteRule, is it possible to block access completely rather than only blocking images? Thank you.

12:18 am on June 5, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


> Which search term should I go after?

Search for the partial name that your stats gives you.

You can block whatever you want, but identifying the user-agents is the first step. To block all accesses, use


RewriteRule .* - [F]

with the RewriteCond pattern specifying the bad-bot's user-agent name.
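As a minimal sketch, assuming mod_rewrite is enabled for your .htaccess and using "BadBot" purely as a stand-in for the real name from your logs:

```apache
RewriteEngine On
# "BadBot" is a placeholder; substitute the user-agent name found in the raw logs.
# Unanchored and case-insensitive, so it matches anywhere in the string.
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
# Any requested URL gets 403 Forbidden when the condition above matches.
RewriteRule .* - [F]
```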

Jim

9:58 am on June 6, 2004 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 17, 2003
posts:124
votes: 0


Hi, I found the spider's name to be:

Xaldon WebSpider 2.7.b6

So my htaccess should contain:

RewriteCond %{HTTP_USER_AGENT} ^.*Teleport.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebSpider.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Xaldon.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$
RewriteRule .* - [F]

Am I right? Thanks.

12:57 pm on June 7, 2004 (gmt 0)

Full Member

10+ Year Member

joined:Jan 9, 2003
posts:227
votes: 0


In this thread:
[webmasterworld.com...]
the syntax is:
RewriteCond %{http_user_agent} ^xaldon\ webspider [OR]

Apparently this software is for off-line browsing. Their website had no info on blocking it. (I used Google's translation of the German so I could've missed something.)
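For what it's worth, here is how I would sketch the whole list for the name reported above ("Xaldon WebSpider 2.7.b6"). This is only a sketch under a few assumptions: mod_rewrite is enabled, and the other names are simply carried over from the earlier post. RewriteCond patterns are case-sensitive unless [NC] is given, so a lowercase pattern needs that flag to match this agent. Also, ^.*name.*$ matches exactly the same strings as plain name, so the shorter form is used.

```apache
RewriteEngine On
# Each pattern matches anywhere in the user-agent string, ignoring case.
RewriteCond %{HTTP_USER_AGENT} Teleport [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Wget [NC,OR]
# The backslash escapes the space so the pattern stays a single argument.
RewriteCond %{HTTP_USER_AGENT} Xaldon\ WebSpider [NC,OR]
# Last condition: no [OR], and a space must separate the pattern from the flags.
RewriteCond %{HTTP_USER_AGENT} WebCopier [NC]
# Deny every request from a matching agent with 403 Forbidden.
RewriteRule .* - [F]
```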