Help, this robot is unstoppable!

     

expert_21

11:54 pm on Jun 4, 2004 (gmt 0)

10+ Year Member



Greetings. I received lots of hits from these robots:

Unknown robot (identified by 'spider')
Unknown robot (identified by 'crawl')

and I can't seem to stop them. I tried using my htaccess:

RewriteCond %{HTTP_USER_AGENT} ^spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^crawl [OR]
RewriteCond %{HTTP_USER_AGENT} ^robot.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$
RewriteRule .\.([gG][iI][fF]|[jJ][pP][gG])$ - [F]

but it was no good. The robots strike every day. Please advise, thank you.

jdMorgan

11:59 pm on Jun 4, 2004 (gmt 0)

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



Your stats program is not giving you the actual user-agent name. Look at your raw logs and get the user-agent and IP addresses. Otherwise, you won't be able to block it.
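For anyone unsure where the user-agent shows up: it is only in the raw log if the server is configured to record it. A sketch of Apache's standard "combined" log format follows, in which the user-agent is the last quoted field on each line. These directives go in the main server configuration, not in .htaccess, and the log path shown is only a placeholder.

```apache
# Server config (e.g. httpd.conf), not .htaccess.
# The final quoted field, %{User-Agent}i, is the full user-agent string.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
# "logs/access_log" is a placeholder path; use whatever your host actually writes to.
CustomLog logs/access_log combined
```

On shared hosting the format is usually already set by the host, so this mainly tells you which field of each log line to read.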

BTW, that RewriteRule is seriously bad. Try:


RewriteRule \.(gif|jpe?g|png)$ - [F,NC]

Jim

expert_21

12:10 am on Jun 5, 2004 (gmt 0)




Hi, thanks for the reply. In that case, how do I find the real user-agent name in the raw log? Which search term should I go after?

Also, for the RewriteRule, is it possible to block access completely rather than only images? Thank you.

jdMorgan

12:18 am on Jun 5, 2004 (gmt 0)




> Which search term should I go after?

Search for the partial name that your stats gives you.

You can block whatever you want, but identifying the user-agents is the first step. To block all accesses, use


RewriteRule .* - [F]

with the RewriteCond pattern specifying the bad-bot's user-agent name.
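Put together, a complete block might look like the sketch below. "BadBot" is a made-up placeholder, not a real agent name; substitute the string you find in your raw logs. It also assumes mod_rewrite is loaded and that your host permits rewrite directives in .htaccess.

```apache
# Assumes mod_rewrite is loaded and .htaccess overrides are permitted.
RewriteEngine On
# "BadBot" is a hypothetical placeholder for the user-agent substring from the raw logs.
# The pattern is unanchored, so it matches anywhere in the string; [NC] ignores case.
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
# Return 403 Forbidden for every request that satisfied the condition above.
RewriteRule .* - [F]
```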

Jim

expert_21

9:58 am on Jun 6, 2004 (gmt 0)




Hi, I found the spider's name to be:

Xaldon WebSpider 2.7.b6

So my htaccess should contain:

RewriteCond %{HTTP_USER_AGENT} ^.*Teleport.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebSpider.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Xaldon.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$
RewriteRule .* - [F]

Am I right? Thanks.

saoi_jp

12:57 pm on Jun 7, 2004 (gmt 0)

10+ Year Member



In this thread:
[webmasterworld.com...]
the syntax is:
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]

Apparently this software is for offline browsing. Their website had no info on blocking it. (I used Google's translation of the German, so I could have missed something.)
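For reference, the five conditions listed earlier in the thread could also be collapsed into one. This is only a sketch: it assumes the same five agent names, drops the `^.*` and `.*$` wrappers (an unanchored pattern already matches anywhere in the string), and adds [NC] on the assumption that case-insensitive matching is wanted.

```apache
RewriteEngine On
# One unanchored, case-insensitive alternation stands in for the five [OR] conditions.
RewriteCond %{HTTP_USER_AGENT} (Teleport|Wget|WebSpider|Xaldon|WebCopier) [NC]
# Deny every request from a matching user-agent with 403 Forbidden.
RewriteRule .* - [F]
```

A single condition also avoids the easy mistake of leaving a stray [OR] on the last line or forgetting one in the middle of the list.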

 
