Forum Library, Charter, Moderators: phranque & physics

Webmaster General Forum

    
Help, this robot is unstoppable!
expert_21




msg:358269
 11:54 pm on Jun 4, 2004 (gmt 0)

Greetings. I received lots of hits from these robots:

Unknown robot (identified by 'spider')
Unknown robot (identified by 'crawl')

and I can't seem to stop them. I tried using my htaccess:

RewriteCond %{HTTP_USER_AGENT} ^spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^crawl [OR]
RewriteCond %{HTTP_USER_AGENT} ^robot.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$
RewriteRule .\.([gG][iI][fF]¦[jJ][pP][gG])$ - [F]

but it was no good. The robots strike every day. Please advise, thank you.

 

jdMorgan




msg:358270
 11:59 pm on Jun 4, 2004 (gmt 0)

Your stats program is not giving you the actual user-agent name. Look at your raw logs and get the user-agent and IP addresses. Otherwise, you won't be able to block it.
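
For what it's worth, the user-agent only shows up in the raw logs if the server is set to record it. A minimal sketch, assuming you can edit the main server config (these directives are not allowed in .htaccess) and treating the log path as a placeholder: Apache's stock "combined" format writes the Referer and User-Agent request headers at the end of each line.

```apache
# Server config or <VirtualHost> context only, not .htaccess.
# "combined" adds the Referer and User-Agent headers to the common log format.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# "logs/access_log" is a placeholder path; use whatever your server already uses.
CustomLog logs/access_log combined
```

With this format the client IP is the first field on each line and the user-agent is the last quoted field.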

BTW, that RewriteRule is seriously bad. Try:

RewriteRule \.(gif¦jpe?g¦png)$ - [F,NC]

(Change the broken pipe "¦" characters to solid pipe characters before use.)
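
With solid pipes in place, that rule might sit in .htaccess roughly like this. It is only a sketch: it assumes mod_rewrite is enabled, and it reuses the WebCopier condition from the first post as a stand-in for whatever user-agent turns up in the raw logs.

```apache
RewriteEngine On

# Stand-in condition; replace with the user-agent found in the raw logs.
RewriteCond %{HTTP_USER_AGENT} WebCopier

# Answer image requests with 403 Forbidden.
# NC makes the extension match case-insensitive, so .GIF and .Jpg are covered too.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```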

Jim

expert_21




msg:358271
 12:10 am on Jun 5, 2004 (gmt 0)

Hi, thanks for the reply. In that case, how do I find the real user-agent name in the raw log? Which search term should I go after?

Also, for the RewriteRule, is it possible to block all access completely, not just images? Thank you.

jdMorgan




msg:358272
 12:18 am on Jun 5, 2004 (gmt 0)

> Which search term should I go after?

Search for the partial name that your stats gives you.

You can block whatever you want, but identifying the user-agents is the first step. To block all accesses, use

RewriteRule .* - [F]

with the RewriteCond pattern specifying the bad-bot's user-agent name.
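
Put together, a complete block might look like the sketch below. "BadBot" is a made-up placeholder, not a real user-agent, and the NC flag is an optional extra that makes the match case-insensitive.

```apache
RewriteEngine On

# "BadBot" is a placeholder; substitute the name taken from the raw logs.
# No ^ anchor, so the name can appear anywhere in the user-agent string.
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]

# Answer every request from that user-agent with 403 Forbidden.
RewriteRule .* - [F]
```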

Jim

expert_21




msg:358273
 9:58 am on Jun 6, 2004 (gmt 0)

Hi, I found the spider's name to be:

Xaldon WebSpider 2.7.b6

So my htaccess should contain:

RewriteCond %{HTTP_USER_AGENT} ^.*Teleport.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebSpider.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Xaldon.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$
RewriteRule .* - [F]

Am I right? Thanks.

saoi_jp




msg:358274
 12:57 pm on Jun 7, 2004 (gmt 0)

In this thread:
[webmasterworld.com...]
the syntax is:
RewriteCond %{http_user_agent} ^xaldon\ webspider [OR]

Apparently this software is for off-line browsing. Their website had no info on blocking it. (I used Google's translation of the German so I could've missed something.)
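
Pulling the thread together, one possible version of the list is sketched below. It assumes the user-agent really is "Xaldon WebSpider 2.7.b6" as reported above, and it leans on two details of how mod_rewrite patterns behave: matching is case-sensitive unless NC is given, and an unanchored pattern matches anywhere in the string, so ^.*Name.*$ can be shortened to Name.

```apache
RewriteEngine On

# Unanchored, case-insensitive matches; OR chains the conditions together.
RewriteCond %{HTTP_USER_AGENT} Teleport [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Wget [NC,OR]
# The backslash keeps the space inside the pattern instead of ending it.
RewriteCond %{HTTP_USER_AGENT} Xaldon\ WebSpider [NC,OR]
# The last condition carries no OR flag.
RewriteCond %{HTTP_USER_AGENT} WebCopier [NC]

# Refuse every request from any of the above with 403 Forbidden.
RewriteRule .* - [F]
```

This also suggests why the patterns in the first post never fired: ^spider and ^crawl only match a user-agent that begins with those exact lowercase words, and this one begins with "Xaldon".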
