
Sitemaps, Meta Data, and robots.txt Forum

    
How can I block Cyveillancebot (63.148.99.****)?
It doesn't look like a useful bot...
itisgene




msg:1527720
 3:36 am on Mar 25, 2004 (gmt 0)

I saw requests from 63.148.99.**** coming to one of my sites and grabbing files. I looked it up on the web and found that it is Cyveillancebot. Since the site is new, it is not really listed in search engines and is not getting traffic yet. Is there any way I can block this? It doesn't have a UA name to block on.

From the web, I got this:

******************
Cyveillancebot uses IP addresses in the range of 63.148.99.224 - 63.148.99.255, and may use others (but unconfirmed). Here's a list of other 'media enforcer' bots, servers et al.

Cyveillancebot ignores robots.txt, as far as anyone can tell. Cyveillancebot spoofs its identity, naming itself various flavors of Windows browsers:

63.148.99.232 - - [02/May/2003:13:01:37 -0700] "Mozilla/4.0 (compatible; MSIE 5.05; Windows NT 3.51)"
63.148.99.232 - - [02/May/2003:13:01:37 -0700] "Mozilla/4.0 (compatible; MSIE 5.05; Windows NT 5.0)"
63.148.99.232 - - [02/May/2003:13:01:58 -0700] "Mozilla/4.0 (compatible; MSIE 5.05; Windows NT 3.51)"
63.148.99.232 - - [02/May/2003:13:01:58 -0700] "Mozilla/4.0 (compatible; MSIE 5.05; Windows NT 3.51)"
63.148.99.232 - - [02/May/2003:13:02:57 -0700] "Mozilla/4.0 (compatible; MSIE 5.05; Windows NT 4.0)"

**************************

Any way to block it?

Thanks,

 

jbgilbert




msg:1527721
 4:19 am on Mar 25, 2004 (gmt 0)

See this thread for a solution:
[webmasterworld.com...]

jbgilbert




msg:1527722
 4:24 am on Mar 25, 2004 (gmt 0)

I also found this, but have not used it before:

<Limit GET>
order deny,allow
deny from 155.212. 199.171.167. .aol.com 207.51.72.139 grog.ric.edu
</Limit>

The claim is that it works for partial IP addresses and domain names.

jdMorgan




msg:1527723
 4:45 am on Mar 25, 2004 (gmt 0)

That code only denies access if the request is a GET or HEAD, and not if it is a POST, EDIT, SEARCH, DELETE, etc. I strongly suggest you use a <Files> container instead:
[code]
<Files *>
order deny,allow
deny from 155.212. 199.171.167. .aol.com 207.51.72.139 grog.ric.edu
</Files>
[/code]
Also, be aware that using a hostname such as .aol.com in code like this invokes a reverse-DNS lookup -- a request from your server to its DNS server -- for each HTTP request received by your server, and that is very slow. Therefore, it is preferable to use IP addresses only if at all possible.
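
Applied to the address range quoted in the first post, the same container might look like the sketch below. It assumes the only range to block is 63.148.99.224 - 63.148.99.255 (the one reported above, which is 63.148.99.224/27 in CIDR notation) and that the directives go in an Apache .htaccess file or server config where Order/Deny are permitted. Other ranges the bot may use are not covered.
[code]
# Sketch only: deny the reported Cyveillancebot range by IP address.
# 63.148.99.224/27 covers 63.148.99.224 through 63.148.99.255 (32 addresses).
# No hostnames are used, so no reverse-DNS lookup is triggered.
<Files *>
order deny,allow
deny from 63.148.99.224/27
</Files>
[/code]
Because the bot's user-agent strings imitate ordinary Windows browsers, blocking by IP address is the only handle this approach relies on.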

Jim

itisgene




msg:1527724
 4:04 pm on Mar 25, 2004 (gmt 0)

Thanks, guys.
I think the thread jbgilbert mentioned is for Apache servers. I am using a Windows server with ASP.
So it is not that useful for me.

As for jdMorgan's suggestion,
************************
<Files *>
order deny,allow
deny from 155.212. 199.171.167. .aol.com 207.51.72.139 grog.ric.edu
</Files>
*************************
Do I put this in robots.txt or in the normal web pages?
Sorry, I haven't used robots.txt that much.
I can include it as an SSI in the ASP files with the specific IP addresses, if needed.

Thanks,

jdMorgan




msg:1527725
 4:43 pm on Mar 25, 2004 (gmt 0)

No, sorry -- All of this is for Apache, and none of it has to do with robots.txt.

Robots.txt only works with cooperative robots. The bad ones either don't check it, or they do check it, but ignore it.

Jim


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved