
Search Engine Spider and User Agent Identification Forum

    
Cloaked robot from Ask.com
No robots.txt requests, and malformed URLs
jdMorgan




msg:3098499
 2:50 am on Sep 27, 2006 (gmt 0)

Log sample:

65.214.45.100 - - [25/Sep/2006:21:21:37 -0400] "GET /mypage1.html%09 HTTP/1.0" 403 666 "-" "Mozilla/4.0 (compatible: MSIE 5.5; Windows NT 4.0)"

65.214.45.100 - - [26/Sep/2006:21:33:04 -0400] "GET /mypage2.html%09%20192.168.131.187 HTTP/1.0" 403 666 "-" "Mozilla/4.0 (compatible: MSIE 5.5; Windows NT 4.0)"

Reverse DNS points to crawler100.ask.com via UUNET.
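A reverse DNS answer on its own can be spoofed by whoever controls the PTR record, so a common follow-up is to resolve the returned hostname forward again and check that it maps back to the same IP. The sketch below illustrates that idea only; the function name `forward_confirmed_rdns` and its injectable `reverse`/`forward` parameters are assumptions made for illustration, not anything used in this thread.

```python
import socket

def forward_confirmed_rdns(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the PTR hostname for ip if it resolves back to ip, else None.

    reverse and forward default to the standard-library resolvers; they are
    parameters (an assumption of this sketch) so they can be swapped in tests.
    """
    try:
        hostname = reverse(ip)[0]         # (hostname, aliases, addresses)
        addresses = forward(hostname)[2]  # (hostname, aliases, addresses)
    except OSError:                       # covers socket.herror / socket.gaierror
        return None
    return hostname if ip in addresses else None
```

Calling `forward_confirmed_rdns("65.214.45.100")` performs live DNS lookups, so the result depends on whatever the records say at the time of the call.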

I think they'd better get Jeeves back on retainer; those URL-paths are malformed. The first one has a URL-encoded tab (%09) tacked onto the end, and the second one was /mypage2.html<tab><space> followed by my site's IP address. Maybe this is intended as a 404-check, but that ain't the way to do it proper, guv'nah.
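To see what is actually hiding in those request paths, they can be percent-decoded and scanned for whitespace or control characters. This is a minimal sketch; the helper name `find_odd_characters` is made up for illustration.

```python
from urllib.parse import unquote

def find_odd_characters(raw_path):
    """Percent-decode a request path and list any whitespace/control characters.

    Returns (decoded_path, [(index, character), ...]).
    """
    decoded = unquote(raw_path)
    odd = [(i, ch) for i, ch in enumerate(decoded)
           if ch.isspace() or ord(ch) < 32]
    return decoded, odd

# The two request paths from the log sample above.
for path in ("/mypage1.html%09", "/mypage2.html%09%20192.168.131.187"):
    decoded, odd = find_odd_characters(path)
    print(repr(decoded), odd)
```

A path that comes back with a non-empty list is a reasonable candidate for the kind of 403 shown in the log, though whether to block on that alone is a judgment call.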

Very bad form on this one, gentlemen...

Jim

 

wilderness




msg:3099349
 5:47 pm on Sep 27, 2006 (gmt 0)

Jim,
I've been getting them for a while.

The majority of mine are pages and images.

The visits are infrequent and in most instances hit a single page; however, I recall instances where it was multiple pages (with images).

keyplyr




msg:3099733
 10:59 pm on Sep 27, 2006 (gmt 0)

LOL - your forbidden page weighs 666.

jdMorgan




msg:3099765
 11:41 pm on Sep 27, 2006 (gmt 0)

Yes, intentional... :)

Jim

bull




msg:3113710
 6:25 am on Oct 9, 2006 (gmt 0)

Add IP range: 65.119.214.*

with UAs
Wget/1.8.1
or
Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
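The range and user agents above can be turned into a simple check. This is a sketch under two assumptions: that 65.119.214.* means the whole /24, and that the user-agent strings are matched exactly. The names `REPORTED_NETWORK`, `REPORTED_USER_AGENTS` and `matches_reported_crawler` are hypothetical.

```python
import ipaddress

# Assumption: "65.119.214.*" is read as the full /24 network.
REPORTED_NETWORK = ipaddress.ip_network("65.119.214.0/24")

# User-agent strings exactly as listed in the post above.
REPORTED_USER_AGENTS = {
    "Wget/1.8.1",
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)",
}

def matches_reported_crawler(remote_ip, user_agent):
    """True if the request comes from the reported range with a listed UA."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a parseable IP address
    return addr in REPORTED_NETWORK and user_agent in REPORTED_USER_AGENTS
```

Requiring both the range and the user agent keeps the check narrow; matching on the IP range alone would be the stricter choice if the user agents turn out to vary.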
