
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

Cloaked Spider!?
...something strange went through my site!

 9:20 pm on Apr 3, 2003 (gmt 0)

Hi all,

My website has just been hit by some kind of distributed spider. It doesn't identify itself as a spider; in fact it's using several different user agents and tries to pretend to be several people browsing (a form of cloaking!? ;), each one at a different IP. Not all the IPs are from the same block, and it didn't honour my robots.txt either.
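For anyone wanting to check the robots.txt part against their own logs, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. The robots.txt rules, the paths, and the IPs below are made up for illustration (the IPs come from the reserved documentation ranges), and the function name `disallowed_hits` is hypothetical; only the user agent strings are taken from this thread.

```python
from urllib.robotparser import RobotFileParser


def disallowed_hits(robots_txt, requests):
    """Return the (ip, user_agent, path) entries that robots.txt disallows."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return [
        (ip, agent, path)
        for ip, agent, path in requests
        if not parser.can_fetch(agent, path)
    ]


# Hypothetical robots.txt and log entries, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /email-a-friend/
"""

SAMPLE_REQUESTS = [
    ("192.0.2.10", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)", "/category/1"),
    ("192.0.2.77", "Mozilla/4.77 [en] (X11; U; Linux 2.2.19 i686)", "/email-a-friend/42"),
]

if __name__ == "__main__":
    for hit in disallowed_hits(ROBOTS_TXT, SAMPLE_REQUESTS):
        print(hit)
```

A hit on a disallowed path doesn't prove a spider on its own, since real browsers never read robots.txt, but a cluster of such hits from browser-looking user agents is worth a second look.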

So why do I think it's a spider...?

Well, in the space of 20 mins the IPs in question systematically requested -/in perfect sequential order/- all of my top level categories, and then proceeded to go through the sub-cats and a whole load of product pages -- even hitting most of the 'email a friend' pages for these also (kind of gives away the fact it's a spider).
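The pattern described above (several IPs that, taken together, walk the categories in perfect order inside a short window) can be tested for mechanically. The following is only a rough sketch of that heuristic: the function name, the `site_order` list, the 20-minute default window, the sample paths, and the IPs are all assumptions for illustration, not anything taken from the actual logs.

```python
from datetime import datetime, timedelta


def looks_like_distributed_crawl(entries, site_order,
                                 window=timedelta(minutes=20), min_ips=2):
    """Heuristic check on entries of the form (timestamp, ip, path).

    Returns True when the requests for the pages in site_order, merged
    across all IPs and sorted by time, visit every page in strictly
    increasing site order, within the window, from at least min_ips IPs.
    """
    position = {path: i for i, path in enumerate(site_order)}
    hits = sorted((e for e in entries if e[2] in position), key=lambda e: e[0])
    if not hits:
        return False
    within_window = hits[-1][0] - hits[0][0] <= window
    indices = [position[path] for _, _, path in hits]
    in_order = all(a < b for a, b in zip(indices, indices[1:]))
    covers_all = len(set(indices)) == len(site_order)
    many_ips = len({ip for _, ip, _ in hits}) >= min_ips
    return within_window and in_order and covers_all and many_ips


# Hypothetical category list and log entries, for illustration only.
SITE_ORDER = ["/cat/a", "/cat/b", "/cat/c", "/cat/d"]
START = datetime(2003, 4, 3, 21, 0, 0)
CRAWL = [
    (START + timedelta(minutes=0), "192.0.2.10", "/cat/a"),
    (START + timedelta(minutes=1), "198.51.100.7", "/cat/b"),
    (START + timedelta(minutes=2), "203.0.113.5", "/cat/c"),
    (START + timedelta(minutes=3), "192.0.2.10", "/cat/d"),
]
# Same requests, but with the paths visited out of site order.
BROWSING = [
    (CRAWL[0][0], CRAWL[0][1], "/cat/c"),
    (CRAWL[1][0], CRAWL[1][1], "/cat/a"),
    (CRAWL[2][0], CRAWL[2][1], "/cat/d"),
    (CRAWL[3][0], CRAWL[3][1], "/cat/b"),
]

if __name__ == "__main__":
    print(looks_like_distributed_crawl(CRAWL, SITE_ORDER))
    print(looks_like_distributed_crawl(BROWSING, SITE_ORDER))
```

This is deliberately strict: one out-of-order request from a genuine visitor in the same window would make it return False, so on a busy site you would probably want to restrict the entries to a set of suspect IPs first.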

I imagine a networked cluster of PCs running on multiple dial-up accounts, ravaging the web for email addresses ..or something...

Anyway, the IPs and user_agents I observed were:-

Mozilla/4.77 [en] (X11; U; Linux 2.2.19 i686)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)

I'd post some of my log so you could see the pattern of requests, but I know posting URLs (partial or otherwise) for commercial sites isn't allowed.

[Shrug] ..just thought I'd pass the info on, even if it's not really of much use to anyone.




 9:53 pm on Apr 3, 2003 (gmt 0)

Another rude dash through a site by the notorious "Web Content International".

You can find out all you need to know about them with the site search at the top of this page.


 12:41 am on Apr 4, 2003 (gmt 0)

Thanks very much for the info - I'll hop off and take a look right now...



All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved