
Forum Moderators: DixonJones & mademetop

Robot in Germany?

spidered entire site. Very strange



9:58 am on Oct 19, 2004 (gmt 0)

10+ Year Member

Can anyone tell me anything about an oddly behaved (apparent) spider using this dns:

It was referred to my site from an ordinary organic link I have on a similar site. Then it proceeded to browse like a regular visitor .. but ..

It sucked in every last page I have, including the images. I put up one image per page usually. The pattern was totally mechanical.

Index page => menu #1
Menu #1 => page A, then downloads image A.
Back to Index.

Index page => menu #1
Menu #1 => page B, then downloads image B.
Back to Index.

This repeated until all pages on menu #1 were exhausted. Next back to index.html, and the whole silly waltz repeated with menu #2. Then Menus 3 and 4. Finally it went nuts on my sitemap and found all my oddball pages.

This was extremely methodical, with hits coming from 3 to 10 seconds apart. NO human browses like that.
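For anyone who wants to check their own logs for the same kind of timing, here is a rough sketch of the idea. It assumes you have already pulled the hit times for one visitor out of the log as datetime objects. The function names and the 20-hit minimum are made up for illustration; only the 3 to 10 second window comes from the description above.

```python
from datetime import datetime, timedelta


def hit_gaps(timestamps):
    """Return the gaps, in seconds, between consecutive hits in time order."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


def looks_mechanical(timestamps, min_hits=20, min_gap=3.0, max_gap=10.0):
    """Guess whether one visitor's hits are spaced too evenly to be a person.

    min_hits is an assumed cutoff; min_gap and max_gap mirror the
    3 to 10 second spacing reported in the thread.
    """
    if len(timestamps) < min_hits:
        return False  # too few hits to say anything
    return all(min_gap <= gap <= max_gap for gap in hit_gaps(timestamps))


# Hypothetical data: 30 hits, exactly 5 seconds apart.
start = datetime(2004, 10, 19, 9, 0, 0)
steady = [start + timedelta(seconds=5 * i) for i in range(30)]
print(looks_mechanical(steady))
```

A real visitor pauses to read, so a single long gap is enough to make this return False. The thresholds are guesses and would need tuning against your own traffic.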

It never looked at robots.txt. There was no referral string or identifier. Spiders usually only suck up my html files. Image bots come later for those.

I've never seen this before, my only clue was the dns.

Any clues? - Larry


10:14 am on Oct 19, 2004 (gmt 0)

10+ Year Member

The IP is registered to FreeNet, so it might well be a dialup IP. Did it fetch robots.txt? What was the UserAgent? Maybe it was just IE downloading webpages for offline reading?


11:30 am on Oct 19, 2004 (gmt 0)

10+ Year Member

I went thru my entire access_log file. Whatever this is, it did NOT call for robots.txt. I have a robots.txt, which has no restrictions on spidering.

For User agent I only saw "-" meaning unspecified.

It doesn't want to identify itself, and almost tries to mimic a surfer. Only the complete, relentless spidering and the rate of downloads give it away. - Larry
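The same checks (no robots.txt request, "-" for the user agent, lots of hits) can be run over a whole access_log instead of by eye. This is only a sketch: it assumes the log is in Apache "combined" format, which the thread does not confirm, and the function names and the 20-hit cutoff are invented for the example.

```python
import re
from collections import defaultdict

# Assumed Apache "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Collect hit count, robots.txt requests and user agents per host."""
    stats = defaultdict(lambda: {"hits": 0, "robots": False, "agents": set()})
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        entry = stats[match["host"]]
        entry["hits"] += 1
        entry["agents"].add(match["agent"])
        if match["path"].split("?")[0] == "/robots.txt":
            entry["robots"] = True
    return stats


def suspects(lines, min_hits=20):
    """Hosts with many hits, no robots.txt request and no user agent."""
    return sorted(
        host for host, entry in summarize(lines).items()
        if entry["hits"] >= min_hits
        and not entry["robots"]
        and entry["agents"] <= {"-", ""}
    )
```

Usage would be something like `suspects(open("access_log"))`. A browser saving pages for offline reading normally still sends a user agent, so an empty one combined with a full crawl is the more telling sign.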

