itsapic.com crawler

         

keyplyr

6:51 pm on Oct 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



itsapic.com crawler is designed to respect the robots.txt exclusion directives and META robots tags, and collect material at a measured, adaptive pace unlikely to disrupt normal website activity.

It ripped through the entire site and several k of image files in a few minutes, requesting up to 10 files per second. This is "measured"?

208.43.227.** - - [11/Oct/2008:10:26:01 -0400] "GET /sitemap.xml HTTP/1.0" 403 918 "-" "Java/1.6.0_07"

208.43.227.** - - [11/Oct/2008:10:26:01 -0400] "GET /robots.txt HTTP/1.0" 200 4724 "http://mysite.com/" "Mozilla/5.0 (compatible; itsapic.com_crawler/0.01 +http://itsapic.com/crawler.html; crawler@itsapic.com)"
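To put a number on "up to 10 files per second", the log can be tallied per IP and per second. The following is only a rough sketch, assuming the server writes the common/combined log format shown in the two lines above; the function name peak_requests_per_second and the regular expression are illustrative, not part of any server tool.

```python
import re
from collections import Counter
from datetime import datetime

# Prefix of the common/combined log format: host, identity, user, [timestamp], "request"
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)"')


def peak_requests_per_second(lines):
    """Return {ip: highest number of requests logged within any single second}."""
    per_second = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        ip, stamp, _request = m.groups()
        # Timestamp example: 11/Oct/2008:10:26:01 -0400
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        per_second[(ip, when)] += 1

    peaks = {}
    for (ip, _when), count in per_second.items():
        peaks[ip] = max(peaks.get(ip, 0), count)
    return peaks


# The two log lines quoted in the post (last octet masked as in the original).
sample = [
    '208.43.227.** - - [11/Oct/2008:10:26:01 -0400] "GET /sitemap.xml HTTP/1.0" 403 918 "-" "Java/1.6.0_07"',
    '208.43.227.** - - [11/Oct/2008:10:26:01 -0400] "GET /robots.txt HTTP/1.0" 200 4724 "http://mysite.com/" "Mozilla/5.0 (compatible; itsapic.com_crawler/0.01 +http://itsapic.com/crawler.html; crawler@itsapic.com)"',
]

print(peak_requests_per_second(sample))
```

Fed the full access log instead of the two sample lines, the same function would show which addresses exceed whatever per-second rate you consider acceptable.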

This system is currently in private B

Looks like an Image Finder service that may or may not become free to the public.

Coming from a SoftLayer Technologies range. I don't know why I didn't already have this range blocked.

wilderness

11:47 pm on Oct 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



keyplyr,
I've the range open as well.
Perhaps we've just never been bothered from the range ;)

Thanks for the heads up.

Don

edited by wilderness:

BTW the crawl portion would have caught a UA deny on my end.
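For illustration, a UA deny boils down to a substring test on the User-Agent header. This is a minimal sketch, not anyone's actual rule set: the deny list below is an assumption built from the two user agents in keyplyr's log lines, and the function name is_denied_user_agent is hypothetical.

```python
# Assumed deny list, taken from the two user agents in the log lines above.
DENIED_UA_SUBSTRINGS = ("itsapic.com_crawler", "Java/")


def is_denied_user_agent(user_agent):
    """Return True if the User-Agent contains any denied substring (case-insensitive)."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in DENIED_UA_SUBSTRINGS)


print(is_denied_user_agent("Java/1.6.0_07"))
print(is_denied_user_agent(
    "Mozilla/5.0 (compatible; itsapic.com_crawler/0.01 "
    "+http://itsapic.com/crawler.html; crawler@itsapic.com)"
))
print(is_denied_user_agent("Mozilla/5.0 (Windows NT 6.1)"))
```

A check like this only catches requests that announce themselves; the sitemap fetch in the log used the bare Java user agent, so both strings are listed.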

Staffa

9:12 am on Oct 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have had the whole of the 208.nn.nn.nn range denied for a long time.
Nothing good ever comes from there.
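As a sketch of what denying a range means in practice, the check below tests an address against a list of networks using Python's standard ipaddress module. The networks are assumptions for illustration: 208.43.227.0/24 is inferred from the masked address in the log above (not the registered SoftLayer allocation), and 208.0.0.0/8 stands in for "the whole of 208.nn.nn.nn". The function name is_denied_ip and the test addresses are hypothetical.

```python
import ipaddress

# Assumed network, inferred from the masked log entries (208.43.227.**).
DENIED_NETWORKS = [ipaddress.ip_network("208.43.227.0/24")]

# The much broader "whole of 208.nn.nn.nn" variant.
WHOLE_208 = [ipaddress.ip_network("208.0.0.0/8")]


def is_denied_ip(addr, networks=DENIED_NETWORKS):
    """Return True if addr falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)


print(is_denied_ip("208.43.227.10"))            # inside the /24
print(is_denied_ip("208.43.228.1"))             # outside the /24
print(is_denied_ip("208.43.228.1", WHOLE_208))  # inside the /8
```

The trade-off is visible in the last two calls: the /8 catches everything the /24 misses, but it also shuts out every other network numbered under 208.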

blend27

7:08 pm on Oct 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



-- I don't know why I didn't already have this range blocked. --

Most likely because it is a fairly new range:

RegDate: 2008-04-22
Updated: 2008-04-22

All the SoftLayerNess is here:

[ws.arin.net...]

wilderness

9:20 pm on Oct 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



-- All the SoftLayerNess is here: --

blend,
Although I'll readily admit to having previously denied the entire locatable ranges of some providers (Road Runner on one occasion), the act itself seems very extreme.
However. . .tomorrow I may feel different ;)

Don

blend27

3:41 pm on Oct 14, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Interesting,

this bot also provides a valid Referer, showing where it found your link. The site has a link with
onmouseover="window.status='http://www.mysite.com/'; return true;".

Other than that I would assume that it follows redirects.