Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
the art-loving iPad
lucy24




msg:4517698
 4:28 am on Nov 10, 2012 (gmt 0)

You know that nebulous feeling that something is Not Right, but nothing is jumping up and hitting you in the face?

After some serious work with Regular Expressions, I've established that since late September, a seemingly normal iPad has been helping itself to my jpgs. Make that: a botnet all dressed up as iPads.

Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A403 Safari/8536.25

except for a few recent ones that had upgraded to
Mozilla/5.0 (iPad; CPU OS 6_0_1 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A523 Safari/8536.25

Both UAs are identical to my own iPad in the same time period.

No two IPs alike, and never more than one image on a visit-- except for one that was locked out and therefore tried eight times. (Can't for the life of me figure out why it was blocked, but I'm not complaining.) They come from all over the globe; in fact the only thing they have in common is that I've never met any of the IP ranges before.

Some images seem to be more popular, but no discernible pattern. Always jpg, no png or gif. In particular: no apple-touch-icons, which are png.

They're not returning to pick up images they saw on some earlier visit-- unless every single one used a completely different IP (I look for a.b.c. only), which stretches credulity.
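For anyone wanting to repeat the "a.b.c. only" check, here is a rough sketch of grouping visitor IPs by their first three octets. The function names are made up for illustration, and it assumes plain IPv4 addresses already pulled out of the logs.

```python
from collections import defaultdict


def prefix24(ip):
    """Return the first three octets of an IPv4 address, e.g. '192.0.2'."""
    return ".".join(ip.split(".")[:3])


def repeat_prefixes(ips):
    """Map each a.b.c prefix seen more than once to the IPs that share it."""
    groups = defaultdict(list)
    for ip in ips:
        groups[prefix24(ip)].append(ip)
    # Keep only prefixes with at least two hits: those are possible return visits.
    return {prefix: members for prefix, members in groups.items() if len(members) > 1}
```

An empty result over the suspect requests would match what is described above: no a.b.c range showing up twice.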

Can anyone think of a perfectly legitimate explanation? Otherwise I'm inclined to rewrite them all to my one-pixel transparent gif. Request for jpg, no referer, user-agent iPad.
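The three conditions (request for jpg, no referer, user-agent iPad) can be tried out against existing logs before touching any rewrite rules. This is only a sketch: it assumes the common Apache "combined" log layout, the names `LOG_RE` and `is_suspect` are invented here, and counting `.jpeg` alongside `.jpg` is an extra assumption.

```python
import re

# Assumed layout: ip ident user [time] "METHOD path PROTO" status size "referer" "agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def is_suspect(line):
    """True when a log line is a jpg request with no referer from an iPad UA."""
    m = LOG_RE.match(line)
    if m is None:
        return False  # unparseable lines are not flagged
    path = m.group("path").split("?", 1)[0].lower()  # drop any query string
    is_jpg = path.endswith((".jpg", ".jpeg"))  # .jpeg included as an assumption
    no_referer = m.group("referer") in ("", "-")
    is_ipad = "iPad" in m.group("agent")
    return is_jpg and no_referer and is_ipad
```

Running this over a few weeks of logs shows how many genuine iPad visitors would be caught by the same net, since a real browser can also send no referer.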

 

keyplyr




msg:4517746
 8:59 am on Nov 10, 2012 (gmt 0)

Possibly a game used by iPad owners. I block some games and audio players that were scraping my musician photos while looking for album cover art.

Or just an image scraper that was written using that iPad UA. I've never been able to figure out how they know the file names or where to look.

wilderness




msg:4517766
 11:27 am on Nov 10, 2012 (gmt 0)

lucy,
I've had some extensive activity in which the "party" returns within 15-30 minutes after the initial visit to grab a single image that was linked on the page they previously visited.

(some of my pages have links from thumbnails to larger images, while others don't, even in a directory which uses a large quantity of thumbnails).

I don't care for the method; however, the only reasonable solution was to deny the UA based upon the IP, and that seemed a never-ending task.

I've yet to have one return in the 15-30 minute time frame and grab multiple images.

not2easy




msg:4517775
 1:24 pm on Nov 10, 2012 (gmt 0)

Maybe "Pinners" using Pinterest or some similar "service"?

dstiles




msg:4517843
 9:16 pm on Nov 10, 2012 (gmt 0)

There used to be (possibly still is) a random grid of hot-linked images on a Linux desktop - can't remember the name of it now but it used favicons and was a real pain to block (ended up moving all favicons out of the home directory).

Have to say this isn't Lucy's problem but maybe similar?

lucy24




msg:4518201
 4:21 am on Nov 12, 2012 (gmt 0)

Postscript:

And no sooner do I block-- or rather rewrite-- the jpg + iPad + no-referer configuration, than this other UA with identical behavior comes into sharp focus:

Google/2.5.1.13455 CFNetwork/609 Darwin/13.0.0

Until recently I'd had "Darwin" blocked. I unblocked it because it seemed to be involved in some kind of legitimate activity. A cursory glance through the raw logs brings up an assortment of

{something-or-other-here}/{numbers} CFNetwork/{more numbers} Darwin/{version number}

When the something-or-other is MobileSafari, it's a request for the apple-touch-icon. I know this is legit, because I find it connected to my own iPad visits.
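To tally which "something-or-other" names turn up, the three-part pattern above can be pulled apart with a regular expression. A sketch only: `CFN_RE` and `parse_cfnetwork_agent` are invented names, and it assumes each version is just digits and dots.

```python
import re

# {app}/{numbers} CFNetwork/{more numbers} Darwin/{version number}
CFN_RE = re.compile(
    r'^(?P<app>\S+)/(?P<app_version>[\d.]+) '
    r'CFNetwork/(?P<cfnetwork>[\d.]+) '
    r'Darwin/(?P<darwin>[\d.]+)$'
)


def parse_cfnetwork_agent(agent):
    """Split an 'App/x CFNetwork/y Darwin/z' user-agent into its parts.

    Returns a dict with keys app, app_version, cfnetwork and darwin,
    or None when the string does not follow that three-part shape.
    """
    m = CFN_RE.match(agent)
    return m.groupdict() if m else None
```

For the UA quoted above, the `app` field comes out as `Google`; an ordinary browser string such as the iPad Safari one does not match and gives `None`.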

:: off to edit htaccess ::

keyplyr




msg:4518381
 6:26 pm on Nov 12, 2012 (gmt 0)

I block all flavors of CFNetwork. It's an image grabber used on the various Apple appliances, highly abused. I only allow it to get the apple icons.
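Written out as logic rather than as server rules, that policy looks roughly like the sketch below. `allow_request` is a made-up name, and treating any file that starts with `apple-touch-icon` and ends in `.png` as an "apple icon" is an assumption.

```python
def allow_request(agent, path):
    """Allow everything except CFNetwork requests for non-icon files."""
    if "CFNetwork" not in agent:
        return True  # this policy only concerns CFNetwork user-agents
    filename = path.rsplit("/", 1)[-1].lower()
    # Assumed definition of "apple icons": apple-touch-icon*.png
    return filename.startswith("apple-touch-icon") and filename.endswith(".png")
```

Matching on the bare substring `CFNetwork` covers both the newer three-part UAs and the older standalone `CFNetwork/1.1` style.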

wilderness




msg:4518386
 6:36 pm on Nov 12, 2012 (gmt 0)

ditto to cfn.

from 2004:
64.110.237.zz - - [19/Sep/2004:07:27:54 -0700] "GET /MyFolder/MyImage.jpg HTTP/1.0" 200 1833 "-" "CFNetwork/1.1"


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved