Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 43 message thread spans 2 pages (this is page 2)
Facebook's Bots
Pfui




msg:4370126
 3:57 pm on Oct 3, 2011 (gmt 0)

Facebook bot-running from named and bare (no-rDNS) IPs isn't new --

69.171.229.246
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
robots.txt? NO

out-sw248.tfbnw.net
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
robots.txt? NO

69.171.228.245
facebookplatform/1.0 (+http://developers.facebook.com)
robots.txt? NO

-- but totally cloaked bot-running is:

69.171.240.249
AsyncHttpClient 1.0
10/0n 08:14:47

69.171.240.245
AsyncHttpClient 1.0
10/0n 08:14:46

robots.txt? NO

Got more?

 

keyplyr




msg:4414135
 3:24 am on Feb 4, 2012 (gmt 0)


@lucy24

This is the icon library that Apple products will request:

apple-touch-icon.png
apple-touch-icon-precomposed.png
apple-touch-icon-57x57.png
apple-touch-icon-57x57-precomposed.png
apple-touch-icon-72.72.png
apple-touch-icon-72x72-precomposed.png
apple-touch-icon-114x114.png
apple-touch-icon-114x114-precomposed.png
apple-touch-icon-144x144.png
apple-touch-icon-144x144-precomposed.png


The iMacs, iPhones, iPods, iPads, et al. will use these for icons on their Home screens to link to your site. If you do not have these icons in your top-level directory, the agent will take a snapshot of the page and use that image instead. I've seen these snapshots and many aren't pretty, which prompted me to create the library so I could control what represents my site on these devices.
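
For anyone who wants to check which of these files a site still lacks, here is a minimal Python sketch. It assumes the icons sit directly in the top-level directory, as described above, and it spells the 72x72 name consistently with the other sized icons. The function name missing_icons and the docroot argument are made up for illustration.

```python
from pathlib import Path

# Filenames as listed in the post above, with the 72x72 name spelled
# consistently with the other sized icons.
APPLE_TOUCH_ICONS = [
    "apple-touch-icon.png",
    "apple-touch-icon-precomposed.png",
    "apple-touch-icon-57x57.png",
    "apple-touch-icon-57x57-precomposed.png",
    "apple-touch-icon-72x72.png",
    "apple-touch-icon-72x72-precomposed.png",
    "apple-touch-icon-114x114.png",
    "apple-touch-icon-114x114-precomposed.png",
    "apple-touch-icon-144x144.png",
    "apple-touch-icon-144x144-precomposed.png",
]


def missing_icons(docroot):
    """Return the listed icon filenames that are not present in docroot.

    docroot is a hypothetical path to the site's top-level directory.
    """
    root = Path(docroot)
    return [name for name in APPLE_TOUCH_ICONS if not (root / name).is_file()]
```

Calling missing_icons on the site's document root returns the names still to be created; an empty list means all ten are in place.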



cyberdyne




msg:4418462
 7:47 pm on Feb 16, 2012 (gmt 0)

apple-touch-icon-72.72.png


apple-touch-icon-72.72.png / apple-touch-icon-72x72.png

Keyplyr, is this a typo or is it correct?

Also, any idea what size these are expected to be?

apple-touch-icon.png
apple-touch-icon-precomposed.png


Thanks

keyplyr




msg:4418484
 8:55 pm on Feb 16, 2012 (gmt 0)

Yup, typo - sorry


The apple-touch-icon.png is 57x57
The apple-touch-icon-precomposed.png is 72x72

I believe as these icons evolved and were used for larger displays, they started including the dimensions in the file name. These were the first two.

cyberdyne




msg:4418488
 9:04 pm on Feb 16, 2012 (gmt 0)

Many thanks, icons created.

wilderness




msg:4418570
 1:08 am on Feb 17, 2012 (gmt 0)

The iMacs, iPhones, iPods, iPads, et al. will use these for icons on their Home screens to link to your site.


Am I to understand that ten icons are required to stop the screen shots?

keyplyr




msg:4418609
 2:32 am on Feb 17, 2012 (gmt 0)

Am I to understand that ten icons are required to stop the screen shots?

No. Only if people like your site and wish to bookmark/link. You're probably safe - LOL.

wilderness




msg:4418614
 2:55 am on Feb 17, 2012 (gmt 0)

They may not like the design of my site, however, and considering the widget content is simply not available anywhere else, they have no choice but to bookmark ;)

lucy24




msg:4418638
 4:27 am on Feb 17, 2012 (gmt 0)

I think the question was: if you have one icon in this family, will the device scale it appropriately and use it-- or will it only recognize the specific icon that's made in its own size?

And, er, if that wasn't what the question was, it is now ;)

What does "precomposed" mean? Looks like they come in pairs.

keyplyr




msg:4418644
 5:21 am on Feb 17, 2012 (gmt 0)

if you have one icon in this family, will the device scale it

What does "precomposed" mean?


No iDea... iLive without iStuff and iCouldn't be iHappier.

I wouldn't think they get scaled or why are there different sizes?

I just know these Apple contraptions request all these icons. The larger the display, the larger the icon requested I presume. The UA attribute that does the request is CFNetwork/*

Just a FYI - CFNetwork/ can be used easily to get any image on your site, so I have something similar to this in place:

RewriteCond %{HTTP_USER_AGENT} CFNetw [NC]
RewriteRule !^apple-touch-icon(.*)?\.png$ - [F]

lucy24




msg:4418657
 6:27 am on Feb 17, 2012 (gmt 0)

When you said
(.*)?

you meant, of course,

(-\d+x\d+)?(-precomposed)?

:)

Wait, no you didn't. You meant

RewriteCond %{HTTP_USER_AGENT} CFNetw [NC]
RewriteCond %{REQUEST_URI} !apple-touch-icon
RewriteRule \.(png|jpe?g|gif)$ - [F]

:) :)

keyplyr




msg:4418658
 6:36 am on Feb 17, 2012 (gmt 0)




The rule, as written, works. It is part of a larger, more restrictive rule with other UAs and conditions. But stand alone, it does exactly what we've been talking about.
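
To see how the two variants posted above differ when each is used on its own, here is a rough Python sketch that mimics them. It assumes a per-directory (.htaccess) context, where the RewriteRule pattern is matched against the path without its leading slash. The function names are invented for illustration, and this is only an approximation of mod_rewrite, not a substitute for testing on a real server.

```python
import re

# keyplyr's pattern: anything that is NOT a touch icon gets [F].
KEYPLYR_ALLOWED = re.compile(r"^apple-touch-icon(.*)?\.png$")
# lucy24's pattern: only image files are candidates for [F].
LUCY_IMAGE = re.compile(r"\.(png|jpe?g|gif)$")


def is_cfnetwork(user_agent):
    # RewriteCond %{HTTP_USER_AGENT} CFNetw [NC]
    return re.search("CFNetw", user_agent, re.IGNORECASE) is not None


def keyplyr_forbids(path, user_agent):
    # RewriteRule !^apple-touch-icon(.*)?\.png$ - [F]
    return is_cfnetwork(user_agent) and KEYPLYR_ALLOWED.search(path) is None


def lucy_forbids(path, user_agent):
    # RewriteCond %{REQUEST_URI} !apple-touch-icon
    # RewriteRule \.(png|jpe?g|gif)$ - [F]
    return (
        is_cfnetwork(user_agent)
        and "apple-touch-icon" not in path
        and LUCY_IMAGE.search(path) is not None
    )
```

Under these assumptions both variants let the touch icons through and both refuse other images to a CFNetwork user agent. The difference shows up with non-image paths such as index.html: the first variant, used stand alone, forbids them too, while the second leaves them alone.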

cyberdyne




msg:4418871
 5:22 pm on Feb 17, 2012 (gmt 0)

That /../ pattern has long been a tell of scrapers on my sites. And not wanting to code Yet Another Workaround for a major, the pattern remains blocked. Luckily there's no Fb prob: People simply don't see any graphic when they make a link. And they still make 'em.

The above got me thinking about protecting against this activity, so I subsequently found the following and thought I would share it.

Here's another rule set that blocks many HTTP-scanners, maybe someone will find it useful:


RewriteEngine On
RewriteCond %{QUERY_STRING} [^?]*\? [OR]
RewriteCond %{QUERY_STRING} (\.\./|\.\.\\) [OR]
RewriteCond %{QUERY_STRING} (///) [OR]
RewriteCond %{THE_REQUEST} "^(GET|POST) /?https?:" [OR]
RewriteCond %{THE_REQUEST} "^(GET|POST|HEAD) //"
RewriteRule (.*) $1 [F]


The first RewriteCond checks whether the query string contains a further question mark, i.e. the URL has more than one (this pattern is used in some attacks; moreover, extra question marks should be encoded to %3F). The second tries to prevent directory traversal attacks (for both Windows and Linux hosts). The third disallows three or more consecutive slashes in the query string (a common pattern in many attacks). The fourth and fifth stop proxy checkers.
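
A quick way to get a feel for which requests this rule set would refuse is to replay the five conditions in Python. This is only a sketch: the regular expressions are copied from the rules above, but would_forbid is a made-up helper, and it assumes QUERY_STRING is everything after the first "?" in the request target and THE_REQUEST is the raw request line.

```python
import re

# Conditions tested against QUERY_STRING.
QUERY_PATTERNS = [
    re.compile(r"[^?]*\?"),         # a further "?" inside the query string
    re.compile(r"(\.\./|\.\.\\)"),  # ../ or ..\ (directory traversal)
    re.compile(r"(///)"),           # three consecutive slashes
]

# Conditions tested against THE_REQUEST (the full request line).
REQUEST_PATTERNS = [
    re.compile(r"^(GET|POST) /?https?:"),  # absolute URL as the target
    re.compile(r"^(GET|POST|HEAD) //"),    # target starting with two slashes
]


def would_forbid(request_line):
    """Return True if any of the five OR'ed conditions matches."""
    parts = request_line.split(" ")
    target = parts[1] if len(parts) > 1 else ""
    # Everything after the first "?" is treated as the query string.
    _, _, query = target.partition("?")
    if any(p.search(query) for p in QUERY_PATTERNS):
        return True
    return any(p.search(request_line) for p in REQUEST_PATTERNS)
```

For example, would_forbid("GET /index.php?id=1 HTTP/1.1") is False, while a request line such as "GET /page?file=../../etc/passwd HTTP/1.1" or "GET http://example.com/ HTTP/1.1" comes back True.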

Ref:
[perishablepress.com...]

keyplyr




msg:4418994
 10:22 pm on Feb 17, 2012 (gmt 0)

Most good hosts block all that.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved