Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
facebookexternalhit
Facebook image spider
urbanadventureorg




msg:3552154
 10:39 pm on Jan 18, 2008 (gmt 0)

I was getting a few hits to images on my site from facebookexternalhit/1.0.
These were a bit of a concern, as hotlinking from my site has been a big issue and costs me money. I checked the accompanying page <snip> and found much the same text as is posted here: [webmasterworld.com...]

"Facebook allows its users to send links to interesting web content to other Facebook users. Part of how this works on the Facebook system involves the temporary display of certain images or details related to the web content, such as the title of the webpage or the embed tag of a video. Our system retrieves this information only after a user provides us with a link. You may have found this page because a Facebook user sent a link from your website to other Facebook users. If you have any questions or concerns about any links or content sent by one of our users, please contact us at legal@facebook.com."

The only problem is, what they say is not quite true. In fact it is essentially bull dust. I set up a Facebook page and fetched images from my web site. I was able to post these images with no link to my web site. Essentially it is under the radar hot linking of images, and in the case of my web site, it is against my copyright policy, and illegal. So I wrote to legal@facebook.com asking for a fair usage fee or instructions on how to block their spider. No reply from them yet.

Also, it does not seem to obey robots.txt

I block them like this, which seems to work effectively.

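Options +FollowSymLinks
RewriteEngine on
# Match the Facebook fetcher by user agent, or any request that sends a referrer...
RewriteCond %{HTTP_USER_AGENT} ^facebookexternalhit.*$ [OR]
RewriteCond %{HTTP_REFERER} !^$
# ...unless that referrer is my own site
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com(/)?.*$ [NC]
# Redirect image requests to a placeholder (.jpe is not in the list, so it does not loop)
RewriteRule .*\.(gif|jpg|jpeg|bmp)$ http://example.com/nohotlink.jpe [R,NC]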

[edited by: volatilegx at 9:45 pm (utc) on Jan. 21, 2008]
[edit reason] removed URL [/edit]

 

volatilegx




msg:3553970
 9:47 pm on Jan 21, 2008 (gmt 0)

Thanks for the info, urbanadventureorg, and welcome to WebmasterWorld :)

Pfui




msg:3554011
 10:35 pm on Jan 21, 2008 (gmt 0)

They're also .tfbnw.net --

A.k.a.

Facebook, Inc
156 University Ave, 3rd Floor
Palo Alto, CA 94301

E.g.:

out031.sctm.tfbnw.net
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)

A.k.a.

204.15.20.158

A.k.a.

Thefacebook.com
156 University Ave, 3rd floor
Palo Alto, CA 94301

NetRange: 204.15.20.0 - 204.15.23.255
CIDR: 204.15.20.0/22

I'm not happy with them so I block their UA and Host (.tfbnw.net). One of these days I'll grep their IPs and probably have to block those, too.
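
For anyone wanting to do the same, blocking by UA, host and IP range might look roughly like the sketch below. It assumes Apache 2.2-style access control (mod_setenvif plus Order/Deny) in .htaccess or a <Directory> block. The variable name block_fb is made up for illustration, and the range is the one from the whois output above. Treat it as a starting point, not a tested config.

```apache
# Flag requests whose User-Agent starts with facebookexternalhit
# (block_fb is a hypothetical variable name)
SetEnvIfNoCase User-Agent "^facebookexternalhit" block_fb

Order Allow,Deny
Allow from all
# Deny by user agent
Deny from env=block_fb
# Deny by host name suffix
Deny from .tfbnw.net
# Deny by IP range (CIDR from the whois record)
Deny from 204.15.20.0/22
```

Keep in mind that denying by host name makes Apache do a reverse DNS lookup on the client address, so the IP range is the cheaper of the two checks.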

keyplyr




msg:3554265
 5:37 am on Jan 22, 2008 (gmt 0)

I agree with what's been said so far. I also found it necessary to block the UA.

wilderness




msg:3554305
 7:02 am on Jan 22, 2008 (gmt 0)

Some of these pest links may provide unusual benefits, as long as the bandwidth isn't too excessive and your images are not inline linked.

There is a crossword puzzle blog whose creator occasionally inserts words from my widget topics into his puzzles.
The result gives visitors an insight into something they were not previously aware of (and leads them to my websites).

I could have easily denied or redirected these referrals, however I thought it was a good thing.

keyplyr




msg:3554368
 9:53 am on Jan 22, 2008 (gmt 0)

Some of these pest links may provide unusual benefits...

Right you are.

If the referrer is not a variant of my domain, the hot-linked image is switched via mod_rewrite to an image which says:

Shame on me!
I am attempting to steal an image from:
www.my-domain.com

Then a script switches the hot-linked image to a 43-byte clear GIF. It still displays the "shame on me..." image to those who send their referrer, but users who hide their referrer see nothing.
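
The mod_rewrite half of that could be sketched along these lines. This is only a rough illustration, not my actual rules: the file name /shame.gif and the list of extensions are assumptions, and the script that later swaps in the clear GIF is not shown.

```apache
RewriteEngine On
# Only act when a referrer is sent...
RewriteCond %{HTTP_REFERER} !^$
# ...and it is not some variant of my own domain
RewriteCond %{HTTP_REFERER} !^https?://([a-z0-9-]+\.)?my-domain\.com(/|$) [NC]
# Never rewrite the substitute image itself (avoids a loop)
RewriteCond %{REQUEST_URI} !/shame\.gif$
# Serve the substitute image in place of the requested one (internal rewrite, no redirect)
RewriteRule \.(gif|jpe?g|png)$ /shame.gif [NC,L]
```

Because this is an internal rewrite rather than a redirect, the hotlinking page still shows an image at the original URL, just not the one it asked for.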
