Facebook Thumbnails

facebook thumbnails htaccess

         

cyberdyne

12:45 pm on Feb 17, 2012 (gmt 0)

10+ Year Member



I feel like a complete free-loader asking you guys yet another coding question, but I assure you I have tried to do this without help. I clearly need a little push in the right direction, and you're always so helpful it's difficult to resist. ;-)

I want Facebook to be permitted to use only the image I designate as a thumbnail in FB mail, etc. They've recently used images which are completely unrepresentative of my site.

I tried using <meta property:"og: ....> tags (despite the fact that they break validation), but they don't appear to be working completely, as one or two images other than the one specified are still being used.
Does FB cache your images? If so, this may be why certain images other than that which I've specified are being used from a recently scraped page. If not, then my code clearly isn't working.

I also tried using robots.txt to Disallow 'facebookexternalhit' from my image directory, permitting it only the one image named in the meta tags above, from a designated folder, but I then realised that of course FB doesn't respect robots.txt!

So I'm considering using htaccess to force a specific image upon FB when they request one. I have been playing with the following code, bundled together from snippets of other rules, but as yet it does not appear to be working.

Pointers very much appreciated.
Thank you in advance.


# PERMIT FACEBOOK TO USE ONLY THE DESIRED IMAGE
RewriteCond %{REQUEST_URI} ^.*\.(jpe?g|gif|png)$ [NC]
RewriteCond %{REMOTE_HOST} ^https?://(www\.)?facebook\.com(/)?.*$ [NC,OR]
RewriteCond %{REMOTE_HOST} ^https?://(www\.)?tfbnw\.net(/)?.*$ [NC,OR]
RewriteCond %{REMOTE_ADDR} ^66\.220\.146\. [NC,OR]
RewriteCond %{REMOTE_ADDR} ^69\.63\.181\. [NC,OR]
RewriteCond %{REMOTE_ADDR} ^69\.171\.2 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^facebookexternalhit [NC]
RewriteRule .*\./dir/image.png [F]

wilderness

2:30 pm on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{REMOTE_HOST} ^https?://(www\.)?facebook\.com(/)?.*$ [NC,OR]
RewriteCond %{REMOTE_HOST} ^https?://(www\.)?tfbnw\.net(/)?.*$ [NC,OR]


cyberdyne,
FWIW, you should be able to combine these two on a single line. CAUTION, however: these types of requests are CPU-intensive and could slow down your entire site.
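
A hedged sketch of what the single-line version might look like. REMOTE_HOST holds a bare hostname (or just the IP address when no reverse lookup is done), never a URL, so the pattern below drops the https?:// prefix and anchors on the end of the name instead. It only does anything useful if the server actually resolves client hostnames, which is the expensive part.

```apache
# Both hostnames in one condition, as a drop-in for the two quoted lines.
# Sketch only: assumes reverse DNS lookups are available to mod_rewrite
# (for example via HostnameLookups); otherwise REMOTE_HOST is just the IP.
RewriteCond %{REMOTE_HOST} (^|\.)(facebook\.com|tfbnw\.net)$ [NC,OR]
```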

I'd suggest a simple htaccess within your image folder that denies only FB (and perhaps some other image pests) while allowing most, based upon IP and UA alone. It will result in a much quicker process and separate your other server requests from these.
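
As a rough sketch of that idea, an .htaccess placed in the image folder could look like the following. It uses Apache 2.2 access-control directives. The variable name fb_fetch is made up for illustration, and the single IP prefix is simply the one from the rules posted above, not an authoritative list.

```apache
# /images/.htaccess (sketch, Apache 2.2 syntax)
# Flag requests whose User-Agent contains "facebookexternalhit".
# "fb_fetch" is a hypothetical variable name.
SetEnvIfNoCase User-Agent "facebookexternalhit" fb_fetch

# Allow everyone by default, then deny the flagged UA and the IP prefix.
# With "Order Allow,Deny" a request matching both Allow and Deny is denied.
Order Allow,Deny
Allow from all
Deny from env=fb_fetch
Deny from 66.220.146
```

On Apache 2.4 these directives only work through mod_access_compat; the native equivalent there is built from Require lines.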



Is FB making image requests (at least consistently) from anything other than the 66.220 IP?

cyberdyne

2:40 pm on Feb 17, 2012 (gmt 0)

10+ Year Member



Understood, thanks, will give that a go.

Yes, FB (U-A: facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)) requests regularly from:

69.171.224.24(5, 7, 8)
69.171.23(1, 2, 3, 4, 5 ,7)

lucy24

9:55 pm on Feb 17, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



69.171.23? Ouch, haven't met that one. I've got the ranges flagged as
66.220.144.0/20
69.63.144.0/21
69.171.224.0/19
But I block by UA. You haven't much choice if you want to prevent hotlinking, since FB's whole recommendation system is based on actively facilitating hotlinks. They gather all the images; a human user picks the one they like.

Mine's a simple and brutal

RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC]
RewriteRule \.(jpe?g|gif|png)$ - [F]
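
For anyone who would rather match those three ranges by address inside mod_rewrite, here is a sketch of the equivalent regular expressions. The octet arithmetic is an addition, not part of the post above: a /20 starting at 144 covers third octets 144 to 159, a /21 starting at 144 covers 144 to 151, and a /19 starting at 224 covers 224 to 255. A prefix such as ^69\.171\.2 is much broader, since it also matches 69.171.2.x, 69.171.20.x through 69.171.29.x, and everything from 69.171.200.x up. The sketch assumes RewriteEngine On is already set.

```apache
# 66.220.144.0/20  -> third octet 144-159
RewriteCond %{REMOTE_ADDR} ^66\.220\.1(4[4-9]|5[0-9])\. [OR]
# 69.63.144.0/21   -> third octet 144-151
RewriteCond %{REMOTE_ADDR} ^69\.63\.1(4[4-9]|5[01])\. [OR]
# 69.171.224.0/19  -> third octet 224-255
RewriteCond %{REMOTE_ADDR} ^69\.171\.2(2[4-9]|[3-5][0-9])\.
# Refuse image requests from any of the ranges above.
RewriteRule \.(jpe?g|gif|png)$ - [F]
```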

cyberdyne

10:16 pm on Feb 17, 2012 (gmt 0)

10+ Year Member



Sorry Lucy, my bad. That was meant to read:

69.171.224.245
69.171.224.247
69.171.224.248

69.171.234.1
69.171.234.2
69.171.234.3
69.171.234.4
69.171.234.5
69.171.234.7

Also, today:
69.171.228.244
69.171.228.245
69.171.228.246
69.171.228.247
69.171.228.248
69.171.228.249
69.171.228.250
69.171.228.251

69.171.229.244
69.171.229.245
69.171.229.247
69.171.229.248
69.171.229.249
69.171.229.251

I actually implemented a block similar to yours in my /images/ directory htaccess this evening but permitted one image of my choosing. So far that and the <meta> tags seem to be working well together, apart from what they already seemingly had cached from one page in particular.
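
For reference, a block of that shape might look like the sketch below. This is not the actual file described above: the name fb-thumb.png is a placeholder for whichever image is designated, and the rules assume they live in the .htaccess of the image directory.

```apache
RewriteEngine On
# Only act on Facebook's fetcher...
RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC]
# ...and let the one designated image through (hypothetical file name).
RewriteCond %{REQUEST_URI} !/fb-thumb\.png$ [NC]
# Every other image request from that UA gets a 403.
RewriteRule \.(jpe?g|gif|png)$ - [F,NC]
```

Consecutive RewriteCond lines are ANDed unless flagged [OR], so the exception only applies to requests that already matched the user-agent test.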