

Blocking FaceBook crawler / FB IP's

Will I interfere with link-sharing?


SumGuy

1:55 am on Dec 13, 2020 (gmt 0)

5+ Year Member Top Contributors Of The Month



When someone shares a link on Facebook or in a Messenger conversation, the Facebook Crawler will crawl the HTML of the shared page to fetch data such as the page title, meta description, and thumbnail image, which are used to generate the preview.
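As a rough illustration of the kind of data described above, here is a minimal sketch that pulls the title, meta description, and thumbnail URL out of a page's HTML. It is not Facebook's actual crawler logic; the class name `PreviewParser` and the function `extract_preview` are made up for this example, and the set of tags it looks at is an assumption.

```python
from html.parser import HTMLParser

# Meta keys this sketch looks for (an assumed, non-exhaustive set).
WANTED = ("og:title", "og:description", "og:image", "description")


class PreviewParser(HTMLParser):
    """Collects <title> text and a few <meta> values from an HTML page."""

    def __init__(self):
        super().__init__()
        self.preview = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph tags use property="..."; the plain description uses name="...".
            key = attrs.get("property") or attrs.get("name")
            if key in WANTED and attrs.get("content"):
                self.preview.setdefault(key, attrs["content"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.preview["title"] = self.preview.get("title", "") + data


def extract_preview(html):
    """Return a dict of whatever preview fields were found in the HTML."""
    parser = PreviewParser()
    parser.feed(html)
    parser.close()
    return parser.preview
```

If the crawler is blocked, none of these fields can be fetched, which is why the preview comes up empty.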

This mechanism was abused (until recently?) by crawlers and scrapers using working FB account tokens to pull data from websites - notably e-commerce sites with product listings that have bot-blocking in place but white-list FB IP's.

Since I block Fecebook and occasionally see hits (facebookexternalhit) from FB IP addresses, I would imagine that some of these hits are indeed people sharing a URL to one of my pages. My question is: what is the FB user experience when FB is not able to generate a preview for a shared link? Is the URL still transmitted from the sender to the receiver? Or is the message sent, possibly without the URL? (I am not and never have been a fecebook user, so I can't test this myself, and I'm only vaguely aware of how FB works from a user POV.)

tangor

3:06 am on Dec 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



(just had this experience)

FB throws a 403 warning to the FB user, which scares the heck out of them

HOWEVER, if the fb user wants to see the content, all they have to do is click a second time and they will get the pass-through to your site.

Your raw logs will look ugly.

Unfortunately, fb is not going to change as they want EVERYTHING possible so, if you want fb traffic you either take a stand or bend over. Can't advise on what you do. :)

tangor

3:09 am on Dec 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Note: the 403 warning is from fb to their users! If you get the traffic by an fb user clicking through, it looks normal ... the ugly part is that fbexternalhit keeps coming and you see all kinds of "forbidden" entries.

SumGuy

3:43 am on Dec 13, 2020 (gmt 0)

5+ Year Member Top Contributors Of The Month



So clicking on a shared link doesn't spawn an external browser on the user's device, but instead is rendered within the FB app?

graeme_p

7:49 am on Dec 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Could you serve different data to FB? So they get the Open Graph data FB needs but not the rest of the page?
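A minimal sketch of that idea, assuming the crawler identifies itself with "facebookexternalhit" in its User-Agent (as in the log hits mentioned earlier in the thread). The function names and the choice of Open Graph fields are hypothetical, and since a User-Agent is trivially spoofed, this alone would not stop scrapers that imitate the crawler.

```python
from html import escape

# Substring expected in the crawler's User-Agent header (assumption based on the thread).
FB_CRAWLER_TOKEN = "facebookexternalhit"


def is_fb_crawler(user_agent):
    """True if the User-Agent claims to be the Facebook link-preview crawler."""
    return FB_CRAWLER_TOKEN in (user_agent or "").lower()


def og_only_page(title, description, image_url, page_url):
    """Build a stripped-down page carrying only the preview metadata."""
    tags = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
        "og:url": page_url,
    }
    metas = "\n".join(
        f'<meta property="{name}" content="{escape(value, quote=True)}">'
        for name, value in tags.items()
    )
    return (
        "<!doctype html>\n<html><head>\n"
        f"<title>{escape(title)}</title>\n{metas}\n"
        "</head><body></body></html>"
    )


def choose_response(user_agent, full_html, **og):
    """Serve the metadata-only page to the crawler, the full page to everyone else."""
    if is_fb_crawler(user_agent):
        return og_only_page(**og)
    return full_html
```

In a real setup the same decision would be made in the web server or application framework; the point is only that the crawler gets enough to build a preview and nothing else.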

tangor

8:00 am on Dec 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



but instead is rendered within the FB app?


Correct.

Keeps the user on fb and is blackmail to webmasters to allow their "link checking" bots in.

Since there are only about 1,000 in the ENTIRE WORLD that want to hit one of my sites I blocked fb ...

And then had all kinds of terror-crazed folks from that 1,000 send me emails asking WTF?

I poked some holes to let fb back in, but dang it, they are not playing fair even now, and much as I love the piddling little traffic I get from SOME fb users, it just isn't worth it. About to nuke it once again.

NOTE TO OTHERS: if you are wondering why your fb interactions aren't going as you wish make sure you do NOT block any fb ranges ... OR THE FB PROXIES which I am still trying to map out as they are not the same as their published ip ranges. But that's a different story.
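For anyone auditing their own block or allow lists, here is a minimal sketch of checking a logged IP against a set of CIDR ranges with Python's standard `ipaddress` module. The ranges below are PLACEHOLDERS taken from the reserved documentation blocks, not Facebook's real ranges; substitute whatever list you actually maintain. The function name `ip_in_ranges` is made up for this example.

```python
import ipaddress

# PLACEHOLDER ranges (reserved documentation blocks), NOT real Facebook ranges.
ALLOWED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")
]


def ip_in_ranges(ip, ranges=ALLOWED_RANGES):
    """Return True if the address falls inside any of the given networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable IPv4/IPv6 address (e.g. a garbled log field).
        return False
    # Only compare addresses and networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in ranges)
```

Running logged facebookexternalhit addresses through a check like this shows which hits fall outside the ranges you have on file, which is one way to start mapping the proxies.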

dstiles

11:23 am on Dec 13, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



but instead is rendered within the FB app?

Not sure how that would work with the current Content-Security-Policy setup applied to (supposedly) most HTTPS sites now. That policy (and others) block your content from appearing in other web sites.

I allow the fb bot, but in my view it's no longer restricted to URLs entered in facebook. I see quite a few hits for sites which are extremely unlikely to appear in normal postings.

The proxies - are they actually proxies, or just freelance scrapers working through the fb list?

SumGuy

1:19 am on Dec 14, 2020 (gmt 0)

5+ Year Member Top Contributors Of The Month



Ok, the URL or link might be rendered in the FB app, but does the interaction happen through an FB proxy IP or does the app have direct internet access to the URL?