Facebook?

Search engine, what?

         

jake66

9:09 am on Dec 30, 2009 (gmt 0)

10+ Year Member



This IP: 69.63.178.250

has been hitting me regularly and got snapped up by my filters. I can't find any info about it online at all.

Does anyone know what it's for? Safe to whitelist? (I imagine any Facebook IP wouldn't be harmful, but I have my doubts about some of the apps)

Pfui

6:34 pm on Dec 30, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



1.) What was the User-agent, please?

2.) I've had hits from neighboring 69.63.182.122 related to Facebook and using atypical (one new, ugly) UAs. More details here:

php-openid/2.1.1 (php/hphp) curl/7.19.4 : Something new from Facebook comes... [webmasterworld.com]

3.) Historically, my hits from Facebook Hosts/IPs have been link-checking. In the aftermath of the above-mentioned unknowns, I've 403'd all hits. Doing so has had no apparent effect on Facebookers continuing to include (or follow) links.
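For anyone wanting to try the same 403 approach, here is a minimal sketch of the decision logic in Python. It is not how any poster here actually does it: the function name and the substring-matching rule are assumptions for illustration, and the tokens are simply the user-agent strings quoted in this thread.

```python
# Tokens come from the user-agents quoted in this thread.
# The function name and substring matching are illustrative assumptions.
BLOCKED_UA_TOKENS = (
    "facebookexternalhit",
    "facebook share follower",
    "php-openid",
)


def status_for_user_agent(user_agent: str) -> int:
    """Return 403 if the user-agent contains a blocked token, else 200."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in BLOCKED_UA_TOKENS):
        return 403
    return 200


if __name__ == "__main__":
    for ua in ("php-openid/2.1.1 (php/hphp) curl/7.19.4", "Mozilla/5.0"):
        print(status_for_user_agent(ua), ua)
```

User-agents are trivially spoofed, so a check like this only filters clients that identify themselves honestly.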

jake66

4:11 am on Dec 31, 2009 (gmt 0)

10+ Year Member



facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)

The only real mentions of this I found on the web were from the honeypot project: [projecthoneypot.org...]

Pfui

8:47 am on Dec 31, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't know what Facebook is doing with all of its UAs and servers nowadays, or why. What I do know is that on Christmas Eve, they concertedly hit one file eight times using four servers and two UAs; and six of the hits were in the exact same second:

out.250.01.snc1.facebook.com
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
21:56:09

out003.05.snc1.tfbnw.net
Facebook share follower
21:57:00
21:57:00

out002.05.snc1.tfbnw.net
Facebook share follower
21:57:00
21:57:00

out001.05.snc1.tfbnw.net
Facebook share follower
21:57:00
21:57:00
21:57:00

Oh, and not a robots.txt URI in the bunch.

Regardless, the first 403 would've been a sufficient signal to better-behaved/programmed apps. Of course, it's up to you what you want to do when they hit you.
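To spot same-second bursts like the Christmas Eve one without eyeballing the logs, something along these lines might help. It is only a sketch: the tuples are transcribed from the listing above as printed, while the `bursts` function and its `min_hits` parameter are made-up names, and a real log would need parsing first.

```python
from collections import defaultdict

UA_EXTERNALHIT = "facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)"
UA_SHARE = "Facebook share follower"

# (host, user_agent, time) tuples transcribed from the listing above.
HITS = [
    ("out.250.01.snc1.facebook.com", UA_EXTERNALHIT, "21:56:09"),
    ("out003.05.snc1.tfbnw.net", UA_SHARE, "21:57:00"),
    ("out003.05.snc1.tfbnw.net", UA_SHARE, "21:57:00"),
    ("out002.05.snc1.tfbnw.net", UA_SHARE, "21:57:00"),
    ("out002.05.snc1.tfbnw.net", UA_SHARE, "21:57:00"),
    ("out001.05.snc1.tfbnw.net", UA_SHARE, "21:57:00"),
    ("out001.05.snc1.tfbnw.net", UA_SHARE, "21:57:00"),
    ("out001.05.snc1.tfbnw.net", UA_SHARE, "21:57:00"),
]


def bursts(hits, min_hits=2):
    """Map each timestamp with at least min_hits hits to (hit count, sorted hosts)."""
    by_second = defaultdict(list)
    for host, _user_agent, stamp in hits:
        by_second[stamp].append(host)
    return {
        stamp: (len(hosts), sorted(set(hosts)))
        for stamp, hosts in by_second.items()
        if len(hosts) >= min_hits
    }


if __name__ == "__main__":
    for stamp, (count, hosts) in bursts(HITS).items():
        print(stamp, count, hosts)
```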

jake66

9:18 pm on Dec 31, 2009 (gmt 0)

10+ Year Member



Bizarre! I'm wondering if these are from the "post a link to a friend" thingy.

When I send a friend a link on Facebook, they (Facebook) take a snapshot of the page I'm sending and post it in the bottom of my message.

However, if they're hitting robots.txt... that makes its behavior all the more odd. Why would a screenshot link attachment be reading robots.txt?

dstiles

9:38 pm on Dec 31, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So they are stealing content and making use of it illegally? I wonder if they include my 403 form "Please fill in if our rejection is inappropriate". :)

keyplyr

12:36 am on Jan 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've allowed, then blocked, then allowed, now blocked again due to hot-linking. Tired of seeing it in my logs.

Image requests from remote referrers are swapped to a small .png displaying rhetoric about ethical behavior regarding bandwidth theft; however, serving that file 5,000 times an hour to a mass of cost-indifferent teenage girls is not what I had in mind.

Related thread: [webmasterworld.com...]
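The referrer-based image swap described above could be sketched roughly as follows. This is an illustration, not the actual setup on any poster's server: `OWN_HOSTS`, `PLACEHOLDER`, and `image_to_serve` are hypothetical names, and in practice this kind of rule usually lives in the web server configuration rather than in application code.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical values: substitute your own host names and placeholder path.
OWN_HOSTS = {"example.com", "www.example.com"}
PLACEHOLDER = "/images/hotlink-notice.png"


def image_to_serve(requested_path: str, referer: Optional[str]) -> str:
    """Pick the image path to serve based on the request's Referer header."""
    if not referer:
        # No Referer (direct request, privacy tools): serve the real image.
        return requested_path
    host = (urlparse(referer).hostname or "").lower()
    if host in OWN_HOSTS:
        # Request came from one of our own pages.
        return requested_path
    # Remote (or unparseable) referrer: swap in the small placeholder.
    return PLACEHOLDER


if __name__ == "__main__":
    print(image_to_serve("/img/photo.jpg", "http://www.facebook.com/profile.php"))
```

Note that the placeholder is still served on every hot-linked request, which is exactly the load complained about above; returning a 403 instead avoids sending a file at all.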

Pfui

4:17 am on Jan 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@jake66: Just to clarify... None of the Facebook-related hits to my server/sites ever request robots.txt. Never have.

@dstiles: Certain WHOIS sites showing obtained-without-permission thumbs feature my 403 all too prominently. Nifty thing is, because my 403 displays the visitor's UA and IP, larger shots show bots' IPs and UAs, too. :)

dstiles

11:09 pm on Jan 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So do mine. :)

Pfui

3:27 am on Mar 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



out-w244.tfbnw.net
php-openid/2.1.1 (php/5.2.5.hiphop) curl/7.19.4

robots.txt? NO

A little over two months ago, above, I wrote: "I don't know what Facebook is doing with all of its UAs and servers nowadays, or why." I still don't.

From the FYI links below, HipHop's a cool code thing. From my POV, that HipHop hit's just another mostly-cloaked Host* running a no-robots.txt-requesting, no-info-URL, uncool bot thing.
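Since these hits arrive from more than one domain, a small helper for recognizing the hostnames can be handy when reading logs. The sketch below assumes the reverse-DNS name has already been resolved. The two suffixes are taken from the hostnames quoted in this thread, and `is_facebook_host` is a made-up name.

```python
# Domain suffixes taken from the hostnames quoted in this thread.
FACEBOOK_SUFFIXES = ("facebook.com", "tfbnw.net")


def is_facebook_host(hostname: str) -> bool:
    """True if hostname equals, or is a subdomain of, a listed suffix."""
    name = hostname.strip().rstrip(".").lower()
    return any(
        name == suffix or name.endswith("." + suffix)
        for suffix in FACEBOOK_SUFFIXES
    )


if __name__ == "__main__":
    for host in ("out-w244.tfbnw.net", "facebook.com.example.org"):
        print(host, is_facebook_host(host))
```

Matching on a leading dot keeps look-alikes such as notfacebook.com from passing. A reverse-DNS name is set by whoever controls the IP block, so it is worth confirming that the name resolves back to the same IP before trusting it.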

-----
*tfbnw... Uhh-huh. Let's see:

The FaceBook NetWork?
True Friends Belong in the NorthWest?
True/False: Burlington Northern with Warren?
Territorial Force Baits Net Watchdogs?

: )
-----
FYI, from WW's PHP Server Side Scripting forum:

Facebook release HipHop for PHP
"With HipHop we've reduced [CPU usage] on our Web servers by 50%"
[webmasterworld.com...]

HipHop for PHP: Move Fast
[developers.facebook.com...]

jake66

4:01 am on Mar 12, 2010 (gmt 0)

10+ Year Member



I did some tests with this recently after wiping my logs.

Whenever I posted a link to someone via Facebook and a thumbnail tried to load, I got a hit in my logs from this IP at nearly the exact same second.

It's a thumbnail scraper. It also grabs a word-snippet from the page to show beside the thumbnail.

If that IP does anything besides that, I can't tell.