CFNetwork again.

More junk from Yahoo?

         

GaryK

7:33 pm on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Safari/5526.11.2 CFNetwork/339.5 Darwin/9.5.0 (i386) (MacBookPro2,2)
209.131.62.nnn
nat-dip6.fw.corp.yahoo.com
OrgName: Yahoo
OrgID: YHOO
Address: 701 First Ave
City: Sunnyvale
StateProv: CA
PostalCode: 94089
Country: US

Took favicon.ico three times and left.

I've seen CFNetwork mentioned on WebmasterWorld before, but I don't think I've ever seen it coming from a Yahoo corporate IP Address.
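For anyone who wants to pull these apart in a log analyzer, here is a minimal sketch that splits a UA string of the shape quoted above into its parts. The layout (`App/version CFNetwork/version Darwin/version (arch) (model)`) is inferred only from the strings seen in this thread, and the function and field names are made up for illustration, so treat it as an assumption rather than a documented format.

```python
import re

# Assumed layout, inferred from the UA strings quoted in this thread:
#   <App>/<version> CFNetwork/<version> Darwin/<version> (<arch>) (<model>)
# The two parenthesised groups are treated as optional.
CFNETWORK_UA = re.compile(
    r"^(?P<app>[^/\s]+)/(?P<app_version>\S+)\s+"
    r"CFNetwork/(?P<cfnetwork>\S+)\s+"
    r"Darwin/(?P<darwin>\S+)"
    r"(?:\s+\((?P<arch>[^)]+)\))?"
    r"(?:\s+\((?P<model>[^)]+)\))?$"
)


def parse_cfnetwork_ua(ua):
    """Return a dict of UA parts, or None if the string does not fit the assumed layout."""
    match = CFNETWORK_UA.match(ua.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    ua = "Safari/5526.11.2 CFNetwork/339.5 Darwin/9.5.0 (i386) (MacBookPro2,2)"
    print(parse_cfnetwork_ua(ua))
```

An ordinary browser UA that starts with `Mozilla/5.0 (...` does not fit the layout and comes back as `None`, which makes it easy to separate the two kinds of request.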

Samizdata

9:10 pm on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



CFNetwork is sometimes used by Safari on Mac OS X - often (but not always) just for favicons.

I recently saw a case where each normal page request was followed by a CFNetwork call to the same page.

Serving a 403 did not seem to do any harm.

Nice to know that someone at Yahoo can still afford a MacBook Pro.

...

GaryK

9:17 pm on Nov 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nice to know that someone at Yahoo can still afford a MacBook Pro.

Must belong to Jerry. ;)

I appreciate the heads-up about CFNetwork. I went back to my log file analyzer and found this UA string crawling from the same IP address at exactly the same time:

Mozilla/5.0+(Macintosh;+U;+Intel+Mac+OS+X+10_5_5;+en-us)+AppleWebKit/525.18+(KHTML,+like+Gecko)+Version/3.1.2+Safari/525.20.1

keyplyr

8:06 am on Nov 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's been coming from many IP addresses lately, always for favicon. I used to have it blocked as it can be used as a download tool, but have recently had to allow it since I spent all that time creating my nifty little icon.

GaryK

5:16 pm on Nov 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's been coming from many IP addresses lately, always for favicon

It hit my sites again last week from numerous IP addresses, just like you stated. The pattern seems to be: 1) get favicon.ico with a UA that has CFNetwork in it, 2) get most of the other files with some variant of a Safari UA, 3) get the files I offer for download with a curl-based UA.

One of the IP Addresses belonged to Apple:

OrgName: Apple Computer, Inc.
OrgID: APPLEC-3
Address: 20740 Valley Green Drive, MS32E
City: Cupertino
StateProv: CA
PostalCode: 95014
Country: US

What I don't understand, and maybe someone can help me with this, is why it takes all the other files using different UAs when what it really seems to want are the browser project files. If all it did was download my project files I wouldn't have any problem with it. I can't ban curl because too many people use it legitimately, so I'm really stuck here. It apparently does no good to ban CFNetwork because of one favicon.ico file. I can't ban the Safari UA because it matches legitimate Safari UAs.
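Since none of the three UAs can be banned on its own, one option is to flag the combination instead. The sketch below is only an illustration of that idea: it takes already-parsed log rows as `(ip, path, ua)` tuples and reports the IPs that show all three steps of the pattern described above. The `/downloads/` prefix, the function names, and the UA substring tests are assumptions, not anything taken from a real log analyzer, and the check ignores timing entirely.

```python
from collections import defaultdict

# Hypothetical location of the downloadable project files; adjust to the real site.
DOWNLOAD_PREFIX = "/downloads/"


def classify_ua(ua):
    """Crude UA bucketing by substring. CFNetwork is tested first because
    its UA string can also begin with "Safari/"."""
    ua = ua.replace("+", " ")  # some log formats write spaces as "+"
    if "CFNetwork/" in ua:
        return "cfnetwork"
    if "curl" in ua.lower():
        return "curl"
    if "AppleWebKit/" in ua and "Safari/" in ua and "Chrome/" not in ua:
        return "safari"
    return "other"


def suspicious_ips(requests):
    """Return the sorted IPs that fetched favicon.ico via CFNetwork, other files
    via a Safari UA, and download files via curl."""
    steps_seen = defaultdict(set)
    for ip, path, ua in requests:
        kind = classify_ua(ua)
        if kind == "cfnetwork" and path.endswith("/favicon.ico"):
            steps_seen[ip].add("favicon")
        elif kind == "safari":
            steps_seen[ip].add("pages")
        elif kind == "curl" and path.startswith(DOWNLOAD_PREFIX):
            steps_seen[ip].add("downloads")
    wanted = {"favicon", "pages", "downloads"}
    return sorted(ip for ip, steps in steps_seen.items() if steps == wanted)
```

A list like this is better suited to review than to automatic blocking, since several legitimate users behind one NAT address could produce the same mix of UAs.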

caribguy

6:22 am on Nov 25, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This one apparently belongs to an "image collector's web spider & search agent"

Pandora/1.9.8 CFNetwork/339.5 Darwin/9.5.0 (i386) (MacPro3,1)

Samizdata

7:31 pm on Nov 28, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pandora/1.9.8 CFNetwork/339.5 Darwin/9.5.0 (i386) (MacPro3,1)

Pandora (previously "Netscrape") can be set to search for and download most media files.

In its default state it scrapes via Google Image Search, but it has many configuration options and can be set to scrape DOC, PDF, WMV, SWF and many more file-types from specific sites, to whatever degree the user chooses.

And, of course, user-agent spoofing is built-in:

Some web servers are choosy about customers. Pandora can identify itself as something else, if desired.

This charming product is promoted by Apple on their website (though they do not make it).

...