
Forum Moderators: phranque

Question about favicon.ico

Is requesting it (or not) a reliable indicator of bot vs. human?

1:55 am on Nov 14, 2018 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 8, 2016
posts:60
votes: 0


I've done some searching on this but can't find anything definitive. I see hits to my site requesting individual PDF files where favicon.ico is usually also requested, but sometimes not. When a page hit happens (and all the correct accessory files are requested), favicon.ico is, again, usually requested.

My gut feeling is that the default behavior for most (all?) browsers is to grab favicon.ico (if it exists) the first time you visit a site, and maybe (?) not ask for it again after that. I don't know if there is an *easy* way to turn off requests for it in any given browser, or whether the typical user would do that. So all this makes me wonder whether I can draw any conclusions when I *don't* see it being requested (but the hit otherwise looks legit).
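
One way to look at the question above is to group hits by requesting IP and list the clients that fetched something but never asked for favicon.ico. The following is only a rough sketch: the `(ip, path)` tuple layout, the function name and the sample addresses are assumptions made for illustration, not the actual log format. Because browsers may cache the icon from an earlier visit, a missing request is a hint at best.

```python
from collections import defaultdict

def clients_without_favicon(hits):
    """Return the sorted IPs that requested something but never /favicon.ico.

    `hits` is an iterable of (ip, path) tuples; this layout is an assumption,
    real log lines would need to be parsed into it first.
    """
    paths_by_ip = defaultdict(set)
    for ip, path in hits:
        paths_by_ip[ip].add(path.lower())
    return sorted(
        ip for ip, paths in paths_by_ip.items()
        if "/favicon.ico" not in paths
    )

# Made-up sample data using documentation-only IP addresses.
sample_hits = [
    ("192.0.2.10", "/default.html"),
    ("192.0.2.10", "/favicon.ico"),
    ("192.0.2.20", "/docs/manual.pdf"),
]
print(clients_without_favicon(sample_hits))
```

Grouping by IP alone is crude (shared proxies, changing addresses), so the result is better treated as a list of hits worth a second look than as a list of bots.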
4:06 am on Nov 14, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15374
votes: 725


Most humans request the favicon.

Most robots don't.

That's as definitive as we can get.
4:59 am on Nov 14, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11522
votes: 177


have you examined the HTTP Response headers for the favicon.ico requests?
are you supplying the proper headers to support user agent caching of this resource?
5:53 am on Nov 14, 2018 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 8, 2016
posts:60
votes: 0


> have you examined the HTTP Response headers for the favicon.ico requests?

Until I transition my site fully to https I am not able to log browser language or character set (because of IIS4). So that leaves me with user-agent and referrer (and requesting IP and method, i.e. GET or HEAD) as a way to judge whether the requester is human or bot. Requests for favicon.ico give the same logging info in that regard as the other files requested during the hit.

> are you supplying the proper headers to support user agent caching of this resource?

I don't think I have any code in any html files that pertains to caching. I don't make use of htaccess files (I never see browsers request them - should they?). I see both code 200 and 304 - which tells me that some browsers are just checking for the existence of files and not actually downloading them, probably because they've visited the site in the past.
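
For reference, a 304 is the server's answer to a conditional request: the client sends the date of its cached copy in an If-Modified-Since header, and the server replies 304 Not Modified when the file has not changed since then. The sketch below shows that decision in general terms only. The function name and sample dates are assumptions for illustration, and it is not how IIS or any particular server implements it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def status_for(last_modified, if_modified_since=None):
    """Pick 200 or 304 for a GET, given the file's modification time (an
    aware UTC datetime) and the optional If-Modified-Since header value."""
    if if_modified_since is None:
        return 200  # unconditional request: send the full file
    try:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds.
        unchanged = last_modified.replace(microsecond=0) <= since
    except (TypeError, ValueError):
        return 200  # unparseable or zone-less date: ignore the header
    return 304 if unchanged else 200

# Made-up modification time for illustration.
icon_modified = datetime(2018, 11, 1, 12, 0, 0, tzinfo=timezone.utc)
cached_header = format_datetime(icon_modified, usegmt=True)

print(status_for(icon_modified))                 # first visit, no header
print(status_for(icon_modified, cached_header))  # repeat visit with a cached copy
```

So a 304 in the log suggests a client that already holds a copy and is revalidating it, which fits the "they've visited the site in the past" reading.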
8:14 am on Nov 14, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11522
votes: 177


do you ever see a 304 response to a request for favicon.ico?
10:12 am on Nov 14, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2004
posts:1968
votes: 68


> ...(because of IIS4)

Are you sure it is IIS4? If so, are you saying you are on an NT box?

Oh, never mind the above question, I just read [webmasterworld.com...]
1:26 am on Nov 15, 2018 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 8, 2016
posts:60
votes: 0


After removing anything with "bot" in the user-agent (and removing all Google and Bing IPs) I get about 11,000 hits to favicon.ico (this is from Jan 1, 2015 to the present). Of those, practically all have HTTP response code 200.

11 have code 206 (from 5 different IPs) and 80 have code 304.
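
A tally like the one above could be produced along these lines. This is a minimal sketch, assuming each log entry has already been parsed into a dict with `ip`, `path`, `user_agent` and `status` keys; those key names, the function name and the sample records are all hypothetical.

```python
from collections import Counter

def favicon_status_counts(records, excluded_ips=frozenset()):
    """Count HTTP status codes for favicon.ico hits, skipping any record
    whose user-agent contains "bot" or whose IP is in `excluded_ips`."""
    counts = Counter()
    for rec in records:
        if rec["path"].lower() != "/favicon.ico":
            continue
        if "bot" in rec["user_agent"].lower():
            continue
        if rec["ip"] in excluded_ips:
            continue
        counts[rec["status"]] += 1
    return counts

# Made-up records using documentation-only IP addresses.
sample_records = [
    {"ip": "192.0.2.10", "path": "/favicon.ico", "user_agent": "Mozilla/5.0", "status": 200},
    {"ip": "192.0.2.11", "path": "/favicon.ico", "user_agent": "Mozilla/5.0", "status": 304},
    {"ip": "192.0.2.12", "path": "/favicon.ico", "user_agent": "ExampleBot/1.0", "status": 200},
    {"ip": "192.0.2.13", "path": "/default.html", "user_agent": "Mozilla/5.0", "status": 200},
    {"ip": "198.51.100.5", "path": "/favicon.ico", "user_agent": "Mozilla/5.0", "status": 200},
]
print(favicon_status_counts(sample_records, excluded_ips={"198.51.100.5"}))
```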

While on the subject - I searched the logs for "Google Favicon". It turns up in a few different user-agents, such as this one:

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon

That agent first appears on March 11, 2016 and continues right up to a few days ago. It grabs mostly favicon.ico and default.html and sometimes 1 or 2 other files.

Prior to March 11, 2016 I see this:

Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google Favicon

which runs from Dec. 8, 2015 to March 8, 2016.

Prior to that, I see (just) this:

Google favicon

All told, about 1300 hits (from Jan 1, 2015 to the present) from the Google Favicon. Do those hits mean anything? Is a Google Favicon hit indicative of something?
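
Since none of these user-agents contains "bot", a filter on that substring alone would not catch them; they need their own check. A small sketch, assuming a case-insensitive match on "google favicon" is good enough (the function names are made up here, and user-agent strings are easy to spoof, so the requesting IP still matters):

```python
from collections import Counter

def is_google_favicon(user_agent):
    """True if the user-agent carries the "Google Favicon" token in any case."""
    return "google favicon" in user_agent.lower()

def favicon_fetcher_variants(user_agents):
    """Count how often each distinct Google Favicon user-agent string appears."""
    return Counter(ua for ua in user_agents if is_google_favicon(ua))

sample_agents = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/49.0.2623.75 Safari/537.36 Google Favicon",
    "Google favicon",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # made-up ordinary browser UA
]
for agent, hits in favicon_fetcher_variants(sample_agents).items():
    print(hits, agent)
```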
6:05 am on Nov 15, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15374
votes: 725


> Is a Google Favicon hit indicative of something?

I don't think anyone has worked out a comprehensive list of everything Google's faviconbot is for. An obvious one is GSC: stop by and it will collect the favicons for all your sites. If you list websites in your G+ profile, those also trigger favicon requests whenever someone views your profile.

The one that I've yet to figure out is when the faviconbot visits my test site, which isn't indexed and of course doesn't have a GSC account because why would it. It isn't common, compared to “real” sites, but why is it happening at all? (Quick detour to raw logs reveals that they just started doing it in late September, and every single request comes in at HTTP, redirected to HTTPS, meaning that the front page is requested twice, the favicon only once. On average, every 2-3 days. Hmmm.)

Up until a few years ago, Google's faviconbot sent no UA at all; you could only identify it by IP and behavior. And then, yup, FF/6. The current incarnation is at least something plausible.