This subject may have been touched upon before - but I haven't found anything that can shed light on what is going on.
Browsing through my log-files I saw multiple hits - maybe 1000+ over the last week - from the same ip-address doing nothing other than requesting my favicon-file. No other pages on my website are being requested - just the favicon-file. Could it be some inventive, clever way of, say, copying content without this being logged in the server-logfile?
Why is someone only requesting this same file multiple times?
203.161.***.*** - - [26/Mar/2006:12:54:08 +0200] "GET /favicon.ico HTTP/1.1" 404 217 "-" "Mozilla/4.0 (compatible; GoogleToolbar 4.0.513.2948-big; Windows XP 5.1; MSIE 6.0.2900.2180)"
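For anyone wanting to check their own logs for the same pattern, here is a minimal sketch in Python. It assumes the log is in the Apache common/combined format shown in the line above; the function name and regex are my own, not from any particular tool. It lists hosts whose every logged request was for the favicon, with a hit count for each.

```python
import re
from collections import defaultdict

# Common/Combined Log Format: host ident user [time] "request" status bytes ...
# \S+ is used for the host so that masked addresses (203.161.***.***) still match.
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def favicon_only_hosts(lines, favicon_path="/favicon.ico"):
    """Return {host: hit_count} for hosts that requested nothing but the favicon."""
    paths_by_host = defaultdict(set)
    hits_by_host = defaultdict(int)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        host = m.group("host")
        paths_by_host[host].add(m.group("path"))
        hits_by_host[host] += 1
    return {
        host: hits_by_host[host]
        for host, paths in paths_by_host.items()
        if paths == {favicon_path}
    }

# The one log line quoted in the post, used here as sample input.
sample = [
    '203.161.***.*** - - [26/Mar/2006:12:54:08 +0200] '
    '"GET /favicon.ico HTTP/1.1" 404 217 "-" '
    '"Mozilla/4.0 (compatible; GoogleToolbar 4.0.513.2948-big; '
    'Windows XP 5.1; MSIE 6.0.2900.2180)"'
]
result = favicon_only_hosts(sample)
print(result)
```

On a real log you would pass the open file object instead of the sample list, since the function only needs something it can iterate over line by line.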
2.) I sure wish I had a clue about why you're seeing a gazillion favicon.ico file requests. In recent months I dealt with something similar [webmasterworld.com] -- scores of Microsoft servers (not msnbot) hitting one of my favicon files scores and scores of times every single day. The UA wasn't Google's Toolbar, but rather variations of torturous, PC-specific agents like:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322; MSN 9.0;MSN 9.1; MSNbVZ02; MSNmen-us; MSNcOTH; MPLUS)
However, I have seen users with GoogleToolbar aboard request scores of favicon files, too -- and just those. E.g., from an Arizona-based host:
Mozilla/4.0 (compatible; GoogleToolbar 4.0.513.2948-big; Windows XP 5.1; MSIE 6.0.2900.2180)
Look familiar? Ayep. Same as yours.
3.) Rumor has it each favicon request corresponds to a bookmark or other special reminder. But in my experience, and despite my happily believing that my pages are more addicting than Swiss chocolate via IV, there's simply no way any individual has that many bookmarks to a single site. Also, I have special favicon files for each directory, and the hits were always and only to the top-level file.
4.) In terms of solutions, I ended up painstakingly working my way through the various 'parts' of the UAs to try and eliminate the one responsible for favicon collecting (per the post linked above). I never found it. So I ended up rewriting the Microsoft addresses, a range of IPs and proxies, and sending them to a special 'if you're seeing this, e-mail me' page.
In the month or so since then, only one person touched base with me, leading me to believe that either MS-based caching servers were routinely gathering favicons or -- something. Actually, I still haven't a clue, doggoneit. Good luck!
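The "send a range of IPs to a special page" idea from point 4 can be sketched roughly as below. This is only an illustration in Python, not the rewrite rules actually used: the networks are RFC 5737 documentation ranges standing in for the real addresses (which aren't listed here), the notice page path is made up, and limiting the redirect to favicon requests is my assumption.

```python
import ipaddress

# Hypothetical ranges: RFC 5737 documentation networks stand in for the
# real Microsoft/proxy addresses, which the post does not list.
WATCHED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical path for the "if you're seeing this, e-mail me" page.
NOTICE_PAGE = "/favicon-notice.html"

def target_for(client_ip, path):
    """Return the path to serve: the notice page for favicon requests
    coming from a watched network, otherwise the requested path."""
    if path != "/favicon.ico":
        return path  # assumption: only favicon requests are redirected
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return path  # unparseable address: leave the request alone
    if any(addr in net for net in WATCHED_NETWORKS):
        return NOTICE_PAGE
    return path

print(target_for("192.0.2.17", "/favicon.ico"))
print(target_for("203.0.113.5", "/favicon.ico"))
```

Matching on networks rather than single addresses keeps the list short when the requests come from a whole block of caching servers or proxies.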
I notice, tanx, that your visitor got a 404 for the favicon.
I had a similar sounding issue with a .png file that was mentioned in a template comment, but not actually linked from the page. An apparently legit request (usually from a google referrer) would be fulfilled, and then the same IP would request this non-existent file with varying degrees of tenacity for weeks afterward, eventually generating thousands of requests daily.
At first I had it down as some kind of scraper and 403'd it, but it kept coming, and it was using more bandwidth getting a 403 than a 404. Eventually, in exasperation, I fed it a rewrite to a 1px gif and that was the end of it after 12 or so months! All I get now is the occasional apparently legit request.
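If you want to compare what a 403 and a 404 cost you for one file, something like this rough Python sketch will total the logged bytes per status code. It assumes Apache-style common log lines; the sample lines, sizes, and the path /images/missing.png are all made up for illustration and are not from my logs. Note that the logged size typically counts the response body only, so headers are not included.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "method path proto" status bytes
LOG_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"\S+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def bytes_by_status(lines, path):
    """Total the logged response bytes per status code for one path."""
    totals = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m or m.group("path") != path:
            continue
        size = m.group("size")
        # A "-" in the size field means no body was sent; count it as 0.
        totals[m.group("status")] += int(size) if size.isdigit() else 0
    return dict(totals)

# Made-up sample lines (hypothetical host, path and sizes).
sample = [
    '192.0.2.10 - - [01/Jan/2006:00:00:01 +0000] "GET /images/missing.png HTTP/1.1" 404 217',
    '192.0.2.10 - - [01/Jan/2006:00:00:02 +0000] "GET /images/missing.png HTTP/1.1" 403 300',
    '192.0.2.10 - - [01/Jan/2006:00:00:03 +0000] "GET /images/missing.png HTTP/1.1" 403 300',
    '192.0.2.10 - - [01/Jan/2006:00:00:04 +0000] "GET /index.html HTTP/1.1" 200 5120',
]
totals = bytes_by_status(sample, "/images/missing.png")
print(totals)
```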
The IP was from India, assigned to the national ISP, and my tenuous conclusion was that it was just a very, very badly set up proxy that left happy when it got SOMETHING. I was with this ISP for 3 years and "badly set up" would have been no surprise - at one time we were all flooded with spam after the broadcast address "allusers@theisp.net" was left open for spammers to send to the entire network!
The partial IP tanx lists appears to belong to APNIC, so it may be another badly set up Asian proxy, possibly in Vietnam? If you add a favicon, maybe it will stop.