I've been receiving these for a long time, but suddenly the number has increased. A lot of the hits end in 404s because the client has the wrong path for the favicon. As far as I've checked, they are all coming from AOL blocks, either user IPs or proxy/cache.
Is this an AOL toolbar of some kind? Or is it an AOL (browser-based?) bookmark manager?
Even if you use a LINK tag to show an alternate location for a site's favicon, many browsers (AOL, Firefox, Netscape, etc.) will still look in the base directory. That has always been the default behavior. Only IE will not look there after it has a copy cached.
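To make the lookup behavior concrete, here is a rough Python sketch of the URLs a client might request for a page: any icon location declared in the markup (in HTML this is normally a `<link>` element whose `rel` contains `icon`), followed by the root-directory fallback. The names `IconLinkParser` and `favicon_candidates` and the example.com URLs are made up for illustration; real browsers and toolbars differ in the details.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collects href values of <link> tags whose rel includes "icon"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several tokens, e.g. "shortcut icon"
        rel = (attr.get("rel") or "").lower().split()
        if "icon" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])


def favicon_candidates(page_url, html):
    """Return declared icon URLs first, then the /favicon.ico fallback."""
    parser = IconLinkParser()
    parser.feed(html)
    candidates = [urljoin(page_url, href) for href in parser.hrefs]
    fallback = urljoin(page_url, "/favicon.ico")
    if fallback not in candidates:
        candidates.append(fallback)
    return candidates
```

If the icon file is not actually in the root, that last candidate is the request that ends up as a 404 in the logs.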
That is not the point.
It is precisely the point!
This is the second or third time I've seen you mention problems associated with not having the ICO placed properly in the root (how else would keyplr be aware of your location problem as well?).
The insignificant loss of bandwidth is more than offset by avoiding the problems created by improper placement of the file, IMO!
As an aside (I'm not nitpicking words, just providing clarification for others):
I tried banning them in robots.txt but they still come calling.
Additions into robots.txt are a request to honorable bots to comply with your wishes.
Additions to htaccess are forced compliance, whether denial, access or redirect.
Don
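The "request" nature of robots.txt can be seen from the client side. In this minimal Python sketch, a well-behaved bot parses the rules with the standard library's `urllib.robotparser` and decides for itself whether to fetch. The user-agent name `ExampleToolbar` and the example.com URL are hypothetical. Nothing here stops a client that simply skips the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking one named client to stay out entirely.
robots_txt = """\
User-agent: ExampleToolbar
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def polite_fetch_allowed(user_agent, url):
    """What an honorable bot would ask itself before making a request."""
    return parser.can_fetch(user_agent, url)
```

A compliant client named `ExampleToolbar` would get `False` for any URL on the site and stay away. The server never sees or enforces that decision.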
The postings you refer to are about the toolbars/whatever that have been falling into my traps. I have been trying to clarify what the UAs are and what they are looking for. I really don't care about the fact they generate a 404 as long as I know what is causing it: is it legit, as in the case of the AOL thing here, or is it some scraper?
The fact that the 404s are being triggered by icon searches is really neither here nor there. I was merely explaining what was happening in an attempt to clarify the situation. Had this one been checking for pages as well as icons it may have elicited a different answer.
I'm sorry about the sloppy terminology re: "banning in robots.txt". I am fully aware of the purpose of robots.txt and do not have the luxury of htaccess: I have to make do with a home-grown kludge because it's IIS. I accept both of these things, though I may occasionally grumble about them.
... do not have the luxury of htaccess: I have to make do with a home-grown kludge because it's IIS.
Why do you have to? There are thousands of reputable hosting companies offering Linux hosting at very competitive rates. It isn't hard to move a web site; it just might take a little time, depending on how complex it is. But then you can sit back and enjoy the benefits of being on a Linux box.
As Don said, robots.txt is a request for legit bots to honor your preferences, the operative word being "preference". Some legit (I use the word loosely) robots will still ignore it.
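Since a robot can ignore robots.txt, actually keeping one out has to happen on the server. As a rough illustration only (this is neither htaccess nor anyone's real setup), here is a Python WSGI wrapper that refuses requests by User-Agent substring. The function name `deny_user_agents` and the blocked names passed to it are assumptions made up for the sketch.

```python
def deny_user_agents(app, blocked_substrings):
    """Wrap a WSGI app so requests from matching User-Agents get a 403."""
    blocked = [s.lower() for s in blocked_substrings]

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(s in agent for s in blocked):
            # Forced compliance: the request never reaches the site itself.
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return app(environ, start_response)

    return middleware
```

User-Agent strings are easy to fake, so a check like this only catches clients that identify themselves honestly; blocking by IP range is the usual complement.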