I've seen 406s a few times from one specific bot (which used to hunt down multimedia files), but it was very rare...
Reading through a few sites (for the most part, Google is your friend), the general consensus is that a 406 is generated by the server when the client requests a resource whose type is not covered by the list of types the client claims to accept/understand (via the "Accept:" field in the request header).
Search for "406 Not Acceptable" here (interesting read):
[w3.org...]
For example, if I were to create a bot that only understood HTML and GIF, I could set its Accept header to "Accept: text/html, image/gif". In theory, if it were to accidentally request a resource with a MIME type other than those two (a PNG, say), the server could return a 406 error code, because it has been told my bot isn't capable of interpreting that file type.
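To make the negotiation concrete, here's a minimal sketch (my own illustration, not actual server code) of how a server might check a resource's MIME type against a client's Accept list and decide between 200 and 406:

```python
# Hypothetical sketch: server-side decision between 200 and 406
# based on the client's Accept header. Ignores q-values for simplicity.

def acceptable(resource_type, accept_header):
    """Return True if resource_type matches any entry in the Accept header.

    Handles exact matches, type wildcards like image/*, and */*.
    """
    major = resource_type.split("/")[0]
    for entry in accept_header.split(","):
        entry = entry.strip().split(";")[0]  # drop any ;q=... parameter
        if entry in ("*/*", resource_type, major + "/*"):
            return True
    return False

def status_for(resource_type, accept_header):
    return 200 if acceptable(resource_type, accept_header) else 406

# The bot from the example above: only HTML and GIF
bot_accept = "text/html, image/gif"
print(status_for("text/html", bot_accept))   # 200
print(status_for("image/png", bot_accept))   # 406
```

Most browsers avoid this by sending "*/*" at the end of their Accept list, which matches everything.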
- Tony
When I first saw the thing this morning I checked WW forum3 to see if anyone else had spotted it. Others had.
[webmasterworld.com...]
Msg #21 and 23.
GG posted after those messages but didn't acknowledge them, which seemed a bit odd, since he usually reads through threads carefully.
I think it was a legit googlebot. I only had 2 hits from it out of 40 or so normal bot visits during the same 24 hrs so it was no big deal.
This is what it looked like with the specifics muddied up.
2004-03-05 03:37:30 64.68.89.144 GET /robots.txt 200 1671 139 www.site.org Googlebot/Test -
2004-03-05 03:37:30 64.68.89.144 GET /b**rows_icc_030510.htm 406 4085 134 www.site.org Googlebot/Test -
2004-03-05 03:57:58 64.68.88.18 GET /or**nteering_021125_20.htm 406 4085 138 www.site.org Googlebot/Test -
[edited by: Stefan at 5:21 am (utc) on Mar. 6, 2004]
'whois -h rwhois.exodus.net:4321 64.68.89.144'
%rwhois V-1.5:001ab7:00 rwhois.exodus.net (Exodus Communications)
network:Class-Name:network
network:Auth-Area:0.0.0.0/0
network:Network-Name:64.68.88.0
network:IP-Network:64.68.88.0/21
network:Organization;I:Google Inc.-BGPconfig-SC3DC3
network:Name;I:Google Inc.
network:Email;I:dns-admin@GOOGLE.COM
network:Street;I:2400 E. Bayshore Pkwy
network:City;I:Mountain View , CA 94043
GeorgeGG
I'm running on Apache 1.3.28 with MultiViews and DirectoryIndex turned on. Requests for the directory work (200 returned) but Googlebot's requests for a php file, without the extension result in a 406.
contact is a directory with index.php:
crawler11.googlebot.com - - [10/Mar/2004:00:13:46 -0800] "GET /contact/ HTTP/1.0" 200 1344 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
new is actually new.php:
crawler14.googlebot.com - - [10/Mar/2004:00:13:25 -0800] "GET /new HTTP/1.0" 406 421 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
Anyone got any workarounds for this? I don't want to drop from the index!
Giles
I've replicated the error with a home-brew browser which only accepts text/html. 406 errors are generated because the server doesn't consider that it can honour the request.
Because the request excludes the file extension, it triggers the mod_negotiation code of MultiViews. But the target file is a dynamic one (php in my case) covered by a directive:
AddType application/x-httpd-php .php
So mod_negotiation views the file as being of MIME type application/x-httpd-php, which does not match the Accept list in the GET request, hence the 406 Not Acceptable.
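The failure mode can be sketched like this (my own illustration of the logic, not Apache's actual code): MultiViews maps each candidate file's extension to a MIME type via the AddType table, then rejects the request if no candidate's type appears in the Accept list.

```python
# Hypothetical sketch of the MultiViews failure mode described above.
# Wildcards are omitted for brevity.

# Assumed AddType-style extension-to-MIME mapping
ADD_TYPE = {
    ".html": "text/html",
    ".gif": "image/gif",
    ".php": "application/x-httpd-php",  # AddType application/x-httpd-php .php
}

def negotiate(extensionless_uri, files_in_dir, accept_header):
    """Pick a variant for an extensionless URI, MultiViews-style.

    Returns (status, chosen_file); (406, None) if no variant's MIME
    type appears in the client's Accept header.
    """
    accepts = [e.strip().split(";")[0] for e in accept_header.split(",")]
    for f in files_in_dir:
        stem, dot, ext = f.partition(".")
        if stem == extensionless_uri and ADD_TYPE.get(dot + ext) in accepts:
            return 200, f
    return 406, None

# A restrictive request for /new, which exists on disk only as new.php:
print(negotiate("new", ["new.php"], "text/html, text/plain")) # (406, None)
```

The same request for a URI backed by new.html would succeed, since text/html is in the Accept list.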
I've had a mail from someone who got round the problem by dropping MultiViews and instead using mod_rewrite to add the desired .php ending. That doesn't suit my needs so I kept looking.
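For anyone who does want the mod_rewrite route, a sketch along those lines might look like the following (my own guess at their setup, untested):

```apache
# Hypothetical sketch: drop MultiViews and append .php
# when the extensionless target exists on disk.
Options -MultiViews
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.*)$ $1.php [L]
```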
You CAN keep MultiViews and still serve dynamic files to clients with restrictive Accept lists. I used a type map to specify the file's content type:
In .htaccess:
# Only needed for Apache 2
MultiViewsMatch Handlers
AddHandler type-map .var
For each php file, fubar.php, create a shadowing fubar.var containing:
URI: fubar

URI: fubar.php
Content-type: text/html
So now, on a request for fubar, MultiViews will scan the directory and give preference to fubar.var. The type-map code of mod_negotiation will then serve up fubar.php as text/html.
This works fine with my test harness - now I've just got to watch for Googlebot's response. Interesting to note that both Apache & Googlebot are behaving correctly - we're just used to most browsers being generous in what they claim they'll accept.