Slurp tried to fetch the following page from my site this morning. The page has never existed, so Slurp got a 404. It also attempted to fetch a few more pages that never existed. Why would this happen?
66.196.73.82 - - [20/Sep/2003:05:09:30 +0200] "GET /fast_pay_day_loan/football.htm HTTP/1.0" 404 1985 www.my-site.co.uk "-" "Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html)" "-"
A few possible reasons:
1) The domain belonged to somebody else, who previously *did* have a file / path of that name.
2) Shared hosting - you are sharing an IP address with another bloke who has a file / folder of that name. The host goofs up, and the spider requests your site with somebody else's path.
3) Old domain hosted on your non-shared IP address - one host I used long ago forgot to turn off the domain name previously allocated to the IP address that was now used for *my* domain. So, even though it wasn't a shared hosting environment, the spiders went ahead requesting /otherstuff.html, which I never had.
For some reason, Slurp, Ask Jeeves, etc. seem to have a much longer memory for outdated files that they shouldn't request any more (at least, in my experience).
Best thing you can do is return a 410 'Gone' status code for the file and then serve up your error page as the response body.
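For what it's worth, here is a rough sketch of the 410 idea in Python, using only the standard library. It is not how any particular host or server does it. The names `GONE_PATHS`, `status_for`, `GoneHandler` and `run` are made up for illustration, the error page body is a placeholder, and a real site would more likely do this in its web server configuration. The handler only shows the status logic; it does not serve any real pages.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list of stale paths that spiders keep requesting.
# The one entry here is taken from the log line in the question.
GONE_PATHS = {"/fast_pay_day_loan/football.htm"}

# Placeholder error page; swap in your own.
ERROR_PAGE = b"<html><body><h1>Page not available</h1></body></html>"


def status_for(path, gone_paths=GONE_PATHS):
    """Return 410 for known-stale paths and 404 for everything else."""
    # Ignore any query string when matching the path.
    clean = path.split("?", 1)[0]
    return 410 if clean in gone_paths else 404


class GoneHandler(BaseHTTPRequestHandler):
    """Sketch only: answers every GET with 410 or 404 plus the error page."""

    def do_GET(self):
        self.send_response(status_for(self.path))
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(ERROR_PAGE)))
        self.end_headers()
        self.wfile.write(ERROR_PAGE)


def run(port=8000):
    """Start the sketch server (blocks until interrupted)."""
    HTTPServer(("", port), GoneHandler).serve_forever()
```

Calling `run()` and then requesting `/fast_pay_day_loan/football.htm` should come back as 410 with the error page as the body, while any other path comes back as 404.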