How do I go about banning the ia_archiver bot from an IIS server (98% ASP pages) by some means other than the standard robots.txt file?
Thanks,
[webmasterworld.com...]
I did notice there are at least two versions of ia_archiver.
Another possibility is that a rogue bot is "spoofing" its name as ia_archiver and not revealing its true name.
In ASP you could put something like this into your pages and have it run before anything else is written:
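' Substring match, so user-agent strings that merely contain "ia_archiver" are caught too
If InStr(1, Request.ServerVariables("HTTP_USER_AGENT"), "ia_archiver", vbTextCompare) > 0 Then
    Response.Status = "403 Forbidden"
    Response.Write "403 - Access Denied"
    Response.End
End If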
This will block their crawler via its user-agent information and waste the least possible amount of your server's bandwidth.
- Tony
[webmasterworld.com...]
User-agent: ia_archiver
Disallow: /private/
Disallow: /logs/
I'm leaning toward the "spoofing" theory posted by sun818... Has anyone else had a problem with the Internet Archive on a site with a validated robots.txt - and verified that the IP address matches the IA range?
Jim
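For anyone wanting to test the spoofing theory against their own logs, here is a minimal Python sketch of the IP check, meant to be run offline on log entries rather than inside the ASP pages. The address range below is a reserved documentation block used purely as a placeholder, not the Internet Archive's real range, and the names `ASSUMED_CRAWLER_RANGES`, `ip_in_ranges` and `is_suspected_spoof` are made up for illustration.

```python
import ipaddress

# Hypothetical placeholder range (a reserved documentation block).
# Substitute the range you have actually confirmed for the crawler.
ASSUMED_CRAWLER_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def ip_in_ranges(ip_text, ranges=ASSUMED_CRAWLER_RANGES):
    """Return True if ip_text parses as an IP address inside any given network."""
    try:
        address = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # malformed or missing IP field in the log
    return any(address in network for network in ranges)


def is_suspected_spoof(user_agent, ip_text, ranges=ASSUMED_CRAWLER_RANGES):
    """Flag a hit that claims to be ia_archiver but comes from outside the ranges."""
    claims_ia = "ia_archiver" in user_agent.lower()
    return claims_ia and not ip_in_ranges(ip_text, ranges)
```

Feeding each log line's user-agent and client IP to `is_suspected_spoof` gives a rough list of hits that use the ia_archiver name from an unexpected address. It is only as good as the range you put in.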
If your robots.txt is valid and the service prevents you from viewing previous versions, it is a rogue bot.
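One rough way to sanity-check what a robots.txt like the one quoted earlier actually asks of ia_archiver is to run it through Python's standard `urllib.robotparser`. This is only a sketch: the helper name `is_allowed` and the sample paths are invented for illustration, and the parser shows how Python reads the rules, which is not necessarily how any particular crawler reads them.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the thread.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/
Disallow: /logs/
"""


def is_allowed(robots_text, user_agent, path):
    """Return True if the rules in robots_text permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, path)


# Sample paths are hypothetical.
for path in ("/private/report.asp", "/logs/ex0101.log", "/default.asp"):
    print(path, is_allowed(ROBOTS_TXT, "ia_archiver", path))
```

Note that these rules only cover /private/ and /logs/, so a compliant ia_archiver is still free to fetch everything else on the site. Hits on those other pages do not by themselves point to a rogue bot.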