successor to ia_archiver?
Just noticed a bunch of stuff like this for the first time:
18.104.22.168 - - [09/Oct/2007:00:33:19 +0200] "GET /widgets.html HTTP/1.0" 200 15915 "http://example.com/other-widgets.html" "Mozilla/5.0 (compatible; archive.org_bot/1.13.1x +http://crawler.archive.org)"
IP resolves to archive.org, so I presume it's genuine and a more informative successor to the plain "ia_archiver".
The web page at [crawler.archive.org...] actually exists, but it leaves me unsure which UA to put in robots.txt (archive.org_bot? Heritrix?), though it's getting 403s now anyway.
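For anyone who wants to pick these hits out of a log, here is a rough Python sketch that parses a line in the combined log format shown above and checks the User-Agent for the two names mentioned in this thread. The regex, the function names (`parse_line`, `is_archive_bot`) and the token list are my own assumptions for illustration, not anything published by archive.org, and a UA match alone proves nothing since the string is trivial to fake; checking where the IP resolves is still needed.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed tokens: the new name seen in the log line, plus the older one.
ARCHIVE_TOKENS = ("archive.org_bot", "ia_archiver")


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


def is_archive_bot(agent):
    """True if the User-Agent string contains one of the assumed tokens."""
    agent = agent.lower()
    return any(tok in agent for tok in ARCHIVE_TOKENS)


sample = (
    '18.104.22.168 - - [09/Oct/2007:00:33:19 +0200] '
    '"GET /widgets.html HTTP/1.0" 200 15915 '
    '"http://example.com/other-widgets.html" '
    '"Mozilla/5.0 (compatible; archive.org_bot/1.13.1x +http://crawler.archive.org)"'
)
entry = parse_line(sample)
```

With the sample line, `entry["agent"]` holds the full UA string and `is_archive_bot(entry["agent"])` comes back true; lines that do not fit the combined format give `None`.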
Scam as usual. I ban this with SetEnvIf in .htaccess, since those sites almost never respect robots.txt.
[edited by: SEOPTI at 9:27 pm (utc) on Oct. 13, 2007]
© Webmaster World 1996-2014 all rights reserved