


archive.org bot/1.13.1x

successor to ia_archiver?

     

zCat

10:41 pm on Oct 8, 2007 (gmt 0)

10+ Year Member



Just noticed a bunch of stuff like this for the first time:

208.70.24.237 - - [09/Oct/2007:00:33:19 +0200] "GET /widgets.html HTTP/1.0" 200 15915 "http://example.com/other-widgets.html" "Mozilla/5.0 (compatible; archive.org_bot/1.13.1x +http://crawler.archive.org)"

IP resolves to archive.org, so I presume it's genuine and a more informative successor to the plain "ia_archiver".

The web page at [crawler.archive.org...] actually exists, but it leaves me unsure which UA to put in robots.txt (archive.org_bot? Heritrix?), though it's getting 403s now anyway.

SEOPTI

9:27 pm on Oct 13, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Scam as usual. I ban this with SetEnvIf in .htaccess, since those sites almost never respect robots.txt.

[edited by: SEOPTI at 9:27 pm (utc) on Oct. 13, 2007]
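
For anyone wanting to try the SetEnvIf approach mentioned above, here is a minimal sketch. It assumes Apache 2.4 with mod_setenvif enabled and an AllowOverride setting that permits FileInfo and AuthConfig directives in .htaccess. The variable name block_archive_bot is made up for illustration, and the pattern is taken only from the user-agent string in the log line quoted in the first post; it is not necessarily the rule SEOPTI uses.

```apache
# Flag any request whose User-Agent contains "archive.org_bot" (case-insensitive).
# "block_archive_bot" is a hypothetical variable name chosen for this example.
SetEnvIfNoCase User-Agent "archive\.org_bot" block_archive_bot

# Apache 2.4: allow everyone except flagged requests, which receive a 403.
<RequireAll>
    Require all granted
    Require not env block_archive_bot
</RequireAll>
```

On Apache 2.2, which was current when this thread was written, the access-control part would instead be written with Order Allow,Deny, Allow from all, and Deny from env=block_archive_bot. Matching on the user-agent only works as long as the crawler keeps sending that string, so it complements a robots.txt entry rather than replacing one.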

 
