www.archive.org

         

OZmike

11:34 am on Jun 12, 2005 (gmt 0)

10+ Year Member



Got hit by this IP:
207.241.238.254
It indexed my whole site, even the JavaScript files and all the images,
via www.archive.org. Does anyone know these people, or what the hell they are doing taking the whole site, not just the header info?

volatilegx

5:59 pm on Jun 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Archive.org is related to Alexa. It keeps old web pages in cached form: an archive of the web.

Dave_A

8:57 pm on Jun 13, 2005 (gmt 0)

10+ Year Member



The spider you're talking about comes from the Alexa.com Wayback Machine.
If I recall correctly, Alexa's site has a way to turn off visits from the crawler.

wilderness

10:12 pm on Jun 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Alexa honors robots.txt
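
For anyone wanting to check what that means in practice, here is a rough sketch using Python's standard `urllib.robotparser`. It assumes the crawler identifies itself with the user-agent token `ia_archiver` (the token commonly associated with the Alexa crawler; it is not stated in this thread, so verify it against your own logs). The robots.txt content, the `is_allowed` helper, and the example.com URL are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shuts out ia_archiver (assumed crawler token)
# and leaves the site open to every other robot.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/index.html"  # placeholder URL
    print("ia_archiver allowed:", is_allowed(ROBOTS_TXT, "ia_archiver", page))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", page))
```

This only shows how a well-behaved robot would read the rules; whether a given crawler actually obeys them is up to the crawler.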

kgun

8:35 pm on Jun 24, 2005 (gmt 0)



Honors robots.txt?

jatar_k

9:00 pm on Jun 24, 2005 (gmt 0)