Forum Moderators: Ocean10000 & incrediBILL

memorybot

from: archivethe.net

   
6:04 pm on Jun 20, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



37.16.72.207
Mozilla/5.0 (compatible; memorybot/1.20.71 +http://archivethe.net/en/index.php/about/internet_memory1 on behalf of DNB)

robots.txt? Yes

Internet Memory Research
Parent range: 37.16.72.0 - 37.16.79.255
CIDR: 37.16.72.0/21
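
For anyone who wants to check their own logs against that block, here is a minimal sketch using Python's standard ipaddress module. The function name in_memorybot_range is just an illustrative helper, not anything the bot operator publishes, and the only data it relies on is the /21 quoted above.

```python
import ipaddress

# CIDR block quoted above for Internet Memory Research.
MEMORYBOT_NET = ipaddress.ip_network("37.16.72.0/21")


def in_memorybot_range(ip: str) -> bool:
    """Return True if the address falls inside the quoted /21 block."""
    return ipaddress.ip_address(ip) in MEMORYBOT_NET


if __name__ == "__main__":
    # First and last address of the block, for comparison with the parent range.
    print(MEMORYBOT_NET[0], "-", MEMORYBOT_NET[-1])
    # The IP reported in this post.
    print(in_memorybot_range("37.16.72.207"))
```

The first and last addresses of the network object should line up with the parent range listed above, which is a quick sanity check that the range and the CIDR describe the same block.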

Note: The bot's versioning is wacky. In the past week, Project Honey Pot participant sites report FIVE different version numbers for the IP that hit me [projecthoneypot.org]:

/1.20.30
/1.20.33
/1.20.37
/1.20.41
/1.20.70
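
Since the version number moves around this much, matching on an exact UA string will miss most of these. A rough sketch of pulling the version out of whatever user agent shows up, assuming the "memorybot/x.y.z" token keeps its current shape (the helper name memorybot_version is made up for illustration):

```python
import re

# Matches the "memorybot/<dotted version>" token anywhere in a user-agent string.
# Assumption: the token format stays as seen in the UA quoted in this thread.
VERSION_RE = re.compile(r"\bmemorybot/(\d+(?:\.\d+)*)", re.IGNORECASE)


def memorybot_version(user_agent: str):
    """Return the dotted version string, or None if the UA is not memorybot."""
    match = VERSION_RE.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (compatible; memorybot/1.20.71 "
          "+http://archivethe.net/en/index.php/about/internet_memory1 "
          "on behalf of DNB)")
    print(memorybot_version(ua))
```

Keying on the "memorybot/" prefix rather than the full string means a rule keeps working when the version ticks over again.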

Notes:

- Neighboring IPs (.208; .209) show the same versions, and more. E.g.: [projecthoneypot.org]

- The umbrella, internetmemory.org, appears to be more European than not.

- Yet Another All-Web Archive. But apparently not connected to Amazon's archive.org -- yet.

1:39 pm on Jun 21, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month



Can you please describe this "Amazon's archive.org" that you mentioned? It's probably discussed in another thread that I missed, so I don't know about it.

6:59 pm on Jun 21, 2014 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



TIA doesn't actually belong to Amazon, does it? They just crawl from AWS ranges.

1:08 am on Jun 22, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



I don't know about archive.org's crawling bases, AWS or otherwise. That said, a major "Institutional Supporter" of Archive.org is Alexa -- an Amazon company. [archive.org...]

And --

"Alexa's operation includes archiving of webpages as they are crawled. This database served as the basis for the creation of the Internet Archive accessible through the Wayback Machine.[7] In 1998, the company donated a copy of the archive, two terabytes in size, to the Library of Congress.[5] Alexa continues to supply the Internet Archive with Web crawls." [en.wikipedia.org...]

So I reckon Archive.org's data, tracking and stats get accessed by Amazon in some way, shape, or form; hence my POV: "Amazon's archive.org".

Sidling back to memorybot -- archivethe.net appears to have different connections.