
Forum Moderators: Ocean10000 & incrediBILL


memorybot

from: archivethe.net

     
6:04 pm on Jun 20, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


37.16.72.207
Mozilla/5.0 (compatible; memorybot/1.20.71 +http://archivethe.net/en/index.php/about/internet_memory1 on behalf of DNB)

robots.txt? Yes

Internet Memory Research
Parent range: 37.16.72.0 - 37.16.79.255
CIDR: 37.16.72.0/21
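
For anyone who wants to check log entries against that range, here is a minimal Python sketch using the standard-library `ipaddress` module. The function name `in_memorybot_range` is my own and purely illustrative; the network and the sample IP are the ones quoted above.

```python
import ipaddress

# CIDR block quoted above for Internet Memory Research.
MEMORYBOT_NET = ipaddress.ip_network("37.16.72.0/21")


def in_memorybot_range(ip: str) -> bool:
    """Return True if the given IPv4 address falls inside the /21 block."""
    return ipaddress.ip_address(ip) in MEMORYBOT_NET


if __name__ == "__main__":
    # First and last addresses of the block, to compare with the parent range.
    print(MEMORYBOT_NET[0], "-", MEMORYBOT_NET[-1])
    # The IP that hit me.
    print(in_memorybot_range("37.16.72.207"))
```

The same check works for the neighboring addresses, so one test covers the whole block rather than a list of individual IPs.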

Note: The bot's versioning is erratic. In the past week, Project Honey Pot participant sites reported FIVE different version numbers for the IP that hit me [projecthoneypot.org]:

/1.20.30
/1.20.33
/1.20.37
/1.20.41
/1.20.70
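
Because the version number changes so often, matching on the full UA string is fragile. A minimal sketch, assuming you only want to detect the bot token and pull out whatever version follows it; the regex and the function name `memorybot_version` are illustrative, not anything the bot's operator documents.

```python
import re
from typing import Optional

# Match the "memorybot/<dotted version>" token anywhere in the UA string.
UA_VERSION = re.compile(r"\bmemorybot/(\d+(?:\.\d+)*)")


def memorybot_version(user_agent: str) -> Optional[str]:
    """Return the dotted version after 'memorybot/', or None if absent."""
    match = UA_VERSION.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (compatible; memorybot/1.20.71 "
          "+http://archivethe.net/en/index.php/about/internet_memory1 "
          "on behalf of DNB)")
    print(memorybot_version(ua))
```

Keying on the token rather than a specific version means a rule written today still matches when the next version number shows up.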

Notes:

- Neighboring IPs (.208, .209) show the same versions, and more. E.g.: [projecthoneypot.org]

- The umbrella organization, internetmemory.org, appears to be mostly European.

- Yet Another All-Web Archive. But apparently not connected to Amazon's archive.org -- yet.
1:39 pm on June 21, 2014 (gmt 0)

Senior Member

aristotle (WebmasterWorld Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month)

joined:Aug 4, 2008
posts:2678
votes: 94


Can you please describe this "Amazon's archive.org" that you mentioned? It was probably discussed in another thread that I missed, so I don't know about it.
6:59 pm on June 21, 2014 (gmt 0)

Senior Member from US 

lucy24 (WebmasterWorld Top Contributor of All Time, Top Contributor of the Month)

joined:Apr 9, 2011
posts:12693
votes: 244


TIA (The Internet Archive) doesn't actually belong to Amazon, does it? They just crawl from AWS ranges.
1:08 am on June 22, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 5, 2005
posts: 2038
votes: 1


I don't know about archive.org's crawling bases, AWS or otherwise. That said, a major "Institutional Supporter" of Archive.org is Alexa -- an Amazon company. [archive.org...]

And --

"Alexa's operation includes archiving of webpages as they are crawled. This database served as the basis for the creation of the Internet Archive accessible through the Wayback Machine.[7] In 1998, the company donated a copy of the archive, two terabytes in size, to the Library of Congress.[5] Alexa continues to supply the Internet Archive with Web crawls." [en.wikipedia.org...]

So I reckon Archive.org's data, tracking and stats get accessed by Amazon in some way, shape, or form; hence my POV: "Amazon's archive.org".

Sidling back to memorybot -- archivethe.net appears to have different connections.
 
