
Forum Moderators: bakedjake


Advanced Keyword Search at the Wayback Machine

     
2:10 pm on Sep 4, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member heini is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Jan 31, 2001
posts:4404
votes: 0


What a cool tool:
[recall.archive.org...]

It's an experimental app that searches the 11 billion documents stored at archive.org by keyword. It comes with date limiters, so you can search for pages on any subject within a specific timeframe. The ranking is content-based.

Additional features:
- graph displaying the number of pages over time
- related topics to further refine searches
- graph showing the popularity of the main related topics over time

3:19 pm on Sept 4, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 15, 2003
posts:169
votes: 0


That is an excellent tool. Thanks for that.

It doesn't seem to graph things if the URL pool is too small.

8:12 pm on Sept 4, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 8, 2002
posts:2012
votes: 0


Wow! That's a real new search engine, I guess! Even theme clustering, categories, topics ... I'm very impressed -> bookmarked! This is great news, heini - thanks for spotting it!

<added>
Oops, why does a search return my site although it's not listed at archive.org (since ia_archiver is disallowed from indexing it)? Are they using Alexa data? Hmm ...
</added>
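
For anyone wanting to double-check that their robots.txt really does shut out ia_archiver, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text and the example.com URL are made-up placeholders, not the poster's actual file, and the check says nothing about where the search tool gets its data.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the ia_archiver crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    url = "http://www.example.com/"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "ia_archiver", url))  # False
    print(is_allowed(ROBOTS_TXT, "Googlebot", url))    # True
```

If the first call prints True for your own file, the disallow rule is not doing what you expect.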

8:24 pm on Sept 4, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 31, 2003
posts:194
votes: 0


At last - I think they must have been planning this for a long time - it could be an absolutely brilliant and very useful tool.

Also, with 11 billion pages - kinda puts Google and FAST in their place :)

12:10 pm on Sept 7, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 8, 2002
posts:2012
votes: 0


>11 billion pages

I suppose this number includes every dated snapshot of a page, doesn't it? If so, I wonder how many "real" unique pages they indexed. The 11 billion pages could shrink to just a few 100k unique pages. I can't find a figure for unique pages on either archive.org or recall.archive.org.
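
To illustrate the difference between snapshots and unique pages, here is a small sketch. The (url, capture date) records are invented for the example and are not the archive's real data format or numbers; the point is only that repeated captures of one URL count once.

```python
# Hypothetical capture records: (url, capture date as YYYYMMDD).
snapshots = [
    ("http://example.com/", "20010101"),
    ("http://example.com/", "20020101"),
    ("http://example.com/", "20030101"),
    ("http://example.com/about", "20020101"),
]

def count_unique_pages(records):
    """Count distinct URLs, ignoring how often each was captured."""
    return len({url for url, _date in records})

if __name__ == "__main__":
    print(len(snapshots))                 # 4 snapshots in total
    print(count_unique_pages(snapshots))  # 2 unique pages
```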

9:31 am on Sept 8, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 11, 2003
posts:955
votes: 0


Great tool! Well done to the Wayback Machine :)

9:46 am on Sept 8, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member vitaplease is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Dec 11, 2001
posts:2725
votes: 0


This is amazing stuff, thanks Heini.

I can imagine it coming in handy for some copyright - who was first - disputes as well.

I can't seem to get their "before" date limiter working.
It always seems to show results up to April 2003?

I didn't know there was a Wayback forum either: [archive.org...]

Seems the Wayback Machine even has 30 billion pages - wonder why Anna Patterson limited herself to only 11 billion :)

7:14 am on Sept 13, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 8, 2002
posts:1157
votes: 0


I found my website all right, but it displays the URL as www.myurl.com:80

What is colon 80?

7:25 am on Sept 13, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Oct 1, 2002
posts:1580
votes: 0


>what is colon 80?

The port number. 80 is the default port for HTTP, so www.myurl.com:80 points to the same place as www.myurl.com.
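
A quick way to see the port in a URL is Python's standard urllib.parse. This is only a sketch: strip_default_port is a hypothetical helper name, and it deliberately ignores userinfo and IPv6 hosts to stay short.

```python
from urllib.parse import urlsplit

# Default ports for the two common web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}

def strip_default_port(url):
    """Drop the port from url when it is the scheme's default.

    Simplification: any user:password@ part is discarded and
    IPv6 hosts are not handled.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    port = parts.port
    if port is None or DEFAULT_PORTS.get(parts.scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    return parts._replace(netloc=netloc).geturl()

if __name__ == "__main__":
    print(urlsplit("http://www.myurl.com:80/").port)          # 80
    print(strip_default_port("http://www.myurl.com:80/"))     # http://www.myurl.com/
    print(strip_default_port("http://www.myurl.com:8080/x"))  # http://www.myurl.com:8080/x
```

A non-default port such as 8080 is kept, since dropping it would change which server the URL points to.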