BruinBot

         

volatilegx

6:49 pm on Sep 2, 2004 (gmt 0)




"BruinBot (+http://webarchive.cs.ucla.edu/bruinbot.html)"

From 131.179.64.187, creating a searchable web archive, letting you view pages as they change over time.

Interesting idea.

wilderness

7:06 pm on Sep 2, 2004 (gmt 0)




thanks Dan.

Their URL doesn't offer much of an explanation under the guise of research:

"BruinBot is the crawler that we have developed here at UCLA, and we use it to download parts of the Web which are important for our research."

Nor does the page include a name to use for robots exclusion.
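
For anyone who wants to try an exclusion anyway, here is a minimal sketch using Python's standard urllib.robotparser. The token "BruinBot" is only a guess taken from the user-agent string; the bot's page does not say which name, if any, it honors. The helper name is_allowed and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: "BruinBot" is guessed from the user-agent string quoted in this
# thread; the crawler's own page does not document a robots.txt token.
ASSUMED_TOKEN = "BruinBot"

ROBOTS_TXT = f"""\
User-agent: {ASSUMED_TOKEN}
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if these robots.txt rules permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URL, used only to exercise the rule.
    page = "http://example.com/page.html"
    print(ASSUMED_TOKEN, "allowed:", is_allowed(ROBOTS_TXT, ASSUMED_TOKEN, page))
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

This only checks how the rule would be read by a parser that follows the exclusion convention. Whether the crawler actually requests robots.txt and obeys that name is something only the server logs can show.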

Even the informational links at page bottom have yet to be added.

If, however, you go up a single directory in their structure, you get:

[webarchive.cs.ucla.edu...]

From here down (Web Research Platform:), the folks here are attempting to cover a very broad spectrum of uses.

Seems to me personally that the only difference between this project and archive.org is grants :(

Don

wilderness

7:13 pm on Sep 2, 2004 (gmt 0)




131.179.64.207 - - [25/Nov/2003:23:17:39 -0800] "GET / HTTP/1.0" 200 9398 "-" "PanopeaBot/1.0 (UCLA CS Dpt.)"

131.179.64.177 - - [31/Dec/2003:20:19:10 -0800] "GET /robots.txt HTTP/1.0" 200 2439 "-" "Pita"

Looks to me like they have a few bots going?
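
One way to check this against your own logs is to group user agents by client address. The sketch below assumes Apache combined log format with one entry per line, and it assumes the addresses seen in this thread all sit in 131.179.64.0/24; that range is inferred from the three IPs posted here, not from anything UCLA has published. The function name agents_by_ip is made up for illustration.

```python
import re
from collections import defaultdict
from ipaddress import ip_address, ip_network

# ASSUMPTION: the /24 is inferred from the addresses posted in this thread.
SUBNET = ip_network("131.179.64.0/24")

# Apache "combined" format: ip ident user [time] "request" status size "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def agents_by_ip(lines, subnet=SUBNET):
    """Map each client IP inside `subnet` to the set of user agents it sent."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_RE.match(line.strip())
        if not match:
            continue  # skip lines that are not in combined format
        try:
            ip = ip_address(match["ip"])
        except ValueError:
            continue  # the host field was a hostname, not an address
        if ip in subnet:
            seen[str(ip)].add(match["agent"])
    return dict(seen)


# The two entries quoted above, each joined onto one line.
sample = [
    '131.179.64.207 - - [25/Nov/2003:23:17:39 -0800] "GET / HTTP/1.0" 200 9398 '
    '"-" "PanopeaBot/1.0 (UCLA CS Dpt.)"',
    '131.179.64.177 - - [31/Dec/2003:20:19:10 -0800] "GET /robots.txt HTTP/1.0" '
    '200 2439 "-" "Pita"',
]

if __name__ == "__main__":
    for ip, agents in sorted(agents_by_ip(sample).items()):
        print(ip, sorted(agents))
```

Fed a full access log instead of the two sample lines, this would list every user agent seen from that range, which is enough to tell whether several differently named crawlers share the same machines.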

Lord Majestic

7:26 pm on Sep 2, 2004 (gmt 0)




"Their URL doesn't offer much of an explanation under the guise of research"

Sometimes in research one does not know exactly what one will come up with: people take a bunch of interesting things and then attempt to do something that might be of remote use. All those tries might be of zero value; it's like trying to find a pearl in sand. Naturally they aren't getting into explanations of what exactly they are doing, because it's entirely possible that at this stage they are not doing anything useful, but they just might become the next Google, or Teoma.

It's not like they are borrowing your car or your computer anyway: they access a public website, and it's not a crime to do so. As volatilegx says, it's an interesting idea. They will have to store lots of pages, and it's possible that they will come up with really good incremental compression for that need. This is how Viagra was produced: researchers were looking for something completely different (and were failing).

pleeker

8:05 pm on Sep 2, 2004 (gmt 0)




BruinBot's been hitting me pretty hard lately. Must say their stated goal (viewing pages over time) sounds no different from what archive.org is doing, but maybe they'll do it better...