webarchive.org

don't they need permission?

         

Emperor

7:30 pm on Oct 11, 2004 (gmt 0)

10+ Year Member



Hi guys,

A few of my older sites are listed in their wayback archive. I never gave them permission to do that. How dare they copy my pages and store them on their own servers.

Just because my sites were public doesn't mean anyone should be able to copy and store the code without my permission. I wrote all that HTML by hand, it belongs to me.

Do you think they will remove my sites if I ask them? What do you guys think about the whole thing?

Take care,
Emperor

mack

7:41 pm on Oct 11, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You can block the bot IE_Archiver. I have found that if you block the bot, users can no longer see any archived copies. It just tells them the site owner has blocked access via robots.txt.

It also lets the user view the robots.txt file. I don't know why.

Mack.

choster

7:46 pm on Oct 11, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Archive.org will remove from its index any page to which you own the copyright. As far as copyright law is concerned, the provisions are somewhat hazy in the world of electronic publishing, but they believe it falls under "fair use." Have you followed the court cases surrounding the Google Cache?

HughMungus

7:57 pm on Oct 11, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



they believe it falls under "fair use."

They could not be more wrong.

photon

8:25 pm on Oct 11, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Be sure to notify Google as well. :)

It's a long-running debate that still hasn't been put to a legal test, AFAIK.

topr8

8:45 pm on Oct 11, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's a grey area, but they do obey robots.txt. You could have blocked them years ago.

ogletree

9:15 pm on Oct 11, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Trust me, if you want out, they will take you out. Put them in your robots.txt and it will erase your pages from the archive forever. Only do this if you don't ever want to be there. I like being in there. I deleted a page one time and that was the only copy. It is a nice backup.

photon

12:08 am on Oct 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've got nothing against Google. It's just that I'm wondering why it's okay for Google to cache your page without explicit permission, and yet it's not okay for webarchive.org to do the same.

I think quite a few people here who love Google would be upset with me caching their web sites so that I could make money off of them.

vkaryl

1:47 am on Oct 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



ogletree: I'm with you. Yes, my stuff on the waybackmachine is pretty embarrassing. But that's okay. I'm beyond that now, and I like knowing that there IS a backup out there somewhere....

mikec

3:48 am on Oct 12, 2004 (gmt 0)



If you have a problem with Google keeping a cache of your page, let them know. I'm sure they won't have a problem removing you from the index.

keyplyr

5:49 am on Oct 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




You can block the bot IE_Archiver

I think that would be ia_archiver ;)
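
For anyone who wants to check a block rule before deploying it, here is a minimal sketch of the robots.txt approach discussed above. It uses the ia_archiver user-agent name from this thread; the example.com URL and the is_allowed helper are illustrative assumptions, and the check only shows how a standards-following parser reads the rule, not what archive.org itself will do.

```python
from urllib.robotparser import RobotFileParser

# robots.txt content that blocks only the archive crawler named in this thread.
# Serve this as /robots.txt at the root of the site you want excluded.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for local testing; it parses the text directly
    instead of downloading it from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/index.html"  # placeholder URL (assumption)
    print(is_allowed(ROBOTS_TXT, "ia_archiver", url))  # False: blocked
    print(is_allowed(ROBOTS_TXT, "Googlebot", url))    # True: no rule applies
```

Other crawlers are unaffected because the rule names a single user-agent. Python's parser matches the name case-insensitively, but a misspelled name such as IE_Archiver will not match the crawler at all.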

longen

11:46 pm on Oct 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think we should leave our sites in the archive. A hundred years from now, academics will study them to trace the development of the internet. Over centuries, everything becomes important.
Imagine the excitement if the British Library discovered a 4th century illustrated pornographic manuscript in their archives.

vabtz

12:47 am on Oct 15, 2004 (gmt 0)



Well, since you wrote it by hand, I can totally understand why you would like your HTML not cached.

It must be really special.

/sarcasm

encyclo

1:13 am on Oct 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A few of my older sites are listed in their wayback archive. I never gave them permission to do that. How dare they copy my pages and store them on their own servers.

We had another thread about archive.org recently. The reason why they can is because you gave them explicit permission to do so when you published the pages in a public arena with no restriction.

Just because my sites were public doesn't mean anyone should be able to copy and store the code without my permission.

If I visit your site - which you have made public - I am downloading your HTML and images and storing them locally in my browser cache. You have given me permission to do this, because you have published your site.

it belongs to me.

Yes, it does - but I've still got a copy with your permission.

Having said all that, I banned the ia_archiver years ago without the slightest regret. It's hardly a substitute for proper backups to start with, and I don't consider archive.org to be particularly legitimate in its claim to be the web "library". They are a self-appointed, private organization which has proclaimed itself the saviour of web history, and has previously made unfounded claims of "rights" over material such as obsolete Google crawls. They are also not averse to publicity stunts, but have failed to live up to their self-proclaimed role as guardians of the freedom of access to information.

That an archive should be kept I don't disagree, but I'm not sure that archive.org should be the ones to do it.

netguy

1:33 am on Oct 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually, I get a kick out of going back to some of the old sites. It's like looking at a time capsule. Last year I took a look at several sites I designed back in 1996... urggg, gray backgrounds, horrible graphics, and one even had a marquee!

Now I'm happy to have it archived in case of any dispute over who designed something first, without having to get my attorney involved.

Steve

Rosalind

9:53 am on Oct 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




We had another thread about archive.org recently. The reason why they can is because you gave them explicit permission to do so when you published the pages in a public arena with no restriction.

...

If I visit your site - which you have made public - I am downloading your HTML and images and storing them locally in my browser cache. You have given me permission to do this, because you have published your site.

That's explicit permission to make a copy for one's own use, as opposed to the permission to republish something. The difference is significant.

Both Wayback and Google are republishing content without asking for explicit permission, and making it possible to opt out does not justify this. They are assuming that everyone has heard of them and read their rules, which isn't the case. It's only because most webmasters find them convenient that they are allowed to continue.