Forum Moderators: open
[pages.alexa.com...]
Would Google go down this route?
Their archive is a hugely valuable asset
Which they do not own... It would be the equivalent of painting a giant red target on their foreheads - they would be caught in a media <snip>storm before they could even think about removing the links to the press release.
Why should a profitable and private company like Google take such a silly risk for a return which, even on a best-case forecast, would be minimal? Who is seriously interested in buying the Alexa data? I don't think anyone is breaking down their doors to buy it...
This, to me, must be sorted fast. It demeans any information on the web and basically says that when you publish on the web you lose reproduction and distribution rights to any smart cookie with a writable CD and a marketing network.
Same as copying MP3s on the net. It may be easy to do. You may not like recording companies. You may think artists are paid too much. But in the end, it's plain old robbery and thievery, no matter what excuses you use.
This, to me, must be sorted fast. It demeans any information on the web and basically says that when you publish on the web you lose reproduction and distribution rights to any smart cookie with a writable CD and a marketing network.
I'm not sure that 600 terabytes of data can quite be compared to that!
I'm just playing devil's advocate, and in truth, really don't see what the fuss is about. Websites are available to view on the internet, for free, by the public, by going to a URL.
It seems to me that Alexa is offering that ability to "view" on disk.
The analogy to MP3s and audio doesn't stack up to me. It's different - people make CDs to distribute for sale in a shop. Someone takes that product and, against the terms of the licensing arrangement under which the user is entitled to use that media, copies it to MP3 and puts it out for public consumption for free. It's not the same. The original intention of the author was that the product was for sale in one form of media, not available for free in another. It's the same with "paid for" MP3s - the intention was to charge.
How can we actually claim that we have *lost* anything? We have not, as far as I can tell, lost anything. To have a reasonable chance of success for an action in tort (certainly in the UK anyway) you have to have suffered loss.
All that's happening is these websites are being distributed on a different form of media for a charge.
I liken it more to a CD manufacturer's ability to distribute freeware on a CD and charge for that CD.
But at the end of the day, why do you feel you are being harmed?
TJ
(I don't work for Alexa(!) and I'm really just curious. Sorry if I'm being stupid and completely missing the point but I just don't see it!)
Correct. They are not available for copying and redistributing, for profit or otherwise, as per the general terms of the Berne Convention for any published work, and as is made clear in copyright statements on millions of websites.
>>60 terabytes?
No, Alexa will kindly make you up a custom disk of anything from a few websites to millions, according to its spin.
This is theft, no two ways about it. People can read my website till kingdom come. They can print or download it for personal use. But copy it and give it to all your mates on disk? No way. To allow any other privileges will be the end of free, useful info on the web as we know it.
I mean, webmasters can prevent their content from being indexed by Alexa (just as the <META NAME="ROBOTS" CONTENT="NOARCHIVE"> tag prevents caching by Google), but how can a webmaster remove existing content from Alexa's index?
One more good reason to disallow ia_archiver [pages.alexa.com] IMHO.
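For anyone who wants to double-check their own rules, here is a rough sketch (not anything Alexa publishes) that uses Python's standard urllib.robotparser to see whether a robots.txt blocks ia_archiver while leaving other bots alone. The robots.txt text, the example.com URL, the "SomeOtherBot" name and the is_allowed helper are all made up for illustration. It only tells you what the file says, not whether a given crawler actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block ia_archiver everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and SomeOtherBot are placeholders, not real targets.
alexa_allowed = is_allowed(ROBOTS_TXT, "ia_archiver", "http://www.example.com/page.html")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://www.example.com/page.html")
print("ia_archiver allowed:", alexa_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Swap in the text of your own robots.txt to test it before uploading.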
OK, I can see that.
I'm still confused as to what the loss actually is.
If you feel there is one, then maybe we should all be suing Google for charging Yahoo! a fee for our websites' content in its search engine?
Or are Google allowed to do it because they're Google?
TJ
Any ideas?
suing Google for charging Yahoo! a fee for our websites' content
Not at all. Google charges Yahoo! for work Google has done, from which Yahoo! benefits.
Alexa will charge for 'a service', i.e. handing out work that we have done, with no likely direct benefit to us.
It's the principle.
My site is free to view to anybody and those interested can download a compact version in ebook format without charge.
Alexa has no right to charge for my content.
[edited by: Staffa at 12:48 pm (utc) on Aug. 6, 2003]
over 3.5 billion unique URLs, 3 billion unique pages, all updated every 60 days
60 days is an eternity on the web. Why would anyone be interested in this?
although they do add this:
Special collections may be created on request and updated as often as needed.
They say historians etc. may be interested, but why, when you have Google at your fingertips?
ADD IN
Can someone advise what the bot is that collects info for the Wayback machine?
When you take a graphic down, they won't have it anymore either (I don't know if they save graphics as well, I'd think not), and lots of little red x's may make people perceive your website as sloppy or broken if they're viewing it through the Alexa archive.
Your "money pages" won't work! Most of us have websites so that people can come and buy stuff - the payment pages, dynamic product pages and all the "money pages" won't work - so they are taking away the ethos of most people's sites.
If you've updated your webpage with new logos, pricing information or taken something down which had incorrect information - then you won't want people to see the old stuff - but now they can...
Just a few issues.
If you feel there is one, then maybe we should all be suing Google for charging Yahoo! a fee for our websites' content in its search engine?
The huge difference is that Google does not sell content; just search results.
If you don't want your files to be archived by Google, you can disallow Googlebot and/or Googlebot-Image and/or use the NOARCHIVE meta tag attribute. You can even ask Google to remove one or more URLs from their index [google.com]. The latter simply doesn't seem possible with Alexa.
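If you want to confirm that a page really carries the NOARCHIVE directive, something like the following could do it. It is only a sketch using Python's standard html.parser; the RobotsMetaParser class and has_noarchive function are names invented here, and it just inspects the markup you feed it. It says nothing about how any search engine treats the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def has_noarchive(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with NOARCHIVE."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


sample = '<html><head><META NAME="ROBOTS" CONTENT="NOARCHIVE"></head><body></body></html>'
print(has_noarchive(sample))
```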
Copying or otherwise reproducing copyrighted material for profit requires written permission from the copyright owner. You don't have to ask permission if you're reproducing something you have access to for private personal use (e.g., printing or saving a web page to your hard drive). But to put that web page on a diskette and charge someone for it with no authorization from the copyright owner is simply illegal according to the current international IP protection legislation.
Something puzzles me here. Alexa search is powered by Google, so why, apart from use at the archive site, is Alexa crawling our sites in the first place?
So that they can (legally) store and (illegally?) reproduce and sell our content.
60 days is an eternity on the web. Why would anyone be interested in this?
although they do add this:
Special collections may be created on request and updated as often as needed.
They say historians etc. may be interested, but why, when you have Google at your fingertips?
Data mining is the answer.
Once you have a 3.5 billion URL database, you can extract just about any sort of valuable (=marketable) information from it: stats, correlations, etc.
Example of what a "special collection" request might look like:
"Please extract 20,000,000 UK corporate e-mail addresses from your web archive and burn them on a CD. I will use them for unsolicited commercial emailing (since those addresses are freely available on the Web). I'll pay big bucks for that stuff."
And you know, e-mail addresses don't change every 60 days. ;)
Can someone advise what the bot is that collects info for the Wayback machine?
ia_archiver [pages.alexa.com]
It's on WebmasterWorld's banned bots list [WebmasterWorld.com].
...and why people download their toolbar and follow their stats still amazes me.
I would bet 90% of them are webmasters/SEOs checking their competition.
We have put together our website for others to view. The more people that view it the better, through whatever medium, as long as we still get credit. Why the fuss that Alexa or even Google's cache offer the services they do is beyond me. It's the internet, for crying out loud.
Both Alexa and Google still give credit to the owner of the website. I can understand about being upset with people who steal pictures, audio, text, etc. without giving credit to the original but these complaints about someone's property being copied in the way that Alexa and Google services do, sound like a bunch of people crying to me.
I'm sick of reading about it, but had to give my opinion at least once <g>.
these complaints about someone's property being copied in the way that Alexa and Google services do, sound like a bunch of people crying to me.
I'm not crying at all, HayMeadows, nor am I complaining. Just telling facts as they are: there are copyright laws, and there are copyright infringements.
If you don't care that Alexa may (illegally?) sell whatever information it can extract from your web site, that's fine.
The quality of information available on the Web varies greatly. Most of it is free. But even free information may have strategic value when extracted and processed on a very large scale. Many of us are just concerned about what a third party like Alexa may or may not do (legally speaking) with the information that we publish on our web sites.
That said, everyone is free to allow ia_archiver to crawl their site if they so wish.
In terms of the law, I believe that Google's cache copy is fundamentally the same as what Alexa is selling here. Google is redistributing content by displaying the cache link and making such links available independently of the original website, at the rate of hundreds of millions of links a day. They're doing it to make money. Legally, I don't see much difference between Alexa and Google here. This latest move by Alexa simply makes it more essential to get the copyright issue into court.
If there's a market for the web on disks, you can bet Google will get into it. Everyone says, "Google would never do that." Next thing you know, Google has grabbed all of your images (June 2001) and only later tells you that they're doing this. Everyone scrambles to move their images to disallowed directories. "Google would never sell out." Now Google is the world's largest ad agency. "Google would never tamper with PageRank." Then Google tells the judge that PageRank is just their opinion of a page, and they have First Amendment rights to do whatever they want with a site's ranking, and you better believe we zapped SearchKing, Your Honor.
Google is insensitive to privacy issues. They'd do it in a heartbeat if there was money in it. And I think there could be big money in it. Intelligence agencies, advertising agencies, trend-spotting gurus, etc., would love to have the web on disk combined with cool data mining software, and maybe some data visualization toys also. Lots of fun for those who can afford it.