Linking to News Sites. ...articles disappear. NickCoons
I have a niche-topic news aggregator site that posts links to articles and then allows user comments, Slashdot-style.
However, it seems the articles that I link to tend to disappear after a few weeks or so. Any idea why that may be?
I was wondering if I might be able to cache the articles, similar to the way Archive.org does (assuming the site doesn't forbid it with robots.txt) and link to the cache so I know that I'm always linking to a working version of the article.
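For the robots.txt part of that question, here is a minimal sketch of how the check could look in Python, using the standard library's `urllib.robotparser`. The bot name `ExampleAggregatorBot`, the `news.example.com` URLs, and the sample rules are made up for illustration. It only answers whether a site's robots.txt permits a crawler to fetch a URL; it says nothing about whether republishing a cached copy is allowed.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines, so no network
    # request is made here; in practice you would download the file first.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for an imaginary news site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /archive/
"""

if __name__ == "__main__":
    bot = "ExampleAggregatorBot"  # hypothetical crawler name
    for url in (
        "https://news.example.com/archive/story.html",
        "https://news.example.com/2024/story.html",
    ):
        print(url, may_fetch(SAMPLE_ROBOTS, bot, url))
```

With the sample rules, the `/archive/` URL is reported as disallowed and the other one as allowed.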
Articles on AP partner news sites are generally taken down after 14 days by contract. Only AHN allows 180-day archiving, I believe.
Although you can cache (copy) their content, you probably shouldn't.
1: You generally violate copyright when you cache. Are you an AHN, AP, Reuters, or AFP licensee? If not, get permission. News organizations register their copyrights, and that means statutory damages.
2: If the article gets changed or retracted and you fail to update your copy, you can be liable for libel, or even worse. Lawyers, specifically lawyers for celebrities, live for this.
So how does something like the Wayback Machine get away with caching the entire internet? Certainly they obey robots.txt, and I would think that would be the only criterion.
Or is this one of those "I'd probably be right if I followed robots.txt, but I could get dragged through court by lawyers and it wouldn't be worth it" sort of situations?
The Internet Archive at archive.org (aka the Wayback Machine) is entirely non-commercial; even so, they still get sued occasionally.
You can almost guarantee a suit if your site is in any way commercial in nature or seeks any revenue.
You should read the FAQ and news section.