Forum Moderators: not2easy
Further to what Lisa has said, in my opinion, if you intend to continually source information from the same site and display it on your own, you should get direct permission to do so.
This is opposed to copying the odd line of text from a website and referencing the source.
The question is, if you were them, would you let anyone else do it?
Best to check the company's or site's specific guidelines on this.
But 'scraping' a page from a site without their knowledge may well be illegal - you are assuming they want their content to be scraped. With RSS they can state exactly what content they want used - title, abstract, URL etc. With your own scraping they can't. In short, you should ask them first. Most sites indeed change their HTML regularly to 'break' such scraped feeds.
'Deep linking' and 'framing' are separate issues on which there is considerable debate right now.
And remember that search engines generally don't like duplicate content, and may penalise sites that carry it, for entirely different reasons relating to the efficacy of their search service.
Basically, many sites want you to feature their headlines. Clicking on a headline on your site takes the visitor to their site. So to me the rule is: if they have an XML RSS feed (and there are thousands, if you do a search), you can use it by parsing it and displaying it. There are many tools (and simple code) for parsing and displaying RSS content. But if you are scraping their content or displaying it without their knowledge, you should contact them for permission.
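To give an idea of how little code the RSS route needs, here is a minimal sketch in Python using only the standard library. The feed text, the example.com URLs and the function names `parse_headlines` and `render_headlines` are made up for illustration; it assumes a plain RSS 2.0 feed with no XML namespaces, so an RSS 1.0 (RDF) feed would need namespace-aware lookups instead.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample feed. In practice this text would come from the
# publisher's own feed URL, which is the content they have chosen to share.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
      <description>Short abstract of the first story.</description>
    </item>
    <item>
      <title>Second &amp; last headline</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_headlines(xml_text):
    """Return a list of dicts with the title, link and description of each item."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None when the element is missing, hence the "or".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def render_headlines(items):
    """Build a simple HTML list in which each headline links back to the source site."""
    lines = ["<ul>"]
    for it in items:
        lines.append('  <li><a href="%s">%s</a></li>' % (
            html.escape(it["link"], quote=True),
            html.escape(it["title"]),
        ))
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_headlines(parse_headlines(SAMPLE_FEED)))
```

Note that this only displays what the publisher put in the feed (title, link, abstract), and every headline links back to their site, which is the whole point of using the feed rather than scraping.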
We provide several RSS feeds (WebmasterWorld has one too) and we also display other sites' RSS content using several scripts, so we can see the arguments on both sides.
[edited by: chiyo at 7:00 am (utc) on June 7, 2002]
We have had many people copying large blocks of our text, some crediting us and some not. Our original content represents a major amount of work for our researchers, columnists and authors. We are NOT happy with people copying content like this.
We do provide a clear copyright policy on our site that permits reproducing a certain percentage of the text on a given page, up to a certain number of pages, with proper citation (not hidden text, not micro text, etc.!).
I would approach the problem from the opposite direction. If there is no clear copyright statement, you must check with the site. I would not assume anything except that original content is copyright, no matter how it is published and with or without the copyright symbol. The exception is reproduction of small sections for critical or satirical review, and even then it must be sourced. That's the general (though far from set in concrete) understanding in publishing worldwide, on or off line.
There are loads of free news feeds available. 7am.com is one that springs to mind, and if you run a non-profit site, Moreover will supply a feed for free.
aspdaddy