My specific use: I want to make an index of school websites across North America. The top part would be my content, logo and state/province index. The bottom part would be the scraped content of the particular school's URL.
a) If more than 50%-60% of the total page content is my own original material, will it beat Google's/other SEs' duplicate content filters? (In other words, is there any benefit to doing this as far as becoming an authority site?)
b) Am I infringing on the copyright of the scraped websites? Is there any legal precedent for this?
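For question (a), the 50%-60% figure can at least be measured. Below is a minimal sketch that counts words, which is only an assumption about how "share of page content" might be defined; nothing in this thread establishes that any search engine uses such a threshold. The function name `original_share` and the sample strings are hypothetical.

```python
import re


def original_share(own_text: str, scraped_text: str) -> float:
    """Return the fraction of words on the combined page that come from own_text."""
    # Count word-like tokens in each part of the page (assumed metric: word count).
    own_words = len(re.findall(r"\w+", own_text))
    scraped_words = len(re.findall(r"\w+", scraped_text))
    total = own_words + scraped_words
    # Guard against an empty page to avoid dividing by zero.
    return own_words / total if total else 0.0


# Hypothetical sample content for the top (own) and bottom (scraped) parts.
own = "State and province index with our own notes on each school"
scraped = "Welcome to Example High School home of the Eagles"

share = original_share(own, scraped)
print(f"{share:.0%} of the page is original")
```

This only tells you what proportion of the words is yours; it says nothing about how a duplicate content filter would treat the page, or about the copyright question in (b).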
In the end, it has the same effect as displaying an external URL via a frameset: you are displaying someone else's site in your own context. I can't see a judge caring whether the page is assembled by the browser or by the server.
And what about Hotmail's/Ask's method of keeping their frame on top when you hit an external link? As long as I followed their format and didn't place advertising, it could hardly be considered 'profiting' off the other site.
Also, frames can have issues when it comes to search engine rankings. Be advised.
A competitor's doing well in the SEs at this moment is no guarantee that the practice is a good one, nor that the competitor will continue to do well in the SEs.
Finally, using a frameset, which you mention here, is a completely different animal than 'screenscraping.' Framesets are highly questionable when it comes to copyrights - and screenscraping is strictly prohibited, beyond what could be considered 'fair use' (which is very, very limited, and very, very shaky ground).
Best practices: get permission for whatever you do. You are probably not in a position to compare yourself to the search engines you cite; and if you think you are, make sure you provide very, very explicit grounds for this.
I have decided that, as with most shaky black hat stuff, I'd rather sleep well by making a 100% legit site with harder work that's guaranteed not to get kicked for illicit practices.