Forum Moderators: not2easy
Basically the content would be shown as one image, or a series of images, to regular users, while major spiders such as Google, Yahoo!, and MSN would see the text content. This would also require a noarchive meta tag on the content page. Since this would involve IP-address cloaking, there'd be little chance of your content being stolen.
Spammers and such are lazy, thus they wouldn't want to put the time, energy, or effort into typing your content out in order to steal it. This would also help avoid duplicate content penalties.
I'm thinking about developing a program where webmasters and content writers could paste in their content, have it automatically converted to an image, and then upload it to their website. What do you guys think? Does something like this already exist that I don't know about, or is this "original"? I wonder what GoogleGuy would think of this.
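For anyone wondering what the IP-based switch described above might look like, here is a minimal sketch. It is only an illustration under stated assumptions: the network ranges are reserved documentation addresses standing in for crawler IPs (not real search engine ranges), and the names `CRAWLER_NETWORKS`, `is_crawler`, and `render_page` are made up for this example.

```python
import ipaddress

# ASSUMPTION: placeholder ranges (RFC 5737 documentation networks).
# These are NOT real search engine addresses.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# The noarchive tag mentioned in the post, so no cached text copy is exposed.
NOARCHIVE_TAG = '<meta name="robots" content="noarchive">'


def is_crawler(ip: str) -> bool:
    """Return True if the address falls inside one of the listed networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CRAWLER_NETWORKS)


def render_page(ip: str, text_html: str, image_url: str) -> str:
    """Serve text to listed crawler IPs and an image to everyone else."""
    if is_crawler(ip):
        body = text_html
    else:
        body = f'<img src="{image_url}" alt="">'
    return f"<html><head>{NOARCHIVE_TAG}</head><body>{body}</body></html>"
```

Calling `render_page("203.0.113.5", "<p>article</p>", "/article.png")` returns the image version, since that address is not in the placeholder list. Keeping a real crawler list accurate is the hard part and is not covered here.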
My plan is to serve the text content to users in the US, and the same content in image format to visitors outside the US. My thinking is as follows:
1. Search engine crawlers tend to come from US-based IP addresses.
2. Search engines seem to be more tolerant of geo-targeted content.
3. If a US-based visitor republishes the text content, it is likely to be hosted on a US-based network and therefore relatively easy to get removed by sending a DMCA takedown notice to the host or upstream provider.
Regarding tools to convert the content to images, I plan to load the HTML in MSIE, then use an off-the-shelf screen capture program with scrolling capture to convert it to an image. I have not yet decided on the image file format (JPEG, GIF, or PNG).
Are you creating content purely for your own amusement, or do you expect visitors to the site to hang around and read it?
Text-based sites are generally quite fast to load; images load slowly. Bear in mind that there's a whole world out there outside the US, and many internet users are still on 56k or less. Surfers who aren't on broadband aren't going to waste their precious time or money waiting for a site whose "text" is contained in slow-loading images.
Serious copyright thieves will take the time to retype your text if they consider they'll gain from it, and frankly, I'd rather chase the few who do steal my content than **ss off the rest of the visitors who want to view my site to the degree that they won't come back.
>>Spammers and such are lazy, thus they wouldn't want to put the time, energy, or effort into typing your content out in order to steal it.
quite a lot of spammers scrape snippets of your site directly from the serps, so they wouldn't even visit your site anyway.
good idea though. i think protecting copyright is going to be one of the big issues in the future, but let's face it, the huge media (movie, tv and music) companies are spending quite a bit trying to prevent piracy right now and i'm not sure how successful they are being.
The visitors to many content sites are seeking the following:
1. Free content that answers their most pressing questions.
2. An environment that provides that content in a user-friendly manner, and that means pages that don't take forever to load.
Unfortunately, there are trade-offs between the two. A site with no content protection requires resources to chase down content thieves, and those resources could be spent developing more content to answer visitors' most pressing questions.
In the early years of a site, it may be feasible to track down infringers one by one to get the stolen content removed. As the site traffic grows, this task begins to consume considerable time. For example, consider a site with one million monthly visitors. Assume that 0.01% of those visitors decide that they like the content so much that they want to put it on their own websites. Much of the time, their intent is not malicious; they are simply ignorant of copyright law. That 0.01% amounts to 100 people per month who republish the site's content. It would be a full-time job to get those lifted copies taken down. For a one-person operation, that leaves no time to develop new content. Therefore, it may be in the best interest of the site visitors to prevent the content from being stolen, even if that means slower-loading pages, since the resources that would have been spent dealing with content thieves could be spent developing more content.
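The arithmetic behind the "full-time job" estimate can be sketched as follows. The visitor count and the 0.01% rate come from the example above; the 1.5 hours per takedown is purely an assumed figure for illustration, and the function names are hypothetical.

```python
def monthly_copies(visitors: int, rate_percent: float) -> int:
    """Number of visitors per month who republish the content."""
    return round(visitors * rate_percent / 100)


def takedown_hours(copies: int, hours_per_takedown: float) -> float:
    """Total monthly effort, given an ASSUMED time cost per takedown."""
    return copies * hours_per_takedown


copies = monthly_copies(1_000_000, 0.01)  # 100 republished copies per month
hours = takedown_hours(copies, 1.5)       # 1.5 h per takedown is an assumption
print(copies, hours)
```

With that assumed 1.5 hours per case, 100 copies come to 150 hours a month, which is close to a full-time workload. A different per-case estimate scales the total proportionally.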
The load time of the images may be manageable. For example, one can implement the first few paragraphs of the page in text format and then the remainder of the article as an image so that the visitor can begin reading the article while the image is loading.
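A minimal sketch of that split is below, assuming the article is plain text with paragraphs separated by blank lines. The function name `split_article` and the default of two lead paragraphs are illustrative choices, not something settled in this thread.

```python
from typing import Tuple


def split_article(text: str, lead_paragraphs: int = 2) -> Tuple[str, str]:
    """Split an article into a text lead and a remainder to render as an image.

    Paragraphs are assumed to be separated by blank lines.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    lead = "\n\n".join(paragraphs[:lead_paragraphs])
    rest = "\n\n".join(paragraphs[lead_paragraphs:])
    return lead, rest


lead, rest = split_article("Intro.\n\nSecond.\n\nThird.\n\nFourth.")
# lead is served as normal text; rest would be handed to the image converter.
```

The lead is what the visitor reads while the image of the remainder loads. An article shorter than the lead length simply comes back with an empty remainder.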
Finally, the image itself would not be so large. If the HTML for the content of a particular article is 15 KB, its image version would be 90-100 KB. While a factor of 6 or 7 increase may seem large, there are many high-traffic sites out there with home pages that are 500 KB or more that take several minutes to load over dial-up, so a 20-30 second load time may be tolerable.
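As a rough sanity check on those load times, transfer time is just size divided by throughput. The sketch below assumes an effective dial-up throughput of 40 kbit/s, which is a guess at what a nominal 56k line delivers in practice and not a figure from this thread; `load_seconds` is a hypothetical helper.

```python
def load_seconds(size_kb: float, throughput_kbps: float) -> float:
    """Transfer time in seconds.

    size_kb is in kilobytes (1 KB = 1024 bytes);
    throughput_kbps is in kilobits per second (1 kbit = 1000 bits).
    """
    return size_kb * 1024 * 8 / (throughput_kbps * 1000)


ASSUMED_DIALUP_KBPS = 40  # assumption: effective rate on a 56k modem

text_time = load_seconds(15, ASSUMED_DIALUP_KBPS)    # 15 KB HTML version
image_time = load_seconds(100, ASSUMED_DIALUP_KBPS)  # 100 KB image version
print(round(text_time, 2), round(image_time, 2))
```

Under that assumption the 100 KB image takes about 20 seconds, in line with the 20-30 second estimate, against roughly 3 seconds for the 15 KB text version. A slower real-world connection pushes both numbers up proportionally.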
The underlying assumption of all this is that images would be served to visitors in countries that don't have the equivalent of a DMCA and that search engines do not ban sites that serve different pages based on visitor country.
Topr8, I don’t believe you can consider this traditional cloaking, since all that is being done is changing the delivery format of the content. The page should visually stay the same.
>Serious copyright thieves will take the time to retype your text if they consider they'll gain from it
Malachite, with all due respect, I don't believe content thieves are serious enough to retype thousands and thousands of characters word for word from images. If content thieves go to that trouble, they should be smart enough to rephrase the stolen content in their own words or, at the very least, change the content so it’s not blatantly identifiable as stolen, which would in turn become a matter of plagiarism, not content theft per se. Whether or not the stolen content is altered enough to evade duplicate content penalties is a different matter altogether.
this is automated, not hand edited by the search engines, so it would trigger their cloaking filters (if they exist)
>>Malachite, with all due respect, I don't believe content thieves are serious enough to retype thousands and thousands of characters word for word from images.
agreed of course, although there is software that will do it for you very quickly, especially if you're not too worried about the odd error.
i'm sure a spammer wouldn't bother to even use software though.
Regarding the effectiveness of the image presentation of content against theft, it is not necessary to make it 100% secure. If the protection scheme reduces content theft occurrences by 90%, then it would be of great benefit.