|Where is the original content tag?|
A new HTML attribute value that identifies the original article
The quote below is from the Caffeine thread
|In 1 query I was just able to find a mashup site with over 1.9 million indexed pages, all 100% copied from everywhere else online including hotlinked images. |
Many of us are frustrated by the fact that our articles are copied or stolen, and it's even worse when the copy site is claiming authorship of our articles.
Can someone explain why they can't, won't or shouldn't create a new HTML attribute value to ID original material as opposed to copied material?
We have robots.txt to help guide the SEs, and sitemaps to help guide the SEs. The XML sitemaps even show pub dates, and they still will give the copied site the top SERP spot because whoever has the most links still rules. They have the keywords in the domain and they have a lot of links, so they must be impooooooortant.
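For reference, the pub date an XML sitemap carries is the `<lastmod>` element of the sitemaps.org protocol. Below is a minimal Python sketch of reading it. The example.com URL and the date are made up for illustration, and nothing here reflects how any search engine actually weighs the value. Note that the date is simply whatever the site owner wrote into the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap for illustration; the URL and date are invented.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/my-article</loc>
    <lastmod>2009-08-15</lastmod>
  </url>
</urlset>"""

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_lastmod(sitemap_xml):
    """Return {url: lastmod} for every <url> entry that declares a <lastmod>."""
    root = ET.fromstring(sitemap_xml)
    dates = {}
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc and lastmod:
            dates[loc.strip()] = lastmod.strip()
    return dates


print(read_lastmod(SITEMAP))
```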
We use robots.txt and sitemaps because SEs/AI can't do what they are designed to do without some guidance. It seems they need even more, so why not a tag that IDs yours as the original article?
Eventually, low-quality scraper/copy/thief sites would get filtered out as having little or no original material, and the original would be presented in the SERPs.
There are already methods of handling copyright infringement. You can file a DMCA request and the offending webpage will be removed from the serps.
The problem with creating a new tag is that many people won't adopt it. Even if everyone starts using this brand new tag, what is to stop the offending site from applying a copyright tag to their page? The search engine is going to see you both have copyright meta tags and we are back to where we are now.
Thanks for the reply Goodroi. I'm sure there are things you haven't mentioned and I still haven't considered that might make the whole tag idea impossible, but if the idea is given some attention, someone might come up with a workable solution. Wouldn't surprise me if it was someone on this forum, lot of talented and creative people here.
|There are already methods of handling copyright infringement. You can file a DMCA request and the offending webpage will be removed from the serps. |
Yeah, but it's like playing whack-a-mole.
|The problem with creating a new tag is that many people won't adopt it. |
As long as the big 3 (Bing, G, and Yahoo) adopt it, then it's like the canonical tag. If people don't want to use it, it's their own benefit or loss.
|Even if everyone starts using this brand new tag, what is to stop the offending site from applying a copyright tag to their page? The search engine is going to see you both have copyright meta tags and we are back to where we are now. |
I was thinking that since you would be the originating article, once you've published it, anyone copying, scraping, or stealing would have a tag showing a publication date that's later than yours. The date acts as a flag showing it's an obvious copy, which would be easy for search engines to detect and classify as a copy.
It's quite easy to fake a date. Many webmasters are already manipulating the file creation & last modified dates for their webpages.
If you can think of a tag, there are people who are smart enough to abuse it.
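To make the point concrete, here is a rough Python sketch of how a last modified date can be rewritten, and how a naive "earliest date is the original" rule then picks the copy. The file names, dates, and the `earliest_wins` rule are all hypothetical; this is not a description of how any search engine decides.

```python
import os
import tempfile
from datetime import datetime, timezone


def backdate(path, when):
    """Set a file's access and modification times to `when`."""
    ts = when.timestamp()
    os.utime(path, (ts, ts))


def earliest_wins(claimed_dates):
    """Hypothetical naive rule: the page with the earliest date is 'original'."""
    return min(claimed_dates, key=lambda name: claimed_dates[name])


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        paths = {
            "original": os.path.join(tmp, "original.html"),
            "copy": os.path.join(tmp, "copy.html"),
        }
        for path in paths.values():
            with open(path, "w") as f:
                f.write("<p>same article</p>")

        # The original's real date (set explicitly so the demo is repeatable).
        backdate(paths["original"], datetime(2009, 8, 15, tzinfo=timezone.utc))
        # The scraper simply claims an earlier date for its copy.
        backdate(paths["copy"], datetime(2009, 1, 1, tzinfo=timezone.utc))

        claimed = {name: os.path.getmtime(p) for name, p in paths.items()}
        return earliest_wins(claimed)


print(demo())  # the backdated copy is picked as the "original"
```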
I didn't know you could change dates like that, interesting. Well, I'm getting the idea: it's just not really practical or doable. Really appreciate the feedback though, thanks.
How about a simple ping to specific servers to claim ownership, then? If you ping the servers first, it should be obvious that you are the owner. And content scrapers usually need a few minutes to post the content on their sites.
That sounds great for smart webmasters, but for less savvy webmasters it is likely a nightmare.
The scrapers would be smart enough to ping, and newbie webmasters wouldn't know to do this. So now the scrapers are listed as the official publisher simply because they knew to ping a server.
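A toy Python sketch of that failure mode, assuming a hypothetical first-ping-wins registry (the `ClaimRegistry` class and the site names are invented for illustration; no such service is described in this thread):

```python
import hashlib


class ClaimRegistry:
    """Hypothetical registry: the first site to ping for a text owns it."""

    def __init__(self):
        self._claims = {}  # content hash -> first site that pinged

    def ping(self, content, site):
        """Record a claim if none exists; return whoever holds the claim."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        return self._claims.setdefault(digest, site)


registry = ClaimRegistry()
article = "My carefully written article."

# The newbie author publishes but never pings. The scraper copies and pings.
print(registry.ping(article, "scraper.example"))

# When the real author finally pings, the claim is already taken.
print(registry.ping(article, "author.example"))
```

Both calls report the scraper's site, because the registry only knows who pinged first, not who wrote the text.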
This post from Incredibill on "crawl delayed publishing [webmasterworld.com]" might be the best that can be managed.
Just be sure you understand the implications - getting it wrong could be worse than the status quo.
I completely blocked crawling of a hobby site of mine several times while developing my white-listing code, and this seems similar.