
Sitemaps, Meta Data, and robots.txt Forum

    
Where is the original content tag?
A new html attribute value that identifies the original article
walrus

10+ Year Member



 
Msg#: 4152941 posted 2:30 pm on Jun 15, 2010 (gmt 0)

The quote below is from the Caffeine thread:
[webmasterworld.com...]

In 1 query I was just able to find a mashup site with over 1.9 million indexed pages, all 100% copied from everywhere else online including hotlinked images.

Many of us are frustrated by the fact that our articles are copied or stolen, and it is even worse when the copying site claims authorship of our articles.

Can someone explain why they can't, won't, or shouldn't create a new HTML attribute value to identify original material as opposed to copied material?

We have robots.txt to help guide the SEs, and sitemaps to help guide the SEs. The XML sitemaps even show publication dates, and the SEs will still give the copied site the top serp position because whoever has the most links still rules. The copiers have the keywords in the domain and they have a lot of links, so they must be impooooooortant.
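For readers unfamiliar with the sitemap dates mentioned above, here is a minimal sketch of an XML sitemap carrying a date per URL. It assumes the standard sitemaps.org `<lastmod>` element is what is meant by "pub dates" (strictly, it is a last-modified date), and the URL and the `build_sitemap` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap string from (url, iso_date) pairs.

    build_sitemap is a hypothetical helper, not part of any library.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # <lastmod> is the only date the basic protocol carries.
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Example URL and date are placeholders.
sitemap_xml = build_sitemap(
    [("https://example.com/articles/original-post", "2010-06-15")]
)
print(sitemap_xml)
```

The date is supplied by the site owner, so a search engine can treat it only as a hint.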

We use robots.txt and sitemaps because SEs/AI can't do what they are designed to do without some guidance. It seems they need even more, so why not a tag that identifies yours as the original article?

Eventually, low-quality scraper/copy/thief sites would get filtered out as having little or no original material, and the original would be presented in the serps.

 

goodroi

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4152941 posted 12:28 pm on Jun 17, 2010 (gmt 0)

There are already methods of handling copyright infringement. You can file a DMCA request and the offending webpage will be removed from the serps.

The problem with creating a new tag is that many people won't adopt it. Even if everyone starts using this brand new tag, what is to stop the offending site from applying a copyright tag to their page? The search engine is going to see you both have copyright meta tags and we are back to where we are now.

walrus

10+ Year Member



 
Msg#: 4152941 posted 2:40 pm on Jun 17, 2010 (gmt 0)



Thanks for the reply, goodroi. I'm sure there are things you haven't mentioned and I still haven't considered that might make the whole tag idea impossible, but if the idea is given some attention, someone might come up with a workable solution. It wouldn't surprise me if it was someone on this forum; there are a lot of talented and creative people here.

"There are already methods of handling copyright infringement. You can file a DMCA request and the offending webpage will be removed from the serps."

Yeah, but it's like playing whack-a-mole.

"The problem with creating a new tag is that many people won't adopt it."

As long as the big 3 (Bing, G, and Yahoo) do, then it's like the canonical tag. If people don't want to use it, it's their own benefit or loss.

"Even if everyone starts using this brand new tag, what is to stop the offending site from applying a copyright tag to their page? The search engine is going to see you both have copyright meta tags and we are back to where we are now."


I was thinking that since yours would be the originating article, once you've published it, anyone copying, scraping, or stealing it would have a tag showing a publication date that's later than yours. The date acts as a flag showing it's an obvious copy, which would be easy for search engines to detect and classify as a copy.
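The date-flag idea can be sketched roughly as follows. This is only an illustration of the proposal, not anything a search engine is known to do: the URLs, dates, and the `classify_by_claimed_date` function are hypothetical, and the dates are whatever each page claims about itself.

```python
from datetime import date


def classify_by_claimed_date(pages):
    """Pick the page with the earliest claimed date as the original.

    pages maps URL -> publication date claimed by that page's tag.
    Hypothetical helper; it trusts the claimed dates completely.
    """
    original = min(pages, key=pages.get)
    copies = sorted(url for url in pages if url != original)
    return original, copies


# Placeholder pages carrying identical article text.
pages = {
    "https://original.example/article": date(2010, 6, 15),
    "https://scraper.example/copied-article": date(2010, 6, 16),
}

original, copies = classify_by_claimed_date(pages)
print("original:", original)
print("copies:", copies)
```

The scheme holds only as long as every site reports its date honestly.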

goodroi

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4152941 posted 12:21 pm on Jun 18, 2010 (gmt 0)

It's quite easy to fake a date. Many webmasters are already manipulating the file creation and last-modified dates for their webpages.

If you can think of a tag, there are people who are smart enough to abuse it.
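To show how little effort a faked date takes, here is a small sketch that backdates a file's last-modified time with Python's standard `os.utime`. It covers only the modification time (creation time is handled differently per operating system), and the temporary file and the `backdate_file` helper are made up for the example.

```python
import os
import tempfile
from datetime import datetime, timezone


def backdate_file(path, when):
    """Set a file's access and modification times to `when`.

    backdate_file is a hypothetical helper wrapping os.utime.
    """
    ts = when.timestamp()
    os.utime(path, (ts, ts))  # (access time, modification time)


# Create a throwaway page standing in for a copied article.
with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as f:
    f.write(b"<p>copied article</p>")
    path = f.name

backdate_file(path, datetime(2009, 1, 1, tzinfo=timezone.utc))

# The file now reports a date well before it was actually written.
reported = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
print(reported.date())

os.remove(path)
```

A date written into a tag or a sitemap is even easier to change, since it is just text.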

walrus

10+ Year Member



 
Msg#: 4152941 posted 8:20 pm on Jun 18, 2010 (gmt 0)

I didn't know you could change dates like that; interesting. Well, I'm getting the idea: it's just not really practical or doable. I really appreciate the feedback though, thanks.

vandread

5+ Year Member



 
Msg#: 4152941 posted 10:53 pm on Oct 23, 2010 (gmt 0)

How about a simple ping to specific servers to claim ownership, then? If you ping the servers first, it should be obvious that you are the owner, and content scrapers usually need a few minutes to post the content on their sites.
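One way to picture the ping idea is a registry that records the first site to claim a given piece of text. The sketch below is an assumption-heavy illustration: `OwnershipRegistry`, its methods, the site names, and the exact-match SHA-256 fingerprint are all invented here, and no such public service is implied.

```python
import hashlib


class OwnershipRegistry:
    """Hypothetical first-ping-wins registry keyed by a content fingerprint."""

    def __init__(self):
        self._claims = {}  # fingerprint -> (site, ping_time)

    @staticmethod
    def fingerprint(text):
        # Collapse whitespace and case so trivial edits do not change the hash.
        normalized = " ".join(text.split()).lower()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ping(self, site, text, ping_time):
        """Record a claim if none exists; return the recorded owner."""
        key = self.fingerprint(text)
        self._claims.setdefault(key, (site, ping_time))
        return self._claims[key][0]


registry = OwnershipRegistry()
article = "Many of us are frustrated by copied articles."

# The publisher pings at time 0; the scraper pings 300 seconds later.
owner_after_publisher = registry.ping("original.example", article, 0)
owner_after_scraper = registry.ping("scraper.example", "  " + article.upper(), 300)

print(owner_after_publisher)
print(owner_after_scraper)
```

The registry rewards whoever pings first, not whoever wrote the text, and an exact-match hash would miss a copy that has been reworded.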

goodroi

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 4152941 posted 2:03 pm on Oct 28, 2010 (gmt 0)

That sounds great for smart webmasters, but for less savvy webmasters it is likely a nightmare.

The scrapers would be smart enough to ping, and newbie webmasters wouldn't know to do this. So now the scrapers are listed as the official publisher simply because they knew to ping a server.

Status_203

5+ Year Member



 
Msg#: 4152941 posted 8:48 am on Oct 29, 2010 (gmt 0)

This post from Incredibill on "crawl delayed publishing [webmasterworld.com]" might be the best that can be managed.

Just be sure you understand the implications - getting it wrong could be worse than the status quo.

I completely blocked crawling of a hobby site of mine several times while developing my white-listing code, and this seems similar.

