Glad to have the extra copy, I added the articles to my site, only to find out now that Google is penalizing me with a PageRank of 0.
How would you react? What's your take on re-used copy that you are not allowed to change?
Thanks, Jens
If you think the duplicate content filter is at work, you can check whether the page has been flagged by searching for a specific phrase from the article, with quotes around it. Choose a phrase that is likely to appear in this article only, and use most of the ten words Google will search for at once.
In the initial results, is your site's article there? If yes, it isn't the duplicate content filter working against you.
If not, click the link at the bottom of the results that says:
In order to show you the most relevant results, we have omitted some entries very similar to the ## already displayed.
If you like, you can repeat the search with the omitted results included.
If your site then shows up when you click omitted results, that would be the duplicate content filter kicking in. If your site still does not show, it likely hasn't been indexed fully yet, and it isn't the duplicate content causing your problems at this point.
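The check above can be sketched in a few lines of Python. This is only an illustration: the function names `phrase_query_url` and `diagnose` are made up for this example, the search URL format is an assumption, and you still have to look at the results by hand to see where your page appears.

```python
from urllib.parse import urlencode

MAX_QUERY_WORDS = 10  # the per-query word limit mentioned above


def phrase_query_url(passage, start=0, words=8):
    """Build an exact-phrase search URL from a run of words in the passage."""
    tokens = passage.split()
    count = min(words, MAX_QUERY_WORDS)  # never exceed the word limit
    phrase = " ".join(tokens[start:start + count])
    # Wrapping the phrase in double quotes asks for an exact-phrase match.
    # The URL format here is an assumption for illustration.
    return "https://www.google.com/search?" + urlencode({"q": f'"{phrase}"'})


def diagnose(in_initial_results, in_omitted_results):
    """Map what you saw by hand onto the three outcomes described above."""
    if in_initial_results:
        return "not filtered"
    if in_omitted_results:
        return "duplicate content filter"
    return "probably not fully indexed yet"
```

Pick `start` so the phrase comes from the middle of the article, where it is less likely to overlap with a title or byline that other sites word differently.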
Have your new pages been crawled?
It's clear that you can't change anything in the copy, but in most cases you wrap the article in your templated navigation. Adding some comments before the actual article and a summary could help keep the page from being treated as duplicate content.
The site was built in July. Content grew over a period of two months, and I haven't made any additions since early September. The site was indexed (with 10% of pages available back then) at the end of July.
All pages show up in the index now, and all have PR0.
Many thanks again, Jens
I didn't change any content, but I made changes in other ways, including introductory comments as mentioned by Dirkz. I also converted most formatting to CSS to reduce clutter in the source code. It speeds up the pages, and it's one more way to increase the distinctiveness of my pages.
[edited by: buckworks at 8:07 pm (utc) on Oct. 5, 2003]
There are 2 advantages to doing it this way:
1 - You avoid any chance of getting a duplicate content penalty.
2 - Your affiliate commissions will likely be higher than your competitors because you have unique content on your page, not the same boring stuff that your visitors have already seen on 50 other sites.
It's time-consuming, for one thing, and plagiarism questions arise when you try to rewrite someone else's content. Also, the credentials of the original author are part of what makes the material valuable to have on your site.
It's probably more efficient to make reasonable changes to differentiate your pages from possible duplicates, then work on building PR so that even if there are partial duplicates floating around in cyberspace, your pages will be weighted better than theirs.
You're right. I was referring to the cookie-cutter pages used to sell ebooks, software, things like that. These things are ubiquitous on the web and pretty much ignored.
In reference to plagiarism, I believe that completely rewriting a page and not using any passages from the original page would help prevent being accused of plagiarism more so than changing bits and pieces here and there.
If no part of the content is copied (or simply altered and used), it can't be plagiarism. Knowledge and facts can't be copyrighted, only specific written text. For example, let's say I read a book about a topic, say on using a software package. I then write my own book on that topic. Assuming that I write the book in my own words and don't use altered versions of passages from the other book, this isn't plagiarism. Am I right or am I completely confused on this issue?
"Assuming that I write the book in my own words and don't use altered versions of passages from the other book, this isn't plagiarism."
Maybe the best of both worlds would be to get copyright permission to use content from other sources on your site, and also do some original writing of your own as time and knowledge permitted.
Side note: When I'm working on an article, I try to leave some downtime between the research phase and the writing phase, to reduce the chance of echoing someone else's words too closely because they're too fresh in my mind. Giving credit where it's due can become difficult, because over a lifetime of reading we accumulate ideas and information but often don't remember where things came from.
[edited by: buckworks at 9:21 pm (utc) on Oct. 5, 2003]
That's not exactly true. It is much more complicated than that. Here is a decent source showing examples.
[indiana.edu...]
Unless you mean that the entire page (the duplicate "page") is exactly the same, HTML source and all, as some other site's page (maybe judged by file size, modification date, etc.?). Then Google might penalize you, but I'm not even sure about this.
"G usually doesn't mind dup content, as long as the pages don't "taste" the same."
The taste thing might be hard to prove, too. Even with different headers, footers, etc., if the main article provides the "taste" of the page, then this can't be true.
Otherwise, goodbye Yahoo and any news or press site that publishes news feeds and press releases word for word.
I've also read on this forum that the big G "takes the first version of the content it finds and does not list the other sites".
Again, if this were the case, why should the first site to publish (or the first to be crawled) a news article or press release be the one that Google indexes?
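Nobody in this thread knows how Google actually decides that two pages "taste" the same, so take the following only as a rough way to put a number on it yourself. It is a minimal sketch using word shingles and Jaccard overlap, a common near-duplicate measure that is not confirmed to be what Google does. The names `shingles` and `resemblance` and the shingle size of 4 are assumptions made for this example.

```python
import re


def shingles(text, size=4):
    """Return the set of overlapping word runs ("shingles") in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat it as a single short shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def resemblance(a, b, size=4):
    """Jaccard overlap of the two shingle sets: 0.0 (disjoint) to 1.0 (same)."""
    sa, sb = shingles(a, size), shingles(b, size)
    union = sa | sb
    if not union:
        return 0.0  # both texts empty, nothing to compare
    return len(sa & sb) / len(union)
```

Feed it the visible text of your page and of another site's copy of the same article. If the score barely moves after you add an introduction or summary, the article itself still dominates the page, which is the point made above about headers and footers.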
I will try to break up the articles into pages and sections, play with h1-h3 headers in between, and insert quoted passages within the article as well. An introduction might help too.
I noticed that GBot only shows up occasionally and does not go deep; it just scratches the surface of the site. I need to do some checking here as well.
Thanks all, nice week, Jens