|Duplicate content for religious texts|
I still can't really understand what Google would consider duplicate text when it comes to religious or philosophical texts.
My blog is about religions, mostly Buddhism but not only. So I have to wonder whether Buddha's speeches, works by Buddhist philosophers written in the 5th century, or prayer texts from the 6th or 14th century would be considered duplicate text, just because somebody else has published them on the internet before me, or might have published them without my knowing about it, etc....
- Clearly it is not and cannot be my own text, because these are classical texts. And it is actually not easy to find them on the internet, if they are there at all....
And I often see some web page that has a text from a book (and I have the same book as well). So what, if I have an educational site, should I start spinning classical/religious texts now? I can't write an original Plato text just because somebody has published it somewhere on the internet before me, and if I wanted to find it there, it might take hours....etc....
And what impact might it have on AdSense? Some people say AdSense might be blocked for duplicate text.
But again, I publish those texts because they are valuable and educational for readers, not because I am stealing somebody's text.... They are nobody's, and no copyright or author's rights even apply to them...
Google wants to have good value in their search results. If someone searches for religious text, it is not very helpful to show 10 pages with identical copies of the text.
If you publish religious or historical content then I would be extra careful to make sure you are adding value since that text is likely already available on 100 other sites. For example consider adding some interpretation or commentary to help readers better connect to the writing. Otherwise do not be surprised if the Google automated duplicate content filter kicks you off the search result page.
No, it's not on many sites.... Then it wouldn't make sense for me to research and find them....
There might be 1-2-3 or none....
For Buddhism/Hinduism especially, it's not like one unified Bible. There are hundreds of separate texts, like the complete works of Lenin in a hundred volumes, some of which might be published somewhere and some not....
+ of course, my site has commentaries, etc. as original content; I don't even need it for search results, I'm getting enough search results on my original content;
but I wish I could also publish those without being penalized in some way, because they are valuable for my readers.
You're not being penalized, you're being filtered.
You have to do something or provide something that is clearly different than those other sites. Maybe you are, but it needs to be obvious to your users and to Google.
If I were you I would forget about AdSense and not worry about Google yet, but work on building your community first - you need users that love your site and come back to it over and over because they find something there they can't find anywhere else.
It means you may have to go GET your users, instead of waiting till they happen to find you in Google.
And make sure the percentage of content that is found on other sites is relatively low compared to the amount of content that is unique. If 10 or 20% of the content is on other sites, you're probably okay. But if 90% of what you have can be found elsewhere, then you probably won't do well with Google.
That's just the way it works.
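To make that ratio concrete, here is a minimal sketch of how a site owner might estimate it by word count. The function names, the page dictionary layout and the idea of counting words are all assumptions for illustration. The 20% and 90% cutoffs are only the rule of thumb from the post above, not anything Google has published.

```python
def duplicate_share(pages):
    """Fraction of all words on the site that sit on pages marked as duplicated.

    `pages` is a hypothetical list of dicts: {"words": int, "duplicate": bool}.
    """
    total = sum(p["words"] for p in pages)
    if total == 0:
        return 0.0
    duplicated = sum(p["words"] for p in pages if p["duplicate"])
    return duplicated / total


def rough_verdict(share, ok_up_to=0.20, poor_from=0.90):
    """Apply the post's rule of thumb; the cutoffs are not official values."""
    if share <= ok_up_to:
        return "probably okay"
    if share >= poor_from:
        return "probably poor"
    return "unclear"


# Example site: 800 words of original commentary, 200 words of a classical text.
pages = [
    {"words": 800, "duplicate": False},
    {"words": 200, "duplicate": True},
]
share = duplicate_share(pages)
print(f"{share:.0%} duplicated: {rough_verdict(share)}")
```

Anything between the two cutoffs is reported as "unclear", since the post gives no guidance for that range.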
Yes, I have visitors, that's not a problem, but Google has to understand - religious people are searching for a particular info, not my original thoughts;
[edited by: goodroi at 8:17 pm (utc) on Jul 19, 2013]
[edit reason] Please no specific keywords/search terms [/edit]
No, Google doesn't have to understand. YOU have to understand. If you want to show up above the other sites, you have to earn it.
I have original articles which are on the first page, many articles, besides this G+, FB, feeds, etc....
I just need clarity on this point: what happens if I add valuable duplicate content.
Are the variant texts within the body of your pages? Is it possible to stash them in no-indexed areas, so the only parts visible to search engines are your original content?
You would probably want to let one version of each text be indexed, because part of your audience is people searching for text. If they don't find it at all, they won't know that your site has it.
"rel='canonical'" is definitely not the way to go in this case. Where's that ROFL emoticon when we need it?
Valuable Duplicate = Rare manuscripts, which are rare, but may happen in other places;
There are some things that Google's algorithm simply isn't equipped to handle. I've got one ebook where each title gives the readings of three different MSS. "Why does the site say everything three times? Do they think we can't figure out that 'asks' and 'axeth' are the same word, or that eth and thorn are used interchangeably? Spam, spam, spam..."
|Valuable Duplicate = Rare manuscripts, which are rare, but may happen in other places |
To oversimplify... in duplicate situations, the site with the most PageRank wins. "PageRank" these days includes other PageRank-like "link juice" factors. Such factors (among others) are a measure of online popularity or reputation.
So... if you want to rank for the text of the documents, then you essentially need more relevant, high quality inbound links than competing sites offering the same text. In the past, I've seen Google return multiple copies of documents or articles if there's sufficient linking to the sites that publish them. In your situation, I can imagine that unique and useful commentary might be one way of attracting the kinds of links you need.
|many articles, besides this G+, FB, feeds, etc.... |
Currently, this type of social promotion by itself is not sufficient to produce Google rankings. If it drives traffic to your site, then, over time, that traffic might generate links. You might say that inbound links of sufficient quality are in part what suggests to Google that your material is "authoritative".
It's likely that, say, the Morgan Library... if it offered this material online... would outrank you for the same queries because it has a big head start.
Keep in mind, though, that if the "rare manuscripts" are rare and not copyrighted, the text content is likely to become less rare when published online. That's generally why your site will need to offer... in addition to the manuscript content... unique self-generated material, to establish a genuine reputation online.
I think Lucy24 had a good point about noindexing, and here's an additional thought. Build a page that discusses a particular text, and design it to rank for the text's name or whatever searchers are likely to use. On this page you could talk about the historical context, any interpretive issues, etc. Then it could link to a no-indexed page containing the actual text. That way, you have a page that can rank for the phrases you're targeting AND lead visitors to the text, but you're not relying on "dupe" religious texts themselves to rank.
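As a rough sketch of that setup: the commentary page stays indexable and links to a text page that carries a robots meta tag with "noindex". The helper name, page titles and the /texts/example-sutra.html path are made up for illustration. Only the meta tag itself is standard, and it only works if crawlers are allowed to fetch the page, so the text page should not also be blocked in robots.txt.

```python
from html import escape


def render_page(title, body_html, index=True):
    """Build a bare-bones HTML page; index=False adds a noindex robots meta tag."""
    robots = "index, follow" if index else "noindex, follow"
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"<title>{escape(title)}</title>\n"
        f'<meta name="robots" content="{robots}">\n'
        "</head>\n<body>\n"
        f"{body_html}\n"
        "</body>\n</html>\n"
    )


# Indexable page with original commentary, linking to the full text.
commentary = render_page(
    "Notes on an example sutra",
    "<p>Historical context and interpretive issues go here.</p>\n"
    '<p><a href="/texts/example-sutra.html">Read the full text</a></p>',
    index=True,
)

# Page holding the public domain text itself, kept out of the index.
full_text = render_page(
    "Example sutra (full text)",
    "<p>Text of the sutra goes here.</p>",
    index=False,
)

print(full_text)
```

Using "noindex, follow" rather than "noindex, nofollow" is a design choice in this sketch: it keeps the text out of the results while still letting crawlers follow any links on that page.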
I hope you won't mind if I take a stab at re-phrasing the original poster's question:
Will your rankings be hurt for your ORIGINAL content if you also have large amounts on your site of public domain content that appears elsewhere on the internet?
What I believe the original poster is trying to do is rank for his / her COMMENTARY on the texts.
I believe that the poster is worried that also having the original text - which does appear on other sites that (might) have more authority - would trip a Panda filter, for example.
If one were to post commentary on these texts, it would be quite helpful for a USER to have the texts at hand.
But would Google then punish that site with Panda because that same text also appears elsewhere?
To think of this another way...
Maybe you have a review site that reviews widgets. You have lots of detailed reviews about different widgets out there.
Would it be a BAD idea to also include the manufacturer's description of the widget on that review page since the manufacturer's description already appears elsewhere (such as on the manufacturer's product page and on the pages of all their affiliates)?
Would it be possible that:
1) Your site will be demoted because you have content found on other sites?
2) Your pages will be demoted because the RATIO of unique content to public domain content has been lowered by the inclusion of content that is available on other sites?
I think these are the key points the original poster was trying to ask. I apologize if I am assuming wrong here.
Planet13 - I tend to agree with your interpretation, and certainly some of the original poster's comments are in line with that... eg here...
|I don't even need it for search results, I'm getting enough search results on my original content; but I wish I could also publish those without being penalized in some way, because they are valuable for my readers. |
This is the part that throws me, as there are several ways of interpreting it...
|...religious people are searching for a particular info, not my original thoughts |
I'm guessing that there may turn out to be some ambiguity in what searchers will be looking for.
Off the top of my head, I think I'd put the rare historical text into a subdomain, and then organize these by religion and appropriate subcategories (eg, historical and/or other appropriate categories). I'd add it gradually, and initially I would not noindex... I'd let Google index the content and see how it flies.
Your possibilities (1) and (2) make sense, but the dupe filtering kicks in only when Google sees other instances of the same text elsewhere with higher linking reputation.
One thought to start these pages off would be to cautiously crosslink texts from some of the commentaries on the main domain. I'd also make efforts to attract backlinks to the text. The publicity for these, though, might also attract competitors who might scrape them.
Hard to say how Google would regard text that is indexed in Google Books.
|You're not being penalized, you're being filtered. |
I love that, Netmeg. It's like....
You're not being imprisoned, you're being incarcerated.
You're not getting divorced, you're getting a permanent separation.
You're not dead, you're just powered down.
Or, as my wife often says to me:
"You're not dumb; you're just BEING dumb!"
But in all seriousness, I think that netmeg is right about the difference between filtering and penalizing here.
@ Robert Charlton:
|...religious people are searching for a particular info, not my original thoughts |
Agreed. That could be interpreted as the original poster possibly trying to rank for public domain text... thanks for pointing that out.
|I love that, Netmeg. It's like.... |
If I hide someone's posts on my FB timeline because there's too many of them or they're too vituperative or I just plain don't wanna see em, then I am filtering him.
If I unfriend him entirely, then he's penalized.
|That could be interpreted as the original poster possibly trying to rank for public domain text |
... and now we're edging into an area I actually know something about, because the kind of e-books I work with are all in the public domain.
Think about why humans choose one source over another when the words themselves are the same.
#1 they did an exact-text search, or title-of-work search, and this happens to be the site they landed on
or, #1b, one of the top ten results is, let's say, jatakagateway dot com while the other nine are everything-but-the-kitchen-sink scrapers recognizable as such even to the casual human
or, #1c, one of the top ten is your area's equivalent to amazon or wikipedia so the visitor goes there by force of habit
#2 the texts are presented in an outstandingly usable and accessible form
You can't do much about #1. Work on #2.
|you need users that love your site and come back to it over and over because they find something there they can't find anywhere else. |
I've had that for years, still do, yet it does not seem to help. As long as the content farms with sweetheart deals continue to scrape our content and offer bits and pieces, we'll be at the back of the line.
I wasn't trying to imply that that would necessarily directly help you in Google, but that you would need Google less if people came back to you on their own.
Thanks, all, for the comments. Sorry that I disappeared; I had probably lost hope a day earlier.
Anyway, yes, my basic question was about this:
for example, you might be interested in Buddhism but have no idea what the Tathagatagarbha concept is.... So I can write (original) articles to explain to you what Tathagatagarbha is, and I can mention that, generally speaking, it has been taught in this and that sutra....
So should I tell people to go and search the net to find this sutra, since maybe it is out there somewhere... Or can I find it and publish it right on the next page....
Can you illustrate the text with your own copyrighted images, watermarked with your site name? I think Lucy's suggestion of improving presentation and usability is the way to go, and one of the best ways to do that is to break the text up with attractive pictures so it's easy to read.
That way, your pages will have unique content, even where you don't add your own commentary, and people will want to link to it.