But does Google have advanced algorithms that increase or decrease a page's value from a page rank point of view, based on the duplication of information found on the page?
The duplication factor could be the number of pages in their index that carry identical text across different sites. This would be the case for distinct web sites that use the same RSS feeds or "share" a third-party product database like Amazon's (which would contain millions of product descriptions, customer reviews and the like).
Note: I'm just using Amazon as an example; affiliate program data feeds, search result XML feeds, and much more could be equally feasible as possible sources of duplicate information.
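To make the "duplication factor" idea concrete, here is a minimal sketch of one common way to score text overlap between two pages: word shingles compared with Jaccard similarity. This is only an illustration and is not a description of what Google actually does; the function names, the shingle size `k=3`, and the sample texts are all assumptions made up for the example.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def duplication_score(page_a, page_b, k=3):
    """Jaccard similarity of the two pages' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(page_a, k), shingles(page_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical sample texts: one datafeed description reused by two sites,
# and one page written by hand about the same product.
feed_text = "This compact camera has a 10x zoom lens and a long battery life."
site_one = feed_text
site_two = "Our review: " + feed_text
handwritten = "We tested this camera on a hiking trip and liked the zoom."

print(duplication_score(site_one, site_two))     # high: mostly shared text
print(duplication_score(site_one, handwritten))  # low: no shared 3-word runs
```

Two sites republishing the same feed text score close to 1.0 even with a little wrapper text added, while the handwritten page scores near 0.0 against either of them.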
I'm wondering whether it is better to spend the time creating a site with a couple hundred handwritten pages, even if I have to hire a freelance writer or two to get it done, or to spend the time creating a database-driven site with a few RSS feeds instead.
The second approach would be easier and less time-intensive of course and would result in a lot more pages. This is why I am asking the question.
The goal is to get the highest page rank possible and the best positioning in Google's SERPs. What has been your experience with either approach? Have your competitors done worse or better than you in Google's SERPs by using a different approach than yours?
thx
There are some affiliates I know that are purposely avoiding datafeed merchants because of the duplication problem, not so much because of the possible penalty but because the SERPs can only tolerate so many clones. (We'll call it the TigerDirect effect, I guess.)
The SEs are going to be forced to take a long, hard look at RSS feeds. Very open to exploitation, IMO.
But from my point of view it's better to have your own content because you'll never get penalized for duplicate content. I don't see a problem with pages generated out of a public database, but from what I've heard there will be some filters in the future. I don't know how good they'll be, but I don't think they will release them before they're damn good, in order to prevent lots of big companies from complaining.
Regarding SPAM, I would definitely go for your own content because it will mostly satisfy the user: it is unique, shows a new point of view and is 100% relevant.
"The goal is to get the highest page rank possible and the best positioning in Google's SERPs."
Dynamic sites do not necessarily mean duplicate information. Many go this route if their site has more than a few pages because they're far easier and quicker to maintain and keep up-to-date. RSS is a great tool to get reciprocal links from others as well as advertise yourself since you offer real content instead of just another empty link.
"RSS is a great tool to get reciprocal links from others as well as advertise yourself since you offer real content instead of just another empty link."
Could you elaborate on both halves of that statement?
- How would you use RSS to get reciprocal links from others?
- How does RSS allow you to advertise yourself?
For the second point I assume you mean that when other sites use an RSS feed from your site, if you have one, either they provide attribution or your RSS feed provides it, creating a link back to your site.
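To illustrate the attribution idea, here is a rough sketch of a minimal RSS 2.0 feed in which the channel and every item carry a `<link>` pointing back to the originating site, so anyone who republishes the feed also republishes those links. The function name `build_feed`, the `example.com` URLs, and the entry fields are hypothetical, chosen only for the example.

```python
import xml.etree.ElementTree as ET

def build_feed(site_title, site_url, entries):
    """Build a minimal RSS 2.0 document as a string.

    Each entry is a dict with 'title', 'url' and 'summary' keys
    (an assumed shape, used only for this sketch).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = f"Updates from {site_title}"
    for entry in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        # The item link is what leads readers back to the source site.
        ET.SubElement(item, "link").text = entry["url"]
        ET.SubElement(item, "description").text = entry["summary"]
    return ET.tostring(rss, encoding="unicode")

feed = build_feed(
    "Example Reference Site",
    "https://example.com/",
    [{"title": "Sample article",
      "url": "https://example.com/articles/1",
      "summary": "A short summary of the article."}],
)
print(feed)
```

Whether a republishing site keeps those links visible and crawlable is up to that site, so this shows only what the feed itself can offer.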
But what I'm curious about is what kind of RSS feed would a site create, if that site isn't a news or current events type site but more of a reference/knowledgebase content site?
thx
I do know what regular expressions are, and how useful they can be for parsing text streams; I just don't know what you are implying about their usefulness in helping to generate content from the sources you mentioned. It sounded like you have some special application for them, but if all you were pointing out is their usefulness in parsing XML and other data streams, then I've got it now.
thx
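For readers following along, the parsing use discussed above might look something like this minimal sketch, which pulls item titles and links out of a feed with regular expressions. The pattern names, the `extract_items` function, and the sample feed are assumptions made up for the example. Regular expressions are fragile on real XML (CDATA sections, attributes, namespaces), so a proper XML parser is usually the safer choice outside of quick jobs like this.

```python
import re

# Non-greedy patterns; DOTALL lets '.' match newlines inside an element.
ITEM_RE = re.compile(r"<item>(.*?)</item>", re.DOTALL)
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)
LINK_RE = re.compile(r"<link>(.*?)</link>", re.DOTALL)

def extract_items(feed_text):
    """Return a list of (title, link) pairs, one per <item> in feed_text."""
    results = []
    for body in ITEM_RE.findall(feed_text):
        title = TITLE_RE.search(body)
        link = LINK_RE.search(body)
        if title and link:  # skip items missing either element
            results.append((title.group(1).strip(), link.group(1).strip()))
    return results

# Hypothetical sample feed for demonstration.
sample = """<rss version="2.0"><channel>
<title>Example Feed</title>
<item><title>First post</title><link>https://example.com/1</link></item>
<item><title>Second post</title><link>https://example.com/2</link></item>
</channel></rss>"""

for title, link in extract_items(sample):
    print(title, link)
```

The channel-level `<title>` is ignored because only text inside `<item>` elements is searched.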