Forum Moderators: open

what kind of duplicacy is spam?

"edgy" about 'assumed' duplicacy


arun_g

10:21 am on May 28, 2003 (gmt 0)

10+ Year Member



Hello friends,
This is my first post - I hope the question isn't too amateurish.
I have become quite edgy recently in view of Google's aggressive approach to spam.
I read somewhere that duplicacy of content is 'spam'. But in my opinion duplicacy reproduced with permission such as interesting articles shouldn't be.
Also, suppose I am a distributor of companies A, B, and C's products. I may be lifting descriptions of the products directly from their websites for the promotion of their very own products. Is this not legitimate as per Google?
Any help would be appreciated.

edit_g

10:35 am on May 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is this not legitimate as per Google?

Google shouldn't have a problem with you using your suppliers' descriptions of products - lots of people are doing this and I've not seen anyone run into trouble. Even if Google did object, its ability to detect duplicate content is questionable. Just make sure that you keep the HTML structure more or less unique, use CSS, etc., and present it in your own look and feel.

arun_g

10:51 am on May 28, 2003 (gmt 0)

10+ Year Member



I'm not sure that "it can't be detected, so do it" is the right approach. Moreover, I don't see why such detection should be too difficult (especially in the not-too-distant future).
What about genuine articles reproduced with permission for visitors? Suppose they were identical, with no differentiation - would that be considered spam?

killroy

11:16 am on May 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Why would you think that Google wants to punish legitimate sites? If your content makes sense, great. Google doesn't want to shape the net, simply to index it. The Google-influenced shaping comes from overzealous SEO.

If the page works, it works, full stop.

SN

arun_g

11:44 am on May 28, 2003 (gmt 0)

10+ Year Member



I wish it were that simple. What works today might not work tomorrow (I can 'feel' all those guys with invisible text nodding in agreement).
As we all know, Google works on the principle of attempting to systemize, or "algorize", everything. And we also know that Google considers duplicacy spam. So how does it (or will it) separate 'genuine' duplicacy from the spammy variety, assuming of course it acknowledges genuine duplicacy?

Mike12345

11:51 am on May 28, 2003 (gmt 0)

10+ Year Member



their ability to detect duplicate content is questionable.
I totally agree with this comment.

There is plenty of duplicate content out there, "legitimate" or not, that has not been removed from the index. That says to me that their detection methods are questionable.

Imagine how difficult it is to differentiate between a page that has the same HTML structure (nav, general layout, etc.) as another but with different content (the actual words), and a site that is "very" similar but slightly different in both senses.

Remember, they try to do everything automatically. Writing this kind of routine and getting it to differentiate the two effectively and fairly is very difficult, IMHO.

:)
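The distinction Mike12345 draws can be made concrete. Here is a minimal sketch (all page content is invented for illustration): compare two pages on their tag sequence and on their visible text separately. Pages that merely share a template agree on the former but not the latter, while true duplicates agree on both.

```python
import re

def tag_sequence(html):
    """Return the list of HTML tags in document order."""
    return re.findall(r"<[^>]+>", html)

def visible_text(html):
    """Strip tags and return the remaining words."""
    return re.sub(r"<[^>]+>", " ", html).split()

def overlap(a, b):
    """Jaccard overlap of two collections, as sets (0.0 to 1.0)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

# Two pages built from the same template but with unrelated content.
page1 = "<html><body><h1>Red widgets</h1><p>Our red widgets are durable.</p></body></html>"
page2 = "<html><body><h1>Blue gadgets</h1><p>Shiny blue gadgets ship fast.</p></body></html>"

print(overlap(tag_sequence(page1), tag_sequence(page2)))  # 1.0: identical template
print(overlap(visible_text(page1), visible_text(page2)))  # 0.0: disjoint content
```

Judging structure and content on separate axes is one plausible way an engine could avoid penalizing sites that merely share a layout.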

arun_g

12:07 pm on May 28, 2003 (gmt 0)

10+ Year Member



I think we're missing the point.
The question is not "Is it possible for them to do so?" but rather "Will they or will they not remove pages that they detect as duplicates if the duplicacy is done for ethical reasons, such as article reproductions, reseller arrangements, and so on?"

madweb

12:16 pm on May 28, 2003 (gmt 0)

10+ Year Member



I don't think Google is overly bothered about the ethics of your duplicate content; it only cares that the same thing not be indexed twice, which is inefficient and spammy.

The answer is to have unique content or some kind of unique way of presenting the duplicate content (e.g. google directory in pagerank order vs open directory).

Mike12345

12:16 pm on May 28, 2003 (gmt 0)

10+ Year Member



It depends on the algorithm that is used to identify duplicacy. As we don't know what that is, or whether it exists in the way one might imagine, it's difficult to say whether they will or won't remove a page.

So do what edit_g said, and hopefully it won't be removed. But like I said, there is plenty of content out there that is duplicated in some way, small or large. So vary your structure and you should be OK.

killroy

12:28 pm on May 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, let me try to rephrase. Keep in mind this is about the user experience, NOT the webmaster experience.

If you have one of 10 copies of an article on your site, and it relates to the search term, do you want yours and the other 9 copies to show up in the top 10? How does that benefit the user?

Google takes the PERFECTLY LOGICAL and INTUITIVE approach of eliminating 9 of the 10 copies and keeping the highest-valued one (PR or whatever). Now the article is at, say, #1 and the remaining 9 positions are other well-ranking related pages.

Now, if you're one of the nine not displayed, you might complain, but the visitor hasn't lost anything, and your copy of the page doesn't add anything. It's not so much a penalty as simply well-filtered results. Google doesn't bear a grudge against you; it's just that your page adds NO value, since the material is already available elsewhere.
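The filtering killroy describes can be sketched roughly as follows. This is only an illustration of the idea, not Google's actual implementation; the URLs, content signatures, and rank values are invented:

```python
def dedupe_results(results):
    """Keep one result per content signature: the highest-ranked copy.

    results: list of (url, content_signature, rank) tuples.
    """
    best = {}
    for url, signature, rank in results:
        if signature not in best or rank > best[signature][2]:
            best[signature] = (url, signature, rank)
    # Return the survivors, best-ranked first.
    return sorted(best.values(), key=lambda r: -r[2])

results = [
    ("a.example/article", "article-123", 0.9),
    ("b.example/article", "article-123", 0.4),  # duplicate content, lower rank
    ("c.example/other",   "other-456",   0.7),  # distinct content, survives
]
for url, _, rank in dedupe_results(results):
    print(url, rank)
```

Under this scheme the lower-ranked duplicate is simply filtered from the result list, not penalized elsewhere - which matches the "well-filtered results, not a grudge" framing above.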

This is fine for information, as you provide it free and are happy people get it from you or otherwise.

The moral is that if you duplicate content with commercial intent you cannot expect profit, as you have lost your UNIQUE SELLING POINT, as any businessman will tell you.

If you use a copycat description but add your own individual twist, such as "free shipping", "free goody pencil case", "price = $x" and so on, then your page will still not show up for the generic description text, BUT it will for searches including your USPs, such as "widgets free shipping" or "widgets for $x".

So keep in mind who Google serves. If you do good by the searcher, you do good by yourself. If you stamp your foot and shout "give me all your money already", well, that has never convinced anybody... ever.

Hope we will eventually be able to put the debate over admissible duplication to rest.

I myself had 60,000 pages PR-0'ed because they were available through a separate domain. Perfectly fair, as it eliminates duplicate results. On the other hand, I have a different version of the content fully indexed in Google, as it makes the same data available in a different, more compact format. Useful for the searcher, ergo available in the SERPs.

Good Luck!

SN

naicul

12:30 pm on May 28, 2003 (gmt 0)

10+ Year Member



Finding duplicate content is not all that hard, but it is very computationally intensive - this is where the bottleneck in applying a filter is.
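One common textbook approach to near-duplicate detection is shingling: break each page into overlapping word windows and compare the resulting sets with Jaccard similarity. A minimal sketch (the pages here are invented, and real engines compare billions of pages, which is where the computational cost naicul mentions comes from):

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

page_a = "acme widgets are the finest widgets money can buy today"
page_b = "acme widgets are the finest widgets money can buy anywhere"
page_c = "our shop sells handmade pottery from local artisans only"

print(jaccard(shingles(page_a), shingles(page_b)))  # high: near-duplicates
print(jaccard(shingles(page_a), shingles(page_c)))  # 0.0: unrelated pages
```

In practice the shingle sets are hashed and sampled to cut the cost of the pairwise comparisons, but even then the work grows quickly with index size - hence the bottleneck.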

mil2k

3:44 pm on May 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well said Killroy.

arun_g, why don't you read this excellent thread? It will help your understanding of duplicates and the challenges search engines face [webmasterworld.com].