Forum Moderators: open
Then, could anyone here tell me why, when I search for "<some MASSIVE keyword here>" in Google, I get lots of "powered by example.com" and "powered by example2" and "powered by example3" sites that use exactly the same database and still occupy about half of the first 30 positions for that query? Is it OK for a user to get the same content for his SUPER-POPULAR search term on so many sites? The sites are biggies... they all have PR8, PR9, PR7... but all give absolutely the same, duplicate content. Even the design is similar. I wouldn't mind this if they had at least something unique to offer, maybe an article or something, but they are the same? It would be OK to see one site per network among the first results, but they occupy the top 30 when they hardly deserve the top 3. Or maybe Google treats corporate sites differently?
Hey GoogleGuy... it looks like a whole bunch of fancy-looking corporate-level SPAM to me. And Google feeds users dupe content for one of the web's most popular keywords.
[edited by: heini at 11:15 am (utc) on April 26, 2003]
[edit reason] Removed specifics per TOS [/edit]
I searched for the phrase "powered by" and got a whole bunch of PR9/PR10 sites, even some with no instance of that phrase on those pages. I doubt that the "powered by" buttons are causing that PR.
Can you please use the politically correct "red widget" terminology and clarify?
What I mean by "powered by XYZ" is that there is a company that has a database of products, and other companies, being its affiliates/partners, feature this database on their sites as well and get commissions for sales. They also add new "products" to the database from their own sites, but the whole DB is essentially the same across all partners' sites. All they have to do is add a few design templates to make it look a bit different. Anyway, when a visitor searches, he gets 100 results, 90 of which are affiliate sites of 2-3 big networks.
This has nothing to do with the "powered by" search phrase. I mention "powered by" because it is an easy pointer to the obvious fact that the sites are parts of networks. This is a nicely hidden way of delivering completely duplicate content. And since a lot of really big, huge companies keep doing this, it might be one of the most massive spam operations on the Internet. Please correct me if I am wrong.
Plain duplicate content is recognized as spam, right?
Not necessarily. The same Associated Press story might turn up in the CHICAGO TRIBUNE, the SEATTLE TIMES, and the DES MOINES REGISTER, but that wouldn't make it spam. Similarly, half a dozen mail-order sites might have the same photo and catalog description of a Canon digital camera, but that wouldn't be spamming.
This is why duplicate content isn't penalized per se. When Google sees duplicate content, it simply ignores what it perceives to be the mirrored pages (usually the newer pages, I believe).
Of course, there are different degrees of "duplicate content," and the same content on pages with different navigation schemes or other text changes isn't likely to be recognized as duplicate content. (That's probably one reason why the same AP story in the CHICAGO TRIBUNE, the SEATTLE TIMES, and the DES MOINES REGISTER wouldn't be picked up as spam.)
I suspect that when "mirror sites" get into trouble with Google, it's probably because of other factors such as multiple domains with artificial crosslinking patterns--not because of duplicate content per se.
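For the curious, the general mechanism behind near-duplicate detection is often described in the information-retrieval literature as "shingling": break each page into overlapping word n-grams and measure how much the two sets overlap. This is only an illustrative sketch (Google's actual method is not public, and the sample texts below are made up):

```python
# Illustrative sketch only: Google's real duplicate detection is not public.
# "Shingling" from the IR literature: two pages are near-duplicates when their
# sets of overlapping word n-grams have a high Jaccard similarity.

def shingles(text, n=4):
    """Return the set of n-word shingles from a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets (0.0 .. 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(page_a, page_b, threshold=0.8):
    """True if two pages look like near-duplicates of each other."""
    return jaccard(shingles(page_a), shingles(page_b)) >= threshold

# The same wire story pasted into two sites: the navigation text differs,
# but the shared body text dominates, so the similarity stays high.
story = ("canon has announced a new digital camera aimed at serious amateur "
         "photographers the six megapixel model ships with an updated lens and "
         "faster autofocus analysts expect strong holiday sales as prices "
         "continue to fall")
site1 = "chicago tribune news " + story
site2 = "seattle times weather " + story
print(near_duplicates(site1, site2))  # True: the shared story text dominates
```

In this toy version, different navigation only adds a few shingles per site, which is why the same story on two papers' sites still scores as a near-duplicate; real systems use hashing tricks to do this comparison at web scale.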
A purely affiliate site that doesn't add value is also spam, right?
Not necessarily. It may clutter up search results, but it isn't spam unless the site owner, Webmaster, or SEO has used questionable techniques to make it rank higher in search engines.
This doesn't mean that Google couldn't choose to give a lower relevancy score to a page that appeared to consist of boilerplate content with an affiliate link. That may not be happening now, but it could happen in the future. But it wouldn't be a spam-fighting measure per se; it would simply be an attempt to deliver higher-quality search results to the user.
The only difference from doorway pages I see here is that different companies own their own doorways, and they don't just link to the main site... they host a copy of it themselves. I think this can be recognized as spamming search engines, because each of them informs robots (via titles, descriptions, and whatever else) that it offers Xwidgets while another company offers Ywidgets, etc., whereas both actually offer the same Z-netWIDGETS.
I would like to repeat that we are talking about a VERY popular keyword, and the Google guys surely know that Google shows 20 sites from 2-3 networks with the same content among the first 30 results, including spots #2, #3, and #4. I have a feeling that if Google got a spam report about a smaller network doing the same or a similar thing, they'd ban it manually. But these guys are too big.
Imagine you sell pencils... you create 10,000 sites with slightly different designs and add a PHP script that loads results from a central server's DB via, say, socket connections. Each site calls itself something different: Bimbo pencils, Super pencils, Acme pencils. Then you optimize the pages, they get 20 out of the first 30 positions, and what's that? Spam or not?
If the duplicate sites are franchises, then it is not spam. If they all belong to one entity that wants to dominate the SERPs, then it is.
- Ash
As I'm an affiliate myself, and so could be called a spammer, I live in that same world.
We have plenty of sites with similar databases of products. Whoever gets first position wins and earns money for his technique... Of course this is no good for Google, as we offer nearly identical content and don't provide any new, interesting information to our customers.
But it is not that simple... we also run web sites which are affiliates of several different companies (about 3-4): we combine the databases, pick the best items by price, and offer the combination, which is better than any single program. These, I consider, valuable web sites for the customers, and not spam at all.
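The combining step described above can be sketched in a few lines: merge several affiliate feeds and keep the cheapest offer per product. This is only an illustration, and every product name, price, and network name here is invented:

```python
# Illustrative sketch of the approach described above: merge several affiliate
# product feeds and keep the lowest-priced offer for each product.
# All feed contents, field names, and network names are invented.

def best_offers(*feeds):
    """Merge product feeds, keeping the lowest-priced offer per product name."""
    best = {}
    for feed in feeds:
        for offer in feed:
            name = offer["name"]
            if name not in best or offer["price"] < best[name]["price"]:
                best[name] = offer
    return best

feed_a = [{"name": "red widget", "price": 9.99, "network": "AcmeNet"},
          {"name": "blue widget", "price": 4.50, "network": "AcmeNet"}]
feed_b = [{"name": "red widget", "price": 8.75, "network": "WidgetCo"},
          {"name": "green widget", "price": 3.25, "network": "WidgetCo"}]

merged = best_offers(feed_a, feed_b)
print(merged["red widget"]["network"], merged["red widget"]["price"])
# WidgetCo 8.75
```

The point of the sketch is that the merged page is not identical to any one feed: it is a selection across feeds, which is the "added value" the poster is arguing for.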
So an algorithm that would just solve this problem is very complex and, I think, impossible to implement.
Banning them one by one is also impossible, as it would take a GREAT amount of human effort (in fact, that's what DMOZ does, on just the few web sites it has). So that is not a solution either.
And as a result, the best solution is to leave it as is. Webmasters will be more interested in linking to the second kind of site I mentioned, and those sites will occupy better positions. That is the solution.
We want to start a web service where other sites can use our information (a link directory: names and descriptions).
The layout will be supplied by the site that uses our content, but does this hurt our own directory/pages? Or will the surroundings, the template the content fits into, be enough to make the difference and give it a unique page (although all the outbound links will be the same...)?