
Google SEO News and Discussion Forum

Adam Lasnik on Duplicate Content
tedster

WebmasterWorld Senior Member tedster us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3192967 posted 6:06 am on Dec 19, 2006 (gmt 0)

Google's Adam Lasnik has made a clarifying post about duplicate content on the official Google Webmaster blog [googlewebmastercentral.blogspot.com].

He zeroes in on a few specific areas that may be very helpful for those who suspect they have muddied the waters a bit for Google. Two of them caught my eye as being more clearly expressed than I'd ever seen in a Google communication before: boilerplate repetition, and stubs.

Minimize boilerplate repetition:
For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details.

If you think about this a bit, you may find that it applies to other areas of your site well beyond copyright notices. How about legal disclaimers, taglines, standard size/color/etc information about many products, and so on. I can see how "boilerplate repetition" might easily soften the kind of sharp, distinct relevance signals that you might prefer to show about different URLs.
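For illustration only (not something Google prescribes), here is a minimal Python sketch of that idea: render a one-line footer that links out to the full legal text instead of repeating the whole notice on every page. The helper name and the /legal.html URL are made up for the example.

# Minimal sketch: emit a one-line footer that links to the full legal text
# instead of repeating the whole notice on every page. The helper name and
# the /legal.html URL are hypothetical.

FULL_NOTICE_URL = "/legal.html"

def page_footer(site_name, year):
    """Return a short boilerplate footer that links to the full legal page."""
    return ('<p class="footer">&copy; %d %s. '
            '<a href="%s">Copyright and legal details</a></p>'
            % (year, site_name, FULL_NOTICE_URL))

if __name__ == "__main__":
    print(page_footer("Example Widgets", 2006))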

Avoid publishing stubs:
Users don't like seeing "empty" pages, so avoid placeholders where possible. This means not publishing (or at least blocking) pages with zero reviews, no real estate listings, etc., so users (and bots) aren't subjected to a zillion instances of "Below you'll find a superb list of all the great rental opportunities in [insert cityname]..." with no actual listings.

This is the bane of the large dynamic site, especially one that has frequent updates. I know that as a user, I hate it when I click through to find one of these stub pages. Some cases might take a bit more work than others to fix, but a fix usually can be scripted. The extra work will not only help you show good things to Google, it will also make the web a better place altogether.

[edited by: tedster at 9:12 am (utc) on Dec. 19, 2006]
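Since tedster notes that a fix can usually be scripted, here is one minimal Python sketch of what such a script might do: if a location has no listings, either skip publishing the page or mark it noindex. The function names and markup below are hypothetical, not a recipe from Google.

# Sketch of a scripted stub check: if a location has no listings, either
# don't publish the page at all or mark it "noindex, follow" so bots skip it.
# render_city_page() and the markup are hypothetical stand-ins.

def robots_meta(listing_count):
    """Return a robots meta tag that matches the page's real content."""
    if listing_count == 0:
        # Stub page: keep it out of the index (or don't generate it at all).
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'

def render_city_page(city, listings):
    if not listings:
        # Alternative to noindex: refuse to publish a placeholder at all.
        return ""
    items = "".join("<li>%s</li>" % item for item in listings)
    return ("<html><head>%s<title>Rentals in %s</title></head>"
            "<body><ul>%s</ul></body></html>"
            % (robots_meta(len(listings)), city, items))

if __name__ == "__main__":
    print(render_city_page("Walla Walla", []))             # stub: nothing published
    print(render_city_page("Walla Walla", ["2BR house"]))  # real content, indexable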

 

Jane_Doe

WebmasterWorld Senior Member jane_doe us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3192967 posted 2:58 am on Dec 21, 2006 (gmt 0)

Adam - Thank you for taking the time to post here. The negative comments regarding your suggestions are not shared by everyone here. Personally, I appreciate any tidbits you and the other Googlers care to pass on to us web publishers.

And now today I see a PR6 page of mine drop 450 spots in the results because some thief put its content in hidden text on a PR0 page.

Steveb - Association does not prove cause and effect. Perhaps your site has a penalty and the other site outranking it is an effect, not a cause, of the penalty on your site.

[edited by: Jane_Doe at 3:03 am (utc) on Dec. 21, 2006]

Oliver Henniges

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3192967 posted 8:11 am on Dec 21, 2006 (gmt 0)

> Image labeler isn't involved in indexing web pages so I don't know why that was mentioned

I mentioned this because the image labeler suggests that Google is working intensively on means to identify the content of images. Supposedly it will also help to improve the semantic correlation of synonyms, with statistical evidence supporting hand-made database entries.

It is quite likely that this work on the image labeler will one day help to improve the rating and ranking of sites with many images (except modern art galleries, maybe), so that Google will not have to rely so much on pure text.

steveb

WebmasterWorld Senior Member steveb us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3192967 posted 10:19 am on Dec 21, 2006 (gmt 0)

"Perhaps your site has a penalty and the other site outranking it is an effect, not a cause, of the penalty on your site."

Obviously my page (not "site") was penalized when the other page appeared.

The other site outranking it for the unique text search isn't relevant since who cares about that. Only two sites have those words and no one in the world will ever search for that text. Losing the 450 spots for the big money term when the dupe appears does matter.

g1smd

WebmasterWorld Senior Member g1smd us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3192967 posted 4:20 pm on Dec 21, 2006 (gmt 0)

>> In my industry a "hey I have this one also in red" does not work you have to show the customer the exact products they want to buy <<

By all means have that detailed information about each product on your site for visitors to see, but don't expect Google to index and rank every permutation that you have. Instead, herd the bot to index a subset of your pages and provide an easy way for visitors to then see the full range available.
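One possible way to implement the "herd the bot" idea, sketched in Python under assumed URL conventions (the query parameters and helper below are illustrative, not g1smd's prescription): keep the main product URL indexable and mark permutation URLs as "noindex, follow", so visitors can still browse every variant while the bot concentrates on one version.

# Sketch: keep the canonical product URL indexable and mark permutation URLs
# (color, size, sort order) as "noindex, follow". The query-parameter
# convention here is an assumption for illustration only.

from urllib.parse import urlparse, parse_qs

PERMUTATION_PARAMS = {"color", "size", "sort"}

def robots_directive(url):
    """Return the robots meta content to emit for a given URL."""
    params = set(parse_qs(urlparse(url).query).keys())
    if params & PERMUTATION_PARAMS:
        return "noindex, follow"   # visitors can browse; the bot skips the duplicate
    return "index, follow"         # canonical product page

print(robots_directive("http://www.example.com/widgets/blue-widget"))
print(robots_directive("http://www.example.com/widgets/blue-widget?color=red"))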

jwc2349

5+ Year Member



 
Msg#: 3192967 posted 4:25 pm on Dec 21, 2006 (gmt 0)

Adam:

Thanks for the enlightenment about duplicate content. Could you add these questions to your To Do List?

1) Does excessive duplicate content on a site (i.e., above a certain threshold percentage) lead directly to a penalty or filter in the SERPs?
2) Once the duplicate content is deleted or falls below the aforementioned threshold, will the penalty be removed automatically or, alternatively, will it quit triggering the filter?
3) If the answer to #2 is Yes, then the recovery is algorithmic. Accordingly, there would be no need to file a reinclusion request. Is this thinking correct, or is it best to play it safe and file the reinclusion request to expedite the penalty's removal?

It is not my intent to trick you into disclosing anything regarding the algorithm. That is private and should always be so. So if you can't answer these questions directly, I understand. I just want to know that I am on the right track toward getting my year-plus-old penalty removed.

Thanks for your time and assistance. It is greatly appreciated.

Whitey

WebmasterWorld Senior Member whitey us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 3192967 posted 7:45 pm on Dec 21, 2006 (gmt 0)

Instead, herd the bot to index a subset of your pages

g1smd - Can you elaborate a little on this phrase, as I don't fully understand it?

Also, does anyone have any creative ways to deal with stubs? I'm thinking the page could show snippets of "most similar products", something like: "There are no oranges today, but we found some alternatives here: [snippet one] [snippet two], or check back later."

2) Once the duplicate content is deleted or falls below the aforementioned threshold, will the penalty be removed automatically or, alternatively, will it quit triggering the filter?

How long will it take to release the pages and/or the site-wide filter? Indeed is it site wide?

glengara

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3192967 posted 10:46 pm on Dec 21, 2006 (gmt 0)

Typical WW thread, the guy gives some helpful information for newbie webmasters, and we go mad reading stuff into it that just isn't there.

Y'All have a good one...

Whitey

WebmasterWorld Senior Member whitey us a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 3192967 posted 11:25 pm on Dec 21, 2006 (gmt 0)

First came the word

Then came the interpretations

Then came havoc :)

Pirates



 
Msg#: 3192967 posted 2:43 am on Dec 22, 2006 (gmt 0)

Adam, thanks for coming to the forum and posting comments. It's far better than hiding away in a Google-moderated forum, so well done mate, you may get some true feedback. It may offend and upset you at times (that will probably be me), but at least it will be the real web world you are talking to, not a cotton-bud, make-believe, somewhere-over-the-rainbow Google forum. Welcome to the Real World, Adam.

By the way, I think that your results suck. Evaluating a website based on URL and keyword proximity to the root in the URL is an invitation to spammers using dynamic content. It's also a penalty against static sites using URLs in a directory structure, and like most things in Google these days it's not even your idea. You have all the old-school hijacks back. More spam than I have ever seen on Google.

Anyway, Happy Christmas Adam -- Pirates

[edited by: Pirates at 3:15 am (utc) on Dec. 22, 2006]

hvacdirect

5+ Year Member



 
Msg#: 3192967 posted 5:29 am on Dec 22, 2006 (gmt 0)

I don't think they moderate the group that much. A lot of bashing goes on there, much more than in here, and the only time I've seen posts removed or threads stopped is when it got into personal attacks. Plus, Googlers comment on real live sites and URLs that the rest of us can see, rather than talking in generalizations and worthless examples of mysite.com etc., which really helps others chime in and help.

night707

5+ Year Member



 
Msg#: 3192967 posted 10:48 am on Dec 22, 2006 (gmt 0)

Why doesn't Google simply release some precise FAQ to explain what they define as a dupe content problem?

We are running a big media site with 3 different language editions, serving a variety of very different audiences through several theme-related channels.

Some URLs have a few items in common, such as a script-generated date or niche-related links to videos.

Objectively speaking, our content is among the very richest available for each targeted keyword, and for long periods these URLs have been on Google page 1.

Since the Jagger update, Google traffic is either steady or 100% gone.

Google support has not been willing to explain why and what for; they just wrote back that changes will always happen.

In fact, some of the 12,000 unique pages sometimes inspire Google to cut off all other URLs and channels, which are as different as fish and football.

The only common thing on all URLs is the date script, a 5-word copyright notice, and 4 links to our free email, About page, and other services that might be interesting for all visitors.

asher02

5+ Year Member



 
Msg#: 3192967 posted 11:47 am on Dec 22, 2006 (gmt 0)

g1smd wrote "By all means have that detailed information about each product on your site for visitors to see, but don't expect Google to index and rank every permutation that you have. Instead, herd the bot to index a subset of your pages and provide an easy way for visitors to then see the full range available."

Hi g1smd, the right practice when you have your customers' interests in mind is to make it easier for them to find the product they are looking for. So building a landing page with just the product details (to make Google happy), making the customer search through the page for that particular product, and then making them click on it looks to me like a very bad practice. The ultimate way is to be able to make Google index the particular product page, because all the info is there.

Your suggestion might work for Google, but for me it is no different from a doorway page, because if Google was not in the picture you would not build a page like that!

If your suggestion is just to make Google index any page and then make it easy for the customer to find related products, well, we do that. The problem is that long-tail search brings lots of customers and sales, and the amount of data you can put on one page is limited if you want to look professional.

When I design a website I first build it for my visitors and then try to make Google happy too. I will never put Google before my customers, as there are many other ways to bring traffic.

P.S. I can send you the URL of the website in question so you can see the amount of effort we invest in providing our customers with real, unique info on each product; maybe after seeing these pages you will ask yourself too why they are not indexed by Google.

Pirates



 
Msg#: 3192967 posted 2:24 am on Dec 24, 2006 (gmt 0)

Reread this, and no, I don't think a doorway page is suggested at all by g1smd, but instead an index page showing related pages. It seems like clean and good SEO advice to me.

[edited by: Pirates at 3:04 am (utc) on Dec. 24, 2006]

smells so good

5+ Year Member



 
Msg#: 3192967 posted 8:58 am on Dec 24, 2006 (gmt 0)

Disclaimer of sorts: Sometimes I feel intimidated talking in a thread with some serious Google-heads, but trust me, I do pay attention :-)

if Google was not in the picture you would not build a page like that!

Indeed, but I would, and perhaps many others would too. If my visitor lands on my product page he will see a link to the subset of all products - that product index. That visitor might decide the blue looks better after all, even though he came in on the red, and I want that blue one to be easy to find. Perhaps where I part ways is that I prefer not to have that index page included in the SERPs, making an attempt to get the individual product pages listed instead, and I've had some luck with that. Original content, good photos, valid markup and a drop of my lucky snake-oil are what get those pages where I want them to be.

ask yourself too why they are not indexed by Google

Shakespeare could not have written a better line. I have no doubt yours are quality pages, but still, there is no good answer, because results are algorithmically processed and what you get is predictable at best. In spite of every effort to do everything as correctly as possible, a person still won't get results. I know that's not an answer, but that's all I've got today... an imperfect world. When I look at a page I do everything I know to make it right, and that's all I'm responsible for.

I'll step down now. Peace.

soapystar

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3192967 posted 11:55 am on Dec 26, 2006 (gmt 0)

I think this has more to do with the sensitivity of the algo toward links and text at the bottom of a page, because of how they have come to be used by certain types of websites. Side navs and headers are less sensitive.

Personally I feel links are actually well placed at the bottom of a page, since it's not until the user hits the bottom that they are certainly looking for new pages to browse. By this time most side navs and headers are out of view.

europeforvisitors



 
Msg#: 3192967 posted 6:46 pm on Dec 26, 2006 (gmt 0)

Why doesn't Google simply release some precise FAQ to explain what they define as a dupe content problem?

Probably because:

1) It isn't as simple as a+b=c.

2) Even if it were as simple as a+b=c, the definition of "a" and "b" could change at any moment in time.

3) Google would prefer that we create content for users, not for search engines.

macman23

5+ Year Member



 
Msg#: 3192967 posted 8:41 pm on Dec 27, 2006 (gmt 0)

I also am dealing with what seems to be a duplicate content penalty/filter that has dropped my ranking substantially since June.

My question is: Does the penalty/filter affect the ranking of only those pages which are deemed to have duplicate content or does it affect the ranking of all pages within the site?

If the answer is the latter, which appears to be the case, it seems a bit unfair, especially to publishers who offer useful content and have no idea why their ranking dropped.

SullySEO

5+ Year Member



 
Msg#: 3192967 posted 9:19 pm on Dec 27, 2006 (gmt 0)

"Could be bad news for sites that use manufacturer/product type drop-down options, these lists can turn up in the text only cache giving a large chunk of "boilerplate" content."

Yup, we have a mfg. dropdown for our users because this is the only way some of them know how to find what they're looking for. I'm not so sure Google appreciates this.

I'm not convinced boilerplate is a bad thing for my users. I think it makes navigation easier for them. I also don't believe it's bad for users to see copyright info on every page even if lengthy. I imagine sites do it for visitors, not search engines. If it's not on every page, there's probably a link to the info page on every page. Why make users click thru to it unnecessarily?

Oliver Henniges

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3192967 posted 10:06 pm on Dec 27, 2006 (gmt 0)

> Personally i feel links are actually well placed at the bottom of a page.

Depends on the extent. WebmasterWorld's links, e.g. "post reply" and such, are indeed quite useful, because of course that is a very likely option to pursue after having read the last post.

On the other hand, I'm quite sure that such boilerplate link blocks at the bottom of pages are one of the major targets this new "filter" is aiming at. They are an ideal place for stuffing link anchor text, which - if I remember correctly - has been an important SEO technique for a while.

Weren't there some people here who have the capacity to "test" such things? Any new insights in the meantime?

moftary

10+ Year Member



 
Msg#: 3192967 posted 12:50 am on Dec 28, 2006 (gmt 0)

A personal problem, but kind of relevant to the issue.

My site's database was down for a few hours. And because I am a very lucky person (!), Googlebot reindexed the whole site during the database outage, hence indexed tens of thousands of "can't connect to blah blah" pages and voila! A duplicate content penalty!

The major problem is that once your site is penalized for a duplicate content issue, Googlebot stays away from the site for a long time, and even after this long time its activity is dramatically decreased. Therefore - and yes, it's a rare case - recovery from the penalty takes a looooong time.

In the old days (well, not very old, pre-Bigdaddy I think) there was one Googlebot that kept checking penalized and banned sites for a webmaster spam fix. I miss that bot!
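A common defensive fix for the outage scenario moftary describes, offered as a sketch rather than anything Google mandates: while the database is down, answer with HTTP 503 plus a Retry-After header instead of a "200 OK" error page, so crawlers treat the outage as temporary instead of indexing thousands of error stubs. The db_available() check and the WSGI wiring below are hypothetical.

# Sketch: during a database outage, return 503 + Retry-After rather than a
# normal page full of error text, so crawlers don't index the error stubs.
# db_available() is a hypothetical stand-in for a real connectivity check.

def db_available():
    return False  # pretend the database is down for the demonstration

def app(environ, start_response):
    if not db_available():
        start_response("503 Service Unavailable",
                       [("Content-Type", "text/plain"),
                        ("Retry-After", "3600")])  # ask crawlers to come back later
        return [b"Temporarily unavailable, please try again later."]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html><body>Normal page content</body></html>"]

if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    make_server("127.0.0.1", 8000, app).serve_forever()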

stumped

5+ Year Member



 
Msg#: 3192967 posted 11:29 am on Dec 28, 2006 (gmt 0)

What is cross-linking?

We have several domains that grew organically with the business, and we didn't want to eliminate older sites because much of our demographic is elderly and doesn't appreciate having to relearn a site.

However, we do have some duplicate content, which we are cleaning up. So we decided certain content will go on primary shopping cart site A, certain other content on auxiliary site B, and a third type of content on auxiliary site C. But we want the user to be able to access that content from any of the sites if that is the info they are looking for. So the plan is: if we move content from A to B, a link will remain on A redirecting to B, and vice versa with A, B, and C.

Is that what is meant by cross-linking?

What is the alternative to this, only having substantive content on primary site?
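For what it's worth, the usual mechanism for "a link will remain on A redirecting to B" after content moves is a permanent (301) redirect, so only the new copy stays indexed rather than two near-duplicates. A minimal Python sketch, with a hypothetical URL mapping:

# Sketch: answer a moved URL on site A with a permanent 301 redirect to the
# new location on site B, so only one copy stays indexed. The mapping and
# hostnames below are hypothetical.

from http.server import BaseHTTPRequestHandler, HTTPServer

MOVED = {
    "/widgets/history.html": "http://www.site-b.example/widgets/history.html",
}

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = MOVED.get(self.path)
        if target:
            self.send_response(301)             # permanent move, not a duplicate copy
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_error(404)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), RedirectHandler).serve_forever()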

AndyA

5+ Year Member



 
Msg#: 3192967 posted 1:00 pm on Dec 28, 2006 (gmt 0)

Personally I feel links are actually well placed at the bottom of a page.

I agree. If you want to keep a visitor on your site, you'd better have somewhere else for them to go when they get to the bottom of the page. If you only have side and/or top navigation, they will tire of scrolling back up to see what their options are.

I think related, relevant links at the bottom of the page make sense, and it is good for your site visitors. Having 30 or 40 is likely too many, but 10 or so that are related make sense to me. I have links to 3 or 4 related pages and my recent updates page as well as my main contents page at the bottom of most pages, and visitors seem to appreciate them. I have no idea how Google feels about them, but we are supposed to be building sites with our visitors in mind, right?

SullySEO

5+ Year Member



 
Msg#: 3192967 posted 9:00 pm on Dec 28, 2006 (gmt 0)

but we are supposed to be building sites with our visitors in mind, right?

Um, yeah. As brilliant as I believe Google employees to be, I get the feeling many, or most, are inexperienced with business in general, and particularly with marketing and retail. I think they need to gain that understanding to eliminate confusion about what constitutes appropriate pages for both commercial and non-commercial sites.

What is duplicate content?
Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Most of the time when we see this, it's unintentional or at least not malicious in origin: forums that generate both regular and stripped-down mobile-targeted pages, store items shown (and -- worse yet -- linked) via multiple distinct URLs, and so on. In some cases, content is duplicated across domains in an attempt to manipulate search engine rankings or garner more traffic via popular or long-tail queries.

If you know that most of the time it's unintentional or not malicious, it seems a little bass-ackward to do something that affects an innocent majority. The "appreciably similar" often cannot be avoided on niche sites. The suggestion seems to be that we change this for search engines and not our users.

I've seen it suggested many times to add more unique content to product pages if we want those pages to appear in the main index. This is not required for our users in many cases, believe it or not, and to do so would be for search engines not our users. I'm in favor of offering as much content as possible for my users if it makes sense to do so. It has been asked here, "Should Google index every product page when it can show the category page instead?" Well, yes...if a user queries a specific product, showing the user that exact page would be best for the user. Eliminating that page from the serps because it may have repetitive information intended for users, or only a two-line description because that's all that is required, is wrong.

As far as the Product Search Results showing at Google are concerned, those are minimal at best and eBay usually dominates. I'm all for sending users to Froogle/Base for products, and Google could do more to promote this service to its users. Further, I don't believe the majority of Google users want or need to see eBay listings dominating both the product results and the organic listings. If they did, we and many others wouldn't have a thriving online business outside of eBay.

Why does Google care about duplicate content?
Our users typically want to see a diverse cross-section of unique content when they do searches. In contrast, they're understandably annoyed when they see substantially the same content within a set of search results.

I agree but filtering out pages because of boilerplate repetition doesn't seem to be the best for your users either. Everyone in the retail business knows that consistency is less confusing and best for users. Requiring users to click thru to more pages is really asking them to hit the back button instead.

I am overall very happy with Google but I think it may be losing sight of what constitutes building pages for users, or just not understanding that it means different things for commercial and non-commercial sites. It may be focusing too much on punishing the minority spammer group no matter the cost.

whitenight

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3192967 posted 3:43 am on Jan 4, 2007 (gmt 0)

On that note... ack! I gotta eventually reset this nightowl schedule! I'll stop back here again tomorrow, er, later today :-)

I don't mean to "pick on Adam"...but tomorrow is now next year.

Are we going to get a commentary on this?

Adam_Lasnik

5+ Year Member



 
Msg#: 3192967 posted 2:36 am on Jan 5, 2007 (gmt 0)

Okay, Whitenight, that's a fair call-out. I promised I'd post again in this thread, and I have to admit to getting sucked into holiday / end-of-quarter craziness.

So, with that said, here's a mega catch-up post! Not sure how much more info / clarification I can offer on this topic now beyond this post (not due to super-secret-rules, but because so much of what people want certainty and hard answers on... well, there isn't just one "Duplicate Filter Algorithm" that's uber controlling).

* * *

When I highlighted "boilerplate" stuff in the blog post, I was referring to huge swaths of text repeated on every page, such as obnoxiously long legal footers (yes, fellow JDs and lawyers, I'm picking on you ;)): "Warning... Pregnant women, the elderly, and children should avoid prolonged exposure to Happy Fun Ball. Caution: Happy Fun Ball may suddenly accelerate to dangerous speeds. [...]"

Even that stuff is very, very rarely an issue when there's actually content on the page. Use common sense on this, okay? Apply the smell test :)

Products that differ just by color: probably not a problem, but we may not show the color you most prefer if a person just searches for "widgets type foo."

Internal / cross-domain linking... don't go hog wild. There's no magic threshold. However, penalties for country-domain cross-linking aren't something I have seen.

Leosghost, yes, geotargeting can be frustrating both for Webmasters and users. We're still working on improving the balance... but on the whole, we hear again and again that -- in the aggregate -- people in France (for instance) would rather see a France-centric version of the Web. Not an exclusively French Web, but France-centric. However, we also understand that there are cases in which this sort of emphasis isn't optimal for either the Webmaster or the user, and we're determined to do what we can to finetune how geotargeting works.

index.en.html and index.de.html

To my knowledge, this isn't meaningful to how we view Webpages, sorry. Nor is de.example.com. example.de matters. example.com with German-language text makes a difference (for language targeting).

E.g. German is spoken in Germany (.de), Austria (.at), Switzerland (.ch), Belgium (.be).

Yes, we know this! Remember, we have Google offices all over the world and Googlers who are very proud of their languages and countries (and related Web pages) :). On the whole, I think we're doing a good job returning results with appropriate regional and international content, but again, I also know we can do better.

With new domain I need to wait a year to get rid of sandbox effect.

No, it's not a universal truth that all domains take a year (or [insert time period]) to get indexed. As Matt and I have both noted, there are many variables at play and while some sites will indeed take longer to be more comprehensively indexed, many will not.

Re: menus, particularly lengthy ones. And nav stuff overall.
Again, not likely to be a problem unless the content on the pages is minimal or extremely similar overall.

Does Google just ignore stubs, or is there a penalty? A wiki without stubs is no wiki. :\ Ignoring is no problem, but a penalty, I would think, would be counterproductive.

A wiki is not likely, I think, to have 42,000 pages all with language like: "Looking for real estate in Walla Walla Washington [or Wiki Wiki Washington or Wissie Wiggie Washington or 4817482371 other combinations]? We have just the real estate listings below you're looking for!" and no real content below. That is a problem. Wiki-like stubs will probably smell much less annoying and so are very unlikely to result in any unhappy ranking adjustments.

RonnieG, you bring up an interesting point about the use of iframed IDX databases. But if everyone's pulling from the same database, that in itself sounds a bit like duplicate content in a broad sense. If that's the only "real" content on a page, why would surfers want to visit that site over the bazillion others that are pulling from the same database, or at least why would we want to include it in our search results with a ton of other sites that offer exactly the same database? That's an honest, not a rhetorical question, by the way. I am open to hearing more about why this content (and other iframed or otherwise included content that is syndicated, essentially) should be valued as "unique and compelling" content for users.

We see what I think is the same issue with affiliates sometimes, many of whom are upset that they no longer garner the same traffic from Google that they used to. Our algorithms take a look at their pages and (computerwise) ask, "What value is this site providing that users can't get from other sites or even the 'mothership' (the originator of the content)?"

So, taking a brief detour... I've witnessed a sea change in the way many Webmasters treat affiliate programs. Back in my younger days, when t-rexes still roamed the earth, most affiliates seemed to be starting with good content and then adding affiliate links to make a spot o' cash to pay their hosting bills. The affiliate links were an afterthought. Nowadays, though, not only are many Webmasters *starting* with affiliate "content" as the foundation and then adding other "content," they're incensed when their rankings fall in Google. I'll be blunt and say that I have little, if any, sympathy in these situations and I'm guessing that my colleagues who write algorithms feel similarly.

* * *

re: menus in javascript vs. CSS / "regular" text
'tis up to each Webmaster. If you put your nav in javascript, be aware that that's not optimal for us to crawl, nor is it going to be usable by many of your visitors (using PDAs, phones, other stuff that often doesn't have jscript-supporting browsers by default).

why not build into your webmaster toolkit something like a "Duplicate Content" threshold meter.

Neat idea :). The fact that duplicate content isn't very cut and dry for us either (e.g., it's not "if more than [x]% of words on page A match page B...") makes this a complicated prospect. But I'll make sure the Webmaster Tools folks are aware of this suggestion. I know we all want to help honest Webmasters as much as we can, and part of that is heightening the transparency of how we view sites.
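For webmasters who want a rough self-audit of how much two of their own pages overlap, a simple word-shingle comparison is one common back-of-the-envelope measure. The Python sketch below shows that generic technique only; it is emphatically not Google's algorithm, and no particular percentage should be read as a threshold.

# Rough self-audit sketch: estimate overlap between two pages using word
# shingles and Jaccard similarity. A generic technique for your own
# measurements - NOT how Google evaluates duplicate content.

import re

def shingles(text, n=5):
    """Return the set of lowercase n-word shingles in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b, n=5):
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

page_a = "Below you'll find a superb list of all the great rental opportunities in Walla Walla."
page_b = "Below you'll find a superb list of all the great rental opportunities in Widgetopia."
print("Estimated overlap: %.0f%%" % (100 * jaccard(page_a, page_b)))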

The idea of duplicate content on the same website causing a penalty to rankings doesn't sound fair to implement until the algo can automatically work out how the data on the site is categorised.

As I noted in the original post, penalties in the context of duplicate content are rare. Ignoring duplicate content or just picking a canonical version is MUCH more typical.

re: wanting specific percentages (re: duplicate content, boilerplate stuff, etc.)
There aren't any. Again, too many variables.

I am not going to remove menus, footers, instructions, phone numbers, necessary definitions of terms or any other content that is duplicated on a large percentage of my pages.

Not a prob, unless it comprises the majority of content of your site. Yes, photo galleries muddy the water a bit here and, indeed, it's going to be tougher for those to rank well in regular (not image) search. No magic bullets here, unfortunately.

similar or identical content on .com and .mobi

Unlikely to be a problem

Sometimes a few sentences are enough as pictures speak for themselves.

True. And steveb, I understand where you're coming from about photopages. Bottom line: with less text to grab on to and interpret, having an intuitive and comprehensive (but not obnoxious) internal sitemap/nav AND getting quality inbound links both become substantially more important.

Association does not prove cause and effect.

Amen, Jane_Doe, amen! On a related note, I've often remarked "correlation does not equal causation" :)

Re: duplication within a site
Again, this very, very rarely triggers a penalty. I can only recall seeing penalties when a site is perceived to be particularly "empty" + redundant; e.g., a reasonable person looking at it would cry "krikey! it's all basically the same junk on every page!"

I don't think they moderate the [Google Webmaster Help] group that much

True, especially when many of us are off on holidays.
And overall, we really want to keep the discussion as free flowing as possible. I've only placed two members on moderation (one temporarily) due to persistent personal attacks and haven't banned anyone. The group will evolve and moderating style may change (and may vary by moderator), but on the whole, we trust that folks will be respectful.

In the old days (well, not very old, pre-Bigdaddy I think) there was one Googlebot that kept checking penalized and banned sites for a webmaster spam fix. I miss that bot!

Moftary, if you think your site's been penalized, do a thorough check/sweep to make sure your site is now squeaky clean, and then file a reinclusion request.

* * *

Whew! I hope this puts some fears to rest and makes some of my blog comments more clear.

All the best of the New Year to y'all, and take care...

RichTC

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3192967 posted 3:04 am on Jan 5, 2007 (gmt 0)

Adam,

Thank you for explaining the position on duplicate content in further detail. Our own site has a current Google penalty, and the only thing left we are trying is excluding some of our pages in the robots.txt file, so that all remaining pages have nothing similar anywhere on them, in case some sort of duplicate content issue is killing us.

Is it possible, if you find the time, that you could answer my post to Vanessa in the groups, titled "Still Have Site Penalty - Help Required - FAO Vanessa"? I reposted it on the 3rd of Jan at 4.05pm, and like our site, the post is at the back of the list with no change! Any advice you can provide would be appreciated.

We are thinking now that when our site was hijacked, that in itself caused loads of duplicate content, and even with the hijacking site now removed following a DMCA complaint, the penalty still remains.

I read what you posted here with interest, but some of us webmasters do still worry about being hit, especially when it's for something not intentional or not previously considered.

Kind Regards

Rich

Leosghost

WebmasterWorld Senior Member leosghost us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3192967 posted 3:12 am on Jan 5, 2007 (gmt 0)

Adam .. (I'm in France) if I go to google.fr in English .. the results offered are from the entire world .. no buttons other than the usual ones ..
If I switch languages to "en français" .. you then offer results via radio buttons from the "web" .. or "web francophone" .. and "en français" ..
but your "web" results are still biased toward sites hosted in France ..

That is offering an apparent choice .. and then ignoring the choice that is made ..

but on the whole, we hear again and again that -- in the aggregate -- people in France (for instance) would rather see a France-centric version of the Web. Not an exclusively French Web, but France-centric

You should try speaking to French people who live elsewhere than Paris or the Côte d'Azur .. the country is not all navel-gazing and Paris-centric .. in spite of where you recruit your French team from ..

Your (and the other search engines') geotargeting is like keeping us at the bottom of a cultural well .. and telling us that the little circle of sky is all there is .. make "web" mean web .. sometimes from here it feels like your Chinese service ..

BTW .. you missed answering my point about your new product search and eBay listings leaving the blue and becoming results one and two of the organic SERPs ..

In the light of your silence on that .. it is interesting that G's position on affiliates is finally out in the open .. the hint about algo targeting of them fits with the experience of many that were QS'd ..

That hit many more affs than it did MFAs .. but then MFAs do earn you googles of revenue, don't they? .. :)

Guess the lion is now much more hungry for the ad money, tired of wading through the affs to get to the meat ..

texasville

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3192967 posted 3:29 am on Jan 5, 2007 (gmt 0)

Adam...I think that can be called comprehensive and appreciated. Thanks.

mattg3

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3192967 posted 5:28 am on Jan 5, 2007 (gmt 0)

Ahh, let's scapegoat the French for the abysmal geotargeting. ;)

The Googleaise

Allons enfants de la WebmasterWorld
Le jour de gloire est arrivé!
Contre nous de la tyrannie

Aux armes, webmasters!

Anyway thanks for the info.

Adam_Lasnik

5+ Year Member



 
Msg#: 3192967 posted 5:56 am on Jan 5, 2007 (gmt 0)

Remind me to never randomly use France/French as an example again :P

From now on, I'm using Kazakhstan.

No, wait, that's not gonna work either. This is WebmasterWorld.

Examplestan. Examplia. Hmm. Examplium.
No, no, no.

Widgetopia. Widgetland?

Ah, I digress. Time to head home sans laptop!

whitenight

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 3192967 posted 5:58 am on Jan 5, 2007 (gmt 0)

Thank you Adam.

Nice reference to happy fun ball btw. You're dating yourself...and me.
