
Google SEO News and Discussion Forum

    
Forum software duplicate content issues and Panda
realmaverick

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4461020 posted 10:10 pm on Jun 3, 2012 (gmt 0)

Good evening guys.

One of my websites uses Invision Power Board as its forum software.

There is a fundamental flaw that was introduced in V3. Each page of a thread is seen as a completely new thread, not a page of a thread. Some threads can have 20+ pages. GWT reports each page as having a duplicate title and meta description.

Furthermore, searching Google for an exact match of a topic title often returns pages 1 and 2 of the thread, but Google sees them as entirely different threads. I know this because, where the result mentions posts and participants, it only takes them into account for the page in question, not the entire thread as it did previously and as all other forum software does.

So for example page 1 might say 20 posts by 7 participants and page 2: 6 posts by 2 participants.

I've submitted the bug, along with my proposed fixes. IPB have agreed it is a problem and are going to address it, but this may take weeks or months.

During the discussion in their forums, it emerged that some sites have been hit by Panda as a result.

I want to create my own short term fix. What I want to do is apply noindex, follow to pages 2+.

On the same note, I want to noindex the profiles, as most members don't fill them in and they're largely duplicated.
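A minimal sketch of the short term fix described above, written in Python rather than the forum's own template language. The function name and the "thread" / "profile" page kinds are hypothetical and not part of IP.Board; the sketch only decides which robots meta tag a page template would emit.

```python
from typing import Optional

# Standard robots meta tag: keep the page out of the index, but let
# crawlers follow its links.
NOINDEX_FOLLOW = '<meta name="robots" content="noindex, follow">'


def robots_meta_tag(page_kind: str, page_number: int = 1) -> Optional[str]:
    """Return the robots meta tag for a page, or None to leave it indexable.

    page_kind is a hypothetical label: "thread" or "profile".
    """
    if page_kind == "profile":
        # Member profiles are mostly empty and near-identical.
        return NOINDEX_FOLLOW
    if page_kind == "thread" and page_number >= 2:
        # Only the first page of a thread stays indexable.
        return NOINDEX_FOLLOW
    return None


if __name__ == "__main__":
    print(robots_meta_tag("thread", 1))
    print(robots_meta_tag("thread", 2))
    print(robots_meta_tag("profile"))
```

Whether this is enough as far as Panda is concerned is exactly the open question here; the sketch only covers the mechanics.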

Does noindex work with Panda? Obviously I can't delete the members or pages 2+, but I don't want to leave the duplicate content there for it to hurt me in the future.

Thanks a lot.

 

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 4461020 posted 6:52 am on Jun 4, 2012 (gmt 0)

Duplicate title would be solved by adding " - Page n" to the end of the title.
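As a rough illustration of that suggestion, assuming the title is built from the thread title and a 1-based page number (the function name is made up for this sketch):

```python
def paged_title(thread_title: str, page_number: int) -> str:
    """Append " - Page n" to the title of every page after the first."""
    if page_number <= 1:
        return thread_title
    return f"{thread_title} - Page {page_number}"


if __name__ == "__main__":
    print(paged_title("Forum software duplicate content issues", 1))
    print(paged_title("Forum software duplicate content issues", 3))
```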


There's a more insidious problem with forum software that is never addressed. When new content appears on page 1, what was on page 1 is now on page 2, and what was on page 2 is now on page 3.

There are two problems.
- Page 2 is seen as a duplicate of page 1 and page 3 is seen as a duplicate of page 2 until all pages have been respidered and reindexed.
- For a given search, page 2 is listed in search results and some text is shown in the snippet. However, since indexing there have been dozens of new posts or dozens of new threads. When I click the link in the SERP, the page I am taken to no longer contains the content shown in the snippet. It's now several pages away, and there's no clue how many pages away it might be.
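The shifting described above only happens when a listing is ordered newest first. A small sketch, assuming 10 items per page and items numbered from the oldest (index 0); the function and parameter names are illustrative only:

```python
def page_of(index_from_oldest: int, total_items: int,
            per_page: int, newest_first: bool) -> int:
    """Return the 1-based page on which an item currently appears."""
    if newest_first:
        # Newer items push this one further down the listing.
        position = total_items - 1 - index_from_oldest
    else:
        # Oldest-first: the position never changes as items are added.
        position = index_from_oldest
    return position // per_page + 1


if __name__ == "__main__":
    # The same item, before and after 10 new items arrive.
    for total in (20, 30):
        print(total,
              page_of(5, total, 10, newest_first=True),
              page_of(5, total, 10, newest_first=False))
```

With newest-first ordering the item moves from page 2 to page 3 once ten new items arrive, while with oldest-first ordering it stays on page 1.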

realmaverick

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4461020 posted 9:59 am on Jun 4, 2012 (gmt 0)

g1smd, new content gets added to page 2 in all the forum software that I've used. I've never seen it work as you've just described.

Although the thread index will work in the way you've described, the thread index would never be what ranks in the SERPs. It's just an index of sorts, similar to the way a WordPress blog would work, except that with a forum, new posts to a thread cause that thread to be bumped to the top of the list.

The forum threads do have "- Page n" appended to their titles, but they're still being flagged in GWT. The second page isn't being seen as page 2, just as a brand new thread. The issue mainly lies in their URL structure.

Personally, I think page 2+ should have Page n first, followed by the title of the thread, to essentially un-optimise the second page and prevent it competing with the first.
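Roughly what I have in mind, as a sketch only. The function name and the " - " separator are my own choices here, not anything IPB has implemented:

```python
def page_first_title(thread_title: str, page_number: int) -> str:
    """Put "Page n" in front of the thread title on pages 2 and up."""
    if page_number <= 1:
        # Page 1 keeps the plain thread title.
        return thread_title
    return f"Page {page_number} - {thread_title}"


if __name__ == "__main__":
    print(page_first_title("Forum software duplicate content issues", 1))
    print(page_first_title("Forum software duplicate content issues", 2))
```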

But let's say I noindex 500,000 low quality profiles: is this enough for Panda? Or am I supposed to physically delete my members' profiles, or any "thin" content for that matter?

realmaverick

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4461020 posted 12:54 am on Jun 5, 2012 (gmt 0)

I cannot find an official word on this. Is noindex enough to remove "duplicate" content to recover from Panda? I realise that alone may not be enough to fully recover, but what I mean is, is noindex treated similarly to returning a 404?

realmaverick

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4461020 posted 8:18 pm on Jun 5, 2012 (gmt 0)

I've just found 56 copies of my homepage in Google's index. The joys of using a CMS with bugs.

I am guessing 56 copies of my homepage is going to be a giant problem with Panda?

This website survived Panda 100% until we updated our CMS, which introduced these crazy problems, and then boom, we were hit on the next update.

I didn't study Panda like I should have, largely because I wasn't affected and didn't expect to be.

Hopefully fixing all these issues will fix things.

rajeshth02



 
Msg#: 4461020 posted 7:47 am on Jun 6, 2012 (gmt 0)

Content duplication is the major problem, and it affects keyword rankings and traffic to the site. So before uploading content to your site, check it for duplication. There are a lot of free online tools for checking duplicate content, but I would recommend the paid version of Copyscape.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved