
WordPress And Google: Avoiding Duplicate Content Issues

What about posts in a few different categories?

     
3:50 pm on Sep 26, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:June 14, 2006
posts:107
votes: 0


Hey guys,

I was wondering what you think about blogs and WordPress. As you know, WordPress can assign posts to categories and then show those posts on the category pages.
So now I can have three categories: A, B and C, and make a post that is assigned to all three. It'll show in each category, as well as on the main page and in the archives. As you can see, there are many places on the site where that one post shows up.

What do you think, is this duplicate content or not? How does Google treat such behaviour?

Any clues?

Thanks,
Manca

3:13 pm on Nov 9, 2006 (gmt 0)

Preferred Member

5+ Year Member

joined:June 5, 2006
posts:352
votes: 0


Our company runs six blogs on WP and has never tinkered with the settings. The RSS, trackback, "email this" and comments-feed URLs do show up when you run a site: command, but generally not in an actual Google search.

I have been following the whole discussion on duplicate content, and it simply has not been an issue for us, even though we often assign multiple categories to each post. Google relies heavily on what you actually link to, and on the anchor text you use, when serving results.

So for all of you who believe we are in business for our readers rather than for bots: relax. Just create good content and life will be good.

3:47 am on Nov 12, 2006 (gmt 0)

New User

5+ Year Member

joined:July 20, 2006
posts:3
votes: 0


My RSS comments feed is showing up in the supplemental results. It seems to be the last little bit that I need to clean up. Anybody have any ideas? This one seems tougher, in that the comments feed directory "/feed/" comes at the end of the permalink URL.

Is there a way I can remove it via robots.txt using a wildcard? I think I heard this on a WebmasterRadio show, but I can't recall the specifics.

Thank you.

3:55 am on Nov 12, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Here's the Google reference on their support for pattern matching / wildcards in the robots.txt file [google.com].

Also note that Yahoo's Slurp now supports wildcards in the robots.txt file as well.
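To illustrate, here is a minimal sketch of what such a rule could look like for the comment-feed URLs described above, using the * wildcard and the $ end-of-URL anchor from that reference (placeholder paths; check the documentation before relying on it):

User-agent: Googlebot
# Block any URL that ends in /feed/ or /trackback/, e.g. /test-post/feed/
Disallow: /*/feed/$
Disallow: /*/trackback/$

The $ anchors the match at the end of the URL, so the post itself (/test-post/) stays crawlable while /test-post/feed/ is blocked.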

2:24 am on Nov 16, 2006 (gmt 0)

New User

5+ Year Member

joined:Nov 9, 2006
posts:20
votes: 0


I implemented the changes recently, and my pages have slowly disappeared from the supplemental index. However, there is still duplicate content, because both my main page and my single posts are indexed with the same material. Is it wise to disable indexing of the main page? Is there a better solution?

2:35 am on Nov 16, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:June 18, 2005
posts:49
votes: 0


1. Use the "more" tag.

2. Do not allow categories to get indexed.

3. Generate unique titles and meta descriptions for the single posts.

4. Drop the meta description from index2, index3, index4, etc. Google will generate its own unique description based on the content.

5. Generate unique titles for index2, index3, index4, etc.

I did this and the results are rather good! (See the sketch just below.)
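A minimal sketch of how items 2 and 5 might look in a theme's header.php, assuming the standard WordPress conditional tags is_category() and is_paged() and the paged query variable (a starting point only, not a drop-in):

<?php
// Item 2 (sketch): keep category archives out of the index
if (is_category()) {
echo "<meta name=\"robots\" content=\"noindex,follow\" />\n";
}
?>
<title><?php
// Item 5 (sketch): give index2, index3, ... distinct titles
bloginfo('name');
if (is_paged()) {
echo ' - Page ' . get_query_var('paged');
}
?></title>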

4:16 pm on Nov 16, 2006 (gmt 0)

New User

5+ Year Member

joined:Nov 9, 2006
posts:20
votes: 0


Awesome VictorP, but could you point me in the right direction as far as coding?
1:12 pm on Nov 20, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:July 12, 2006
posts:46
votes: 0


<This message was spliced on to this thread from another location>

I have a WordPress blog, around a month or two old. After a slow start it was very well indexed by Googlebot and started to do very well in the SERPs.

However, I noticed that my WP feed started to rank higher than individual posts or my home page. To counter this I placed some disallow rules in my robots.txt, but I have just seen a big drop in the number of pages listed in G's index for my blog.

Have I made a mistake in my robots.txt?

User-agent: *
Disallow: /wp-
Disallow: /search
Disallow: /feed
Disallow: /comments/feed
Disallow: /feed/$
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
Disallow: /*/*/feed/$
Disallow: /*/*/feed/rss/$
Disallow: /*/*/trackback/$
Disallow: /*/*/*/feed/$
Disallow: /*/*/*/feed/rss/$
Disallow: /*/*/*/trackback/$

[edited by: tedster at 5:33 pm (utc) on Nov. 20, 2006]

2:32 pm on Nov 29, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 2, 2001
posts:597
votes: 0


Just noticed an old blog has both the dynamic PHP URLs and the static URLs listed for some pages. Any ideas how to properly redirect *all* of those at once to the static version?

9:37 pm on Nov 29, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Since * is a wildcard, I don't think you can have multiple wildcards in a single Disallow line.

9:41 pm on Nov 29, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Here's one example of two wildcards on Google's own support pages:

To block access to all URLs that include a question mark (?), you could use the following entry:

User-agent: *
Disallow: /*?*

[google.com...]

12:24 pm on Dec 9, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:June 23, 2006
posts:147
votes: 0


Do these links result in duplicate content problems?

http://www.example.com/test-post/

http://www.example.com/test-post/#comments

They are both the same page, but the second link jumps down to the comments section.

7:30 pm on Dec 9, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


No duplicate trouble there -- the "named anchor" part of a URL is not spidered.

12:14 pm on Dec 12, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:June 23, 2006
posts:147
votes: 0


Great, thanks!

Now how can I say noindex for numbered pages that follow the home page?
So that the home page is indexed, but the 2nd, 3rd, 4th... etc pages are not?

11:47 am on Dec 13, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Add <meta name="robots" content="noindex"> to each page that should not be indexed.

11:42 pm on Dec 17, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:June 23, 2006
posts:147
votes: 0


>>Add <meta name="robots" content="noindex"> to each page that should not be indexed.<<

Thanks, but I don't think that works with WordPress. I have one header template file for the whole site; all the pages are dynamically generated.

In my header I have this:


<?php
if (is_single() || is_page() || is_home()) {
echo "<meta name=\"robots\" content=\"index,follow\"/>\n";
} else {
echo "<meta name=\"robots\" content=\"noindex,follow\"/>\n";
}
?>

I want to say noindex for all the pages that are after the homepage, which are just a chronological ordering of posts as they're bumped off the homepage.

12:23 am on Dec 18, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:July 18, 2006
posts:161
votes: 0


I have this in my home.php to make sure that only the first page is indexed:

<?php
// Only page 1 of the home listing gets index,follow; pages 2, 3, ... get noindex
if (is_home() && ($paged <= "1")) {
echo "<meta name=\"robots\" content=\"index,follow\"/>\n";
} else {
echo "<meta name=\"robots\" content=\"noindex,follow\"/>\n";
}
?>

2:21 am on Dec 18, 2006 (gmt 0)

Junior Member

5+ Year Member

joined:June 23, 2006
posts:147
votes: 0


Oh sweet, it worked! Thanks iridiax!

I added that little bit && ($paged <= "1") to my code above and it worked like a charm.

Thanks a million :)

1:55 pm on Dec 19, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 24, 2004
posts:388
votes: 0


I would say that Adam's post [googlewebmastercentral.blogspot.com] makes this topic even more important:

Understand your CMS: Make sure you're familiar with how content is displayed on your Web site, particularly if it includes a blog, a forum, or related system that often shows the same content in multiple formats.
10:32 pm on Jan 3, 2007 (gmt 0)

New User

10+ Year Member

joined:Jan 19, 2005
posts:34
votes: 0


Just read the whole thread -- this is a great one!

Can someone summarise what code I should put in my header.php so that only the index page and the single post pages are cached by Google?

Also, what do I need to add to robots.txt to stop the feeds being cached?
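Pulling together the snippets posted earlier, a rough summary for header.php could look like the sketch below. It leans on the standard conditional tags (is_single(), is_page(), is_home(), is_paged()) and is a starting point rather than a drop-in:

<?php
// Sketch: index only single posts, static pages, and page 1 of the home listing;
// archives, categories and paginated index pages get noindex.
if (is_single() || is_page() || (is_home() && !is_paged())) {
echo "<meta name=\"robots\" content=\"index,follow\" />\n";
} else {
echo "<meta name=\"robots\" content=\"noindex,follow\" />\n";
}
?>

For the feeds, the wildcard Disallow rules discussed earlier (for example Disallow: /*/feed/$ in a Googlebot section of robots.txt) are one way to keep them out, since a feed has no <head> to carry the meta tag.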

4:10 pm on Jan 11, 2007 (gmt 0)

New User

10+ Year Member

joined:Jan 19, 2005
posts:34
votes: 0


I have this in my home.php to make sure that only the first page is indexed

Does home.php = index.php or header.php?

I still can't get this code to work:

<?php
if (is_home() ¦¦ is_single() ¦¦ is_page()) {
echo "<meta name=\"robots\" content=\"index,follow\">";
} else {
echo "<meta name=\"robots\" content=\"noindex,follow\">";
}
?>

Can I paste it into header.php as is? I have read something about pipes; what do I need to change, in layman's terms?

5:24 pm on Jan 11, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 30, 2003
posts:932
votes: 0


"home" refers to index.php, and as far as I can see, that code should work if pasted into header.php (but with pipe symbols, not broken pipe symbols).
5:28 pm on Jan 11, 2007 (gmt 0)

Preferred Member

10+ Year Member

joined:Apr 24, 2004
posts:388
votes: 0


After looking at all the posts and knowing the need to prevent dup content, I'm wondering why there isn't a greater push to just stop this in robots.txt. I would think most people are using the custom URI option, and it seems that stopping the indexing of categories, archives, extra pages and feeds could be accomplished pretty easily that way. Am I missing something?
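As a sketch of that robots.txt idea (the wildcard syntax is only honoured by Googlebot and Slurp, and the paths below are placeholders that depend on your permalink structure):

User-agent: Googlebot
# Hypothetical default WordPress paths -- adjust to your own permalink settings
Disallow: /category/
Disallow: /page/
Disallow: /*/feed/$
Disallow: /*/trackback/$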

< continued here: [webmasterworld.com...] >

[edited by: tedster at 5:44 am (utc) on Mar. 11, 2007]
