I've read a lot about Panda and WordPress sites, and I still have some doubts about WordPress tags.
Should we allow Google to index WordPress tags?
My site has over 4,000 posts and over 11,000 tags.
If I search Google for "site:example.com", most of the results on the first pages (90%) are WP tag pages.
If I search Google.com for the exact title of one post, the first result is a tag page (not the post), and I also get this:
In order to show you the most relevant results, we have omitted some entries very similar to the 3 already displayed.
If you like, you can repeat the search with the omitted results included.
I guess this means the tag pages are being considered duplicate content?
Currently, my robots.txt for the WordPress site is this:
User-agent: Mediapartners-Google
Disallow:
Sitemap: http://www.example.com/sitemap.xml.gz
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /wp-login.php
Disallow: /wp-register.php
Disallow: /*/*/*/feed
Disallow: /*/*/*/trackback
Disallow: /*/*/*/attachment
Disallow: /author/
Disallow: /category/*/page
Disallow: /category/*/feed
Disallow: /category/*/*/page
Disallow: /category/*/*/feed
Disallow: /tag/*/page
Disallow: /tag/*/feed
Disallow: /page/
Disallow: /xmlrpc.php
Disallow: /*?s=
As you can see, for tags I only disallow the duplicated variants:
Disallow: /tag/*/page
Disallow: /tag/*/feed
Should I just disallow them all? Like:
Disallow: /tag/
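For reference, this is roughly what that would look like inside the existing ruleset (a sketch, assuming all tag archives live under /tag/ as the current rules imply). One caveat I've read about: Disallow only stops crawling, so tag URLs that are already indexed can stay in the index; a meta robots noindex on tag pages only works if Google is still allowed to crawl them.

```
# Sketch: replaces the two /tag/ rules above with one blanket rule
User-agent: *
Disallow: /tag/
```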
Also, and I really don't know if it's related to this problem: if I do a site: search for example.com I get over 24,000 results indexed by Google.
If I look at my WordPress Dashboard:
3,106 Posts
1 Page
20 Categories
11,642 Tags
All of this adds up to roughly 15,000 URLs, yet I see 24,000 indexed. Yes, I know there are author pages, date pages, etc., but I'd guess no more than 1,000 of those, and they should be blocked by the robots.txt above anyway. Does this make sense to you?
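One way to sanity-check that gap is to count the URLs in the sitemap myself and compare against the site: count. A minimal sketch in Python (the real input would be the sitemap.xml.gz from robots.txt above; the inline sample below is just a stand-in for a downloaded file):

```python
import gzip
from xml.etree import ElementTree

# Standard sitemap namespace, per sitemaps.org
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def count_urls(xml_bytes):
    """Count <loc> entries in a sitemap (or sitemap index) document."""
    root = ElementTree.fromstring(xml_bytes)
    return sum(1 for _ in root.iter(SITEMAP_NS + "loc"))

def count_urls_gz(gz_bytes):
    """Same, for a gzipped sitemap such as sitemap.xml.gz."""
    return count_urls(gzip.decompress(gz_bytes))

# Tiny sample standing in for a real sitemap download:
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/post-1/</loc></url>
  <url><loc>http://www.example.com/post-2/</loc></url>
  <url><loc>http://www.example.com/tag/foo/</loc></url>
</urlset>"""

print(count_urls(sample))  # 3
```

Comparing that number against the site: result count would at least show how many indexed URLs come from outside the sitemap (tags, authors, dates, paginated archives).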
Regards,
[edited by: tedster at 4:41 pm (utc) on Jul 16, 2012]
[edit reason] switch to example.com [/edit]