|Keyword albums / tag clouds triggering Panda algo penalty?|
| 3:58 am on Aug 10, 2013 (gmt 0)|
I have an image site that uses something called "keyword albums", which are keyword links on every page that link to an album of images tagged with that tag on the fly. At some point the number of indexed pages on my site started climbing from around 6,000 to 28,000 very quickly in December of 2012 and then started heading back down to where it started in May of 2013. During this time my site has lost approximately 70% of its traffic and it hasn't recovered no matter what I try.
I strongly suspect that these thin content pages may have triggered the Panda penalty, and I am wondering if blocking them in robots.txt might help me to recover. I do not have a manual penalty or any warning in WMT. If I search for the tag pages they are indexed, and do come up if I search for unique phrases on those pages. They really are just pages of images with image titles and descriptions though, not much text at all.
From what I have read about Panda, it seems that if your site has a high percentage of thin content pages it can push your entire site down. Has anyone seen any improvement by excluding these types of tag pages?
[edited by: Robert_Charlton at 4:26 am (utc) on Aug 10, 2013]
[edit reason] Made niche less specific [/edit]
| 6:06 am on Aug 10, 2013 (gmt 0)|
|...uses something called "keyword albums", which are keyword links on every page that link to an album of images tagged with that tag on the fly. |
ichthyous - This sounds essentially like search-generated tag pages, a combination which IMO is guaranteed to get you into trouble with Panda.
Tag pages generally end up being very thin pages unless they're predefined and prioritized, with their number strictly limited. They're generally not integrated well structurally into the site... and generating them on the fly introduces several levels of randomness and confusion into how Google sees what your site is about. Keyword tags are pretty likely to be superficial.
|...I am wondering if blocking them in robots.txt might help me to recover. |
This would perhaps be the worst way of dealing with them. You'd be sending your link juice, which ought to go into a well-conceived navigation structure, into a black hole.
Better than robots.txt would be to use the robots meta tag with "noindex,follow" as attributes. The "follow" attribute is a default, but many webmasters choose to use "nofollow" in this situation, so I thought I'd note the choice.
IMO, noindex is only a temporary fix, but at least with "noindex,follow" you'd be recirculating your link juice throughout the site, rather than throwing it away completely as you would with robots.txt blocking the tagged "keyword albums".
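To make the contrast concrete, here is a minimal sketch of the meta-tag approach (the tag goes in the head of each generated tag page; the comment reflects the behavior described above):

```html
<!-- Placed in the <head> of each tag page: the page is still crawled,
     is kept out of the index, and its links still pass value -->
<meta name="robots" content="noindex,follow">
```

By contrast, a robots.txt rule such as `Disallow: /tags/` (the path is just an illustrative assumption) stops crawling of those URLs entirely, so no link juice circulates back through them at all.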
Those tag links, though, would still be on your pages... most likely providing too many choices for users, and you'd be throwing away most opportunities you have to prioritize the navigation on your site. I realize that this is what you thought the tags would be doing... but tags are simply a superficial approach to a much more complex problem. I'm also guessing that Panda doesn't do well with a somewhat randomized "on the fly" structure.
So, I'd suggest getting rid of "keyword albums" entirely, and looking into other, more solid approaches to structuring your navigation and to providing an engaging experience for your visitors.
| 1:45 pm on Aug 10, 2013 (gmt 0)|
Hi Robert, thanks for your response. The tags are taken from the keywords used for each image. They are rendered as links on the page, which the user can click to see an album of images relating to that tag.
I did some checking and Google has the tag pages indexed, but they don't bring in a lot of traffic as entry pages, so the pages don't rank well. I think that not many people use them, so I have removed them entirely from the most important category and subcategory pages on my site. They still remain on thousands of the lowest level image pages though. If I remove them all suddenly then I will have hundreds or thousands of missing pages...not good.
In addition to the potential hit by the Panda algo, I am wondering whether removing all these tag links will improve the page rank. According to the Matt Cutts video on tag clouds (http://www.youtube.com/watch?v=bYPX_ZmhLqg) all the links can drain the page's PR.
I want to remove them from all pages, but what would be the best way to deal with all the 404s if I turn them all off? I don't particularly want to have to redirect all of these low ranking tag pages. Having too many 404 errors though will also potentially drag down the site.
| 4:51 pm on Aug 10, 2013 (gmt 0)|
I decided to move forward and remove all the keyword links from all pages. The module that creates the pages is still active though. So all the pages that have been indexed are still there, but there are no longer any links pointing to them from any page...they are orphaned.
I am wondering if removing the links alone will be good enough, since the "thin" content still exists and is being counted in Google's overall picture of my content. With such a high proportion of thin content I suspect that my site will remain Pandalized. What is your take on that? Thanks for the assist!
| 6:44 pm on Aug 10, 2013 (gmt 0)|
Hi there, ichthyous:
I know this isn't the exact question you asked, but I would like to make a suggestion or two:
Overall I would agree with Robert Charlton's suggestions above. But there are a few things that possibly need mentioning:
|"I did some checking and Google has the tag pages indexed, but they don't bring in a lot of traffic as entry pages so the pages don't rank well." |
OK, they don't rank well in google's eyes, but are they popular with your users?
Do you find that your users tend to navigate through your site via:
- An internal search tool?
- Static category Links?
- Your tag cloud?
And assuming that you SELL things, which type of navigation do your BUYERS prefer?
|"The module that creates the pages is still active though. So all the pages that have been indexed are still there, but there are no longer any links pointing to them from any page...they are orphaned." |
I don't think that is an ideal situation.
I would NOINDEX those individual pages. Google already knows about those pages, and just because you don't link to them doesn't mean that google is going to forget about them.
One thought I have is that you could / should:
1) Use the NOINDEX meta tag on all the individual keyword-based pages to get them out of the search index.
2) Create a SINGLE page with your tag cloud on it.
3) Create a SINGLE link with the text "Find By Keyword" or some other helpful text. This should link to the SINGLE page with the tag cloud on it. This link could be sitewide in the navigation bar or navigation column or in the footer. Just don't link to that tag cloud page multiple times from the same page.
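The three steps above could be sketched like this (the `/keywords/` URL is an illustrative assumption):

```html
<!-- 1) In the <head> of every individual keyword/tag page:
       drop it from the index while still letting its links pass value -->
<meta name="robots" content="noindex,follow">

<!-- 2 + 3) One sitewide link (nav bar, nav column, or footer)
       pointing to the single page that holds the tag cloud -->
<a href="/keywords/">Find By Keyword</a>
```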
|"Having too many 404 errors though will also potentially drag down the site." |
No. This has been denied by John Mueller (and I am guessing Matt Cutts said this at some point too).
Create a USEFUL 404 page. In fact, you should use your tag cloud and other navigation on that 404 page. Make sure it sends a correct 404 header, and (since I am always fearful) I would include the meta noindex tag on it as well.
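A bare-bones sketch of such a 404 page (the filename and heading text are just assumptions; the key points are the real 404 status header and the noindex tag):

```html
<!-- /404.html — must be served with an actual "404 Not Found" status,
     e.g. configured via Apache's  ErrorDocument 404 /404.html  directive -->
<!DOCTYPE html>
<html>
<head>
  <meta name="robots" content="noindex">
  <title>Page not found</title>
</head>
<body>
  <h1>Sorry, that page doesn't exist</h1>
  <!-- put your tag cloud and main navigation links here
       so the visitor has somewhere useful to go -->
</body>
</html>
```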
On the other hand, having a bunch of 301 redirects that are NOT user friendly could hurt your site.
I hope this helps.
| 7:10 pm on Aug 10, 2013 (gmt 0)|
|use the robots meta tag with "noindex,follow" as attributes. The "follow" attribute is a default, but many webmasters choose to use "nofollow" in this situation, so I thought I'd note the choice. |
Because "follow" is the default, does that mean that if you use the tag meta name="robots" content="noindex", Google will follow the links, passing link juice through to the other pages and avoiding black holes? Or do you have to actually use meta name="robots" content="noindex, follow" in order for that to work? I was under the impression "follow" was implied and that black holes would only be a problem if "noindex, nofollow" was used?
Appreciate the clarification if you know :)
| 7:15 pm on Aug 10, 2013 (gmt 0)|
|I was under the impression "follow" was implied and that black holes would only be a problem if "noindex, nofollow" was used? |
"follow" is the inherent property. No need to explicitly declare it, but some people do.
Suppose one of the pages that was generated by the tag app has the noindex meta tag in the head.
suppose a site with a high page rank linked to that page.
Even though the page is noindexed, the link juice would still flow "through" that page to other links on that page.
| 7:26 pm on Aug 10, 2013 (gmt 0)|
Thanks Planet13. So there is no real need to add the follow portion because it's inherent. Just DO NOT add nofollow.
| 2:16 am on Aug 11, 2013 (gmt 0)|
Thanks so much for the advice Planet13, I will take it into consideration. I am very interested to see if all these changes will actually help my site to recover from Panda somewhat.
One further question...I have links to every main category page of my site in the footer at the bottom of every page. I have heard that footer links can also trigger a Panda penalty. Would you recommend removing those as well?
| 6:58 am on Aug 11, 2013 (gmt 0)|
|Better than robots.txt would be to use the robots meta tag with "noindex,follow" as attributes. |
Actually it would be better to use the googlebot meta tag with "noindex" as the attribute. Why? Because Bing and Yahoo still send traffic to this type of page; only Google seemingly hates them, and the generic robots tag would block them for all engines. "Follow" is the default behavior, so it doesn't need to be declared.
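In other words, the tag would look like this (it targets Google's crawler only, so Bing and Yahoo can still index the page):

```html
<!-- Applies only to Googlebot; other engines ignore it -->
<meta name="googlebot" content="noindex">
```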
I have pages that Google has cut off 100% that are all-stars in Bing, I use the googlebot noindex meta tag on these with good results.
| 4:30 pm on Aug 11, 2013 (gmt 0)|
Any opinions on the footer links at the bottom of every page?
| 5:06 pm on Aug 11, 2013 (gmt 0)|
Both the robots meta tag and a robots.txt exclusion can target a specific bot (e.g. googlebot).
The main difference is that if the meta tag is used, the page will be crawled but not indexed, whereas if robots.txt is used, the page will not* be crawled, but it may still be indexed.
(*) Unless a Google+ button is on the page, in which case it may be crawled despite the robots.txt exclusion.
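For comparison, a robots.txt block aimed only at Google's crawler might look like this (the /tags/ path is an illustrative assumption):

```
# Applies only to Googlebot; other crawlers are unaffected
User-agent: Googlebot
Disallow: /tags/
```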
I do not think footer links are directly connected to Panda, but they will affect how the link juice is distributed through the site.
|I have heard that footer links can also trigger Panda also penalty. |
Here is a thread from about a year ago that discusses the connection between footer links and Panda; opinions and experiences differ among members: [webmasterworld.com...]