|Panda and Numerical Data|
My husband and I run a site that provides a lot of numerical consumer data. Our content is high-quality and unique, but tends to be in the form of tables rather than paragraphs. We've been hit hard by Panda and I'm having a hard time figuring out why because none of our pages utilize techniques like keyword stuffing at all. Is it possible that Googlebot really wants sentences and the table entries on our site look like strings of disconnected words instead? If not, any thoughts on what the problem could be? We've lost maybe 20% of our traffic and I cannot figure out why.
There is a Matt Cutts post regarding telephone number lists not being good for users.
Is it possible you have been caught in a filter that thinks you are stuffing numbers onto the page?
My only site that's been impacted by Panda is also a numerical data site -- basically, I take a bunch of government data from different agencies, and then mash it up to tell a story about a particular social problem. This isn't automated stuff; I put a lot of thought into each individual page.
Basically, all of my pages looked like this when Panda hit:
2-3 paragraphs of unique introductory text, followed by a series of charts.
Many pages do have some kind of data overlap. For instance, I might show how one social issue impacts every state, so there's a table with a list of states with a numerical value for each state. This *could* make all of my pages appear to be very similar to all of my other pages (even though they're very different from a user's point of view).
Do you think that's a realistic cause? Or am I reaching here?
So far, my recovery efforts have been spent on increasing the 2-3 paragraphs of introductory text into much more. But I might try removing the noscript if that could be hurting me.
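One rough way to check the overlap worry on your own pages is to compare the vocabulary of two pages directly. The sketch below is only an illustration: nobody outside Google knows how Panda measures similarity, and the function names (`tokens`, `jaccard`) and the sample strings are made up for this example. It computes the Jaccard similarity of the word sets of two pages, so a shared list of state names pushes the score up even when the numbers differ.

```python
import re

def tokens(text):
    # Lowercased words and numbers; decimals such as 12.5 stay in one piece.
    return set(re.findall(r"[a-z]+|\d+(?:\.\d+)?", text.lower()))

def jaccard(page_a, page_b):
    # Shared tokens divided by all distinct tokens across both pages.
    a, b = tokens(page_a), tokens(page_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical table text from two different pages.
page_one = "Alabama 12.5 Alaska 3.1"
page_two = "Alabama 40 Alaska 55"
score = jaccard(page_one, page_two)
```

If scores between unrelated pages come out high mostly because of the repeated state names, that at least tells you how much of each page is boilerplate-like text compared with the unique introduction.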
|Do you think that's a realistic cause? Or am I reaching here? |
Gawd, I hope not. However, it may explain why I see plain widget image pages with SFA on them outranking my bells-and-whistles full technical description pages!
Then again, we are discussing an out-of-control Google here, all other search engines love my pages, do they yours?
If so, it's Google's problem and I am not going to change my pages to suit them, my descriptions are extremely important for my industry, you'll just have to learn to love that G!
|Is it possible you have been caught in a filter that thinks you are stuffing numbers onto the page? |
This is what I'm wondering. Our pages aren't just a continuous stream of numbers, of course, but there are some fairly long tables with a pretty high numbers-to-words ratio. Right now the pages are structured with the tables at the top and comments at the bottom, but I'm thinking of interspersing the comments with the tables to see if that will make a difference to Google. I think it might make for a better user experience anyway.
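For anyone who wants to put a figure on the "numbers-to-words ratio" of their own pages, here is a minimal sketch. It assumes you have already extracted the visible text of a page; the function name `numbers_to_words_ratio` and the token rules are my own choices for illustration, not anything Google has published.

```python
import re

# Treat tokens like 12.5, 3,400, $19.99 or 7% as numbers.
NUMBER = re.compile(r"^[+-]?\$?\d[\d,]*(\.\d+)?%?$")
HAS_LETTER = re.compile(r"[A-Za-z]")

def numbers_to_words_ratio(text):
    # Split on whitespace and trim surrounding punctuation from each token.
    cleaned = [t.strip(".,;:()") for t in text.split()]
    cleaned = [t for t in cleaned if t]
    numbers = sum(1 for t in cleaned if NUMBER.match(t))
    words = sum(1 for t in cleaned
                if not NUMBER.match(t) and HAS_LETTER.search(t))
    if words == 0:
        # No words at all: infinite ratio if there are numbers, else zero.
        return float("inf") if numbers else 0.0
    return numbers / words

# Hypothetical text pulled from a table-heavy page.
ratio = numbers_to_words_ratio("Alabama 12.5 Alaska 3,400 Arizona 7%")
```

Running it on a page before and after interspersing the comments with the tables will not change the overall ratio, but running it per section would show whether any stretch of the page is still almost pure numbers.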
I have a site where users also come for data. A calculator of sorts, you fill in some data and get calculations and answers. I haven't been hit by panda, but I was hit by the "too many ads above the fold" algo. At this point, I believe it is because Googlebot thinks that my only content above the fold is the ad. I don't believe that it thinks that the calculation form is content at all. The only text I have on the page is well below the fold. There are no images to speak of.
It is unfortunate, but Google does not appear to consider non-mainstream site concepts like these when designing their algorithms.
i would suggest you study the table showing heuristics to determine non-layout table usage and see if you can improve the structure and semantics of your table usage.
|User agents, especially those that do table analysis on arbitrary content, are encouraged to find heuristics to determine which tables actually contain data and which are merely being used for layout. |
(i'm pretty sure everything there also applies to HTML4)
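As a rough illustration of that kind of check, the sketch below scans a page's markup and records, for each table, a few signals that usually suggest a data table (`caption`, `thead`, `th`, `scope`/`headers` attributes) and a few that usually suggest layout (`role="presentation"`, `border="0"`). The signal list is a loose subset chosen for this example, and the class and function names (`TableSignals`, `table_signals`) are hypothetical. It says nothing about what Googlebot actually does.

```python
from html.parser import HTMLParser

DATA_TAGS = {"caption", "thead", "th"}

class TableSignals(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tables = []  # one dict of signals per <table>, in document order
        self._open = []   # stack of currently open tables (handles nesting)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            info = {"data": set(), "layout": set()}
            if attrs.get("role") == "presentation":
                info["layout"].add("role=presentation")
            if attrs.get("border") == "0":
                info["layout"].add("border=0")
            self.tables.append(info)
            self._open.append(info)
        elif self._open:
            current = self._open[-1]
            if tag in DATA_TAGS:
                current["data"].add(tag)
            if "scope" in attrs or "headers" in attrs:
                current["data"].add("scope/headers")

    def handle_endtag(self, tag):
        if tag == "table" and self._open:
            self._open.pop()

def table_signals(html):
    parser = TableSignals()
    parser.feed(html)
    parser.close()
    return parser.tables

# Hypothetical markup: one data table, one layout table.
sample = (
    '<table><caption>Rate by state</caption>'
    '<thead><tr><th scope="col">State</th><th scope="col">Rate</th></tr></thead>'
    '<tbody><tr><td>Alabama</td><td>12.5</td></tr></tbody></table>'
    '<table role="presentation" border="0"><tr><td>nav</td></tr></table>'
)
report = table_signals(sample)
```

A table that comes back with an empty `data` set is one where adding a caption and proper header cells would be a cheap improvement, whatever the effect on ranking turns out to be.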
I have run into the same problem: hit by Panda.
One thing that seemed to improve my rank, mildly, was to change the format of the tables or data. For example, making it an Excel file, then adding a paragraph to explain it and a screenshot.
Seemed to help a little bit, but took Gbot a long time (3-4 months) to make sense of it.
Hi Sophronisba, and welcome to WebmasterWorld.
My emphasis added...
|Our content is high-quality and unique... |
This brings up many tricky questions. "Data", and "data as content" are in themselves difficult concepts, and I don't believe that Google likes "data" by itself as a form of content.
What do you mean by unique? Did you make measurements yourself and is this first publication, or did you compile and arrange data from other sources? If your numerical data is truly unique, then how is Google (or anyone else, for that matter) to evaluate it?
To consider the value of this kind of information with regard to search ranking, Google would need to rely much more on user signals that suggest value... and these would likely be linking, social signals and actual traffic, and signs of user engagement on the site. But having just lists of data on the page would be fighting all that.
As I understand it, data by itself cannot be copyrighted... and what turns data into something copyrightable and unique becomes a long and subtle discussion, one that I think is often difficult for lawyers.
In general, though, in the real world, offline or online, "data" relies on interpretation, commentary, citation, and evaluation to be perceived as valuable. It's not just a question of a verbal search engine needing verbal content containing keywords to rank it.
My guess is that Panda looks at data lists, at best, as a kind of "shallow content". More probably, though, it's dupe content. Even non-numerical data that is curated and re-arranged, unless there is considerable interpretation added to it, is not likely to be considered unique. Many of the sites I see that have been hit by Panda are basically lists of things, rearranged by categories, with no original content or value added.
|...I'm thinking of interspersing the comments with the tables to see if that will make a difference to Google. I think it might make for a better user experience anyway. |
If it does make for a better user experience, then you're on the right track. I don't think that tabular presentation by itself is the problem, though a monolithic list of numbers probably would be.
It's likely that Google isn't seeing the originality that you see in the material. If you add sufficient commentary, you may be able to emphasize your vision of what that originality is... and if that commentary makes the material more valuable to users, it's definitely worth trying.
@Sand, is there any particular reason why you didn't just use an image for the charts? I've found that the image charts are some of the most shared elements of our site.