Forum Moderators: Robert Charlton & goodroi


Handling manufacturer descriptions on an ecommerce site


dereksmalls5

9:31 am on Feb 18, 2015 (gmt 0)

10+ Year Member



Hi,

I hope you can help me.

The site I'm working on has scraped all of their product descriptions straight from the manufacturer.

This means of course that Panda has reared its head and the site is in the bin when it comes to SERP results.

I'm currently re-writing the product descriptions by hand, but with nearly 1500 products, this is going to take a while.

As the product pages of course make up the bulk of the site, would it be beneficial to somehow remove all of the categories that I haven't got to yet from the index and add them back in as I re-write them?

If so, how exactly would you go about this? I don't really come from a technical background but I do have a web designer who should be able to make changes for me...I just don't know exactly what changes would be beneficial in my situation!

Any help would be appreciated.

Many thanks.

adder

11:28 am on Feb 18, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Depends on the software that runs the store. If it allows you to edit header code on a per-category basis, you can add this within your <head> tags

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

And then as you finish the categories, just remove this line and it will become "crawlable" again.

If all the pages use the same header code, you can achieve the same by making changes to your robots.txt file. Add something like this:


User-agent: *
Disallow: /product-category-1/
Disallow: /product-category-2/


And make sure that the specific product pages can't be accessed using another URL. For example, if I can access your product pages both via
/product-category-1/green-widget-5/
and
/catalog/green-widget-5/

when you disallow the /product-category-1/ folder, Google will keep crawling the /green-widget-5/ page because it can access it via the catalog page.

So, to sum it up, you're better off adding the robots tag to the specific product pages, not the category pages.
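One way to check for the multiple-access-path problem described above is to group your URLs by product slug and flag any product reachable under more than one path. A minimal sketch, assuming you have a list of URLs from a sitemap or crawl (the URLs below are hypothetical, reusing the widget examples):

```python
from collections import defaultdict
from urllib.parse import urlparse

def find_duplicate_paths(urls):
    """Group URLs by their final path segment (the product slug) and
    return slugs that are reachable under more than one URL."""
    by_slug = defaultdict(set)
    for url in urls:
        path = urlparse(url).path.rstrip("/")
        slug = path.rsplit("/", 1)[-1]
        by_slug[slug].add(url)
    return {slug: sorted(dupes) for slug, dupes in by_slug.items() if len(dupes) > 1}

# Hypothetical crawl data illustrating the green-widget example above
urls = [
    "https://www.example.com/product-category-1/green-widget-5/",
    "https://www.example.com/catalog/green-widget-5/",
    "https://www.example.com/product-category-2/blue-widget-3/",
]
print(find_duplicate_paths(urls))
```

Any slug that comes back with two or more URLs is a page Google can still reach even after you disallow one of its folders.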

As the product pages of course make up the bulk of the site


On a side note, something you should think about. What type of pages take up the remaining bulk? Are they original/high-quality? If it's just the usual "about us" and "ToS" types of pages, you may find that writing those 1500 original descriptions doesn't produce the results you've been expecting.

I had a client who built an e-commerce site using a standard off-the-shelf script. He couldn't be persuaded to write a blog. All the product pages had "thin content" and we made things better by adding user-generated content to the product pages: reviews and how-to videos. Unless the descriptions you write are awesome in-depth stuff, you might need something similar to give those pages a competitive edge.

dereksmalls5

12:27 pm on Feb 18, 2015 (gmt 0)

10+ Year Member



Thank you very much for your help.

That is great advice which I will pass on today.

Do you know what sort of time frame I'd be looking at for de-indexing the problem pages and seeing some improvement in SERP positions for the products that I re-write and improve? i.e. will this help me to escape from Panda?

I will of course ensure that all the other pages have robust and informative text on them. We do have a blog but it's been very neglected, so I will get that started again. All the other pages probably need beefing up as well, actually.

You've been a great help. Thank you.

netmeg

1:19 pm on Feb 18, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Nobody can tell you how long it will take, because it depends on a lot of other things - it's never just *one* thing with Google or SEO anymore.

But in a very general sense (and I work with a lot of ecommerce sites) you probably won't start seeing noticeable improvements until the percentage of good quality pages significantly outweighs the percentage of lesser quality pages.

So if you want to hurry this process up a bit, you could try putting a NOINDEX on the thin pages while you work on the site. The dangers here are that 1) you NOINDEX pages that actually get search traffic (if not from Google, then from Bing) and 2) you forget to remove the NOINDEX. So check the search traffic on the thin pages and don't forget to open them back up.

Note that depending on how your store is set up, removing or noindexing the category pages may not be enough, especially if there's a canonical tag in play. You may have to do it on every product page too.

It still could take some time though. Like as in months. It can take serious time to clean up a Panda hit.

EditorialGuy

3:52 pm on Feb 18, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One other thing:

Don't think in terms of "rewriting," think about "adding value."

Just avoiding duplicate content isn't enough. Make intrinsically useful content your unique selling proposition.

dereksmalls5

4:23 pm on Feb 18, 2015 (gmt 0)

10+ Year Member



Thank you both.

Is taking a one- or two-sentence description of the product and turning it into 350+ words that describe all the features and benefits of the product adding value whilst eliminating duplicate content?

rmissey

4:25 pm on Feb 18, 2015 (gmt 0)

10+ Year Member



I wouldn't recommend disallowing the duplicated content. The site isn't being penalized for this content - it's just not going to rank for it because it's not original.

My recommended path would be to simply continue adding information that is valuable to the user, encourage user-generated content via reviews and Q&A, and - if possible - hire a good product description authoring company to handle the bulk of the content creation. It's not cheap, but it's worth it in the long run. The alternative is to ramp up the internal content team, which I'm also a fan of.

netmeg

6:46 pm on Feb 18, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is taking a one- or two-sentence description of the product and turning it into 350+ words that describe all the features and benefits of the product adding value whilst eliminating duplicate content?


Not necessarily. You should give enough information to adequately describe the product, but let's face it, you can't write 350 words about everything - say, a paper clip.

You need to add things that are specific to this vendor - things that your competition can't or won't add. Likely applications for the product. Related products. Testimonials and reviews. Examples of problems solved by the product. Explain why you chose to carry that product, and your quality control procedure for making the decision.

But just writing 350 words for the sake of writing 350 words isn't going to help you - Google is WAY on to that by now. Read it out loud as you're writing it - if it sounds weird spoken out loud, it will look weird on paper.

FranticFish

6:37 am on Feb 19, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I wouldn't recommend disallowing the duplicated content.

I would. Why take the chance of looking like a scraper?

lucy24

6:51 am on Feb 19, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If all the pages use the same header code, you can achieve the same by ...

No. You. Can't. *

meta noindex = don't include this page in your index (but do continue crawling)
robots.txt disallow = don't crawl (but do include in index if there's a match on, for example, someone else's linking text)

meta nofollow has nothing to do with anything. It just means "I am not placing the weight of my authority behind any links on this page".


* I do realize that it took about a year for me to wrap my brain around this. But then, I'm slow on the uptake.
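The crawl-versus-index distinction above can be boiled down to a small decision table. A simplified sketch, purely for illustration (the function and its inputs are hypothetical, not any real API):

```python
def robots_outcome(disallowed_in_robots_txt, has_meta_noindex, has_external_links):
    """Simplified model of the crawl/index distinction described above.
    Returns a tuple: (page_is_crawled, url_may_appear_in_index)."""
    if disallowed_in_robots_txt:
        # The page is never fetched, so any meta noindex on it is never seen;
        # the bare URL can still be indexed from other sites' link text.
        return (False, has_external_links)
    # The page is fetched; a meta noindex keeps it out of the index.
    return (True, not has_meta_noindex)

# robots.txt Disallow + inbound links: not crawled, but can still show in the index
print(robots_outcome(True, True, True))    # (False, True)
# meta noindex, no robots.txt block: crawled, but kept out of the index
print(robots_outcome(False, True, False))  # (True, False)
```

The second case is the one you want here: the page must remain crawlable so the noindex tag can actually be read.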

dereksmalls5

9:11 am on Feb 19, 2015 (gmt 0)

10+ Year Member



Hi Lucy,

What would you recommend I do? How can I de-index individual product pages en masse if not via blocking each category?

Thanks.

netmeg

1:19 pm on Feb 19, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Depends on what ecommerce package you're using and the architecture of your site.

lucy24

7:15 pm on Feb 19, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



How can I de-index individual product pages en masse if not via blocking each category?

The meta noindex approach will work. You just need to be clear that "don't crawl" and "don't index" are different things.

netmeg

7:30 pm on Feb 19, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think he's asking how to get the tags on the pages.

FranticFish

10:21 pm on Feb 19, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Adding a 'noindex' toggle to your shopping cart for category / subcategory / product pages should be a simple job for any reasonably competent programmer. Exactly how simple depends on the exact system you use - there might even be an SEO widget plugin if you have an 'off the shelf' solution.

You could get help on the PHP forum here, or on a forum dedicated to your cart software otherwise.
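As a rough sketch of what such a toggle might look like (the function and field names here are hypothetical; the real implementation depends entirely on your cart's templating system):

```python
def head_robots_tag(page):
    """Emit a robots meta tag for a page record with a 'rewritten' flag.
    Pages still carrying the manufacturer description get a noindex tag;
    rewritten pages emit nothing and stay indexable."""
    if page.get("rewritten"):
        return ""  # indexable: no robots meta needed
    return '<meta name="robots" content="noindex">'

print(head_robots_tag({"slug": "green-widget-5", "rewritten": False}))
print(head_robots_tag({"slug": "blue-widget-3", "rewritten": True}))
```

The templating engine would call something like this when building each page's head, so flipping one flag per product (or per category) as you finish rewriting is all the ongoing maintenance required.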

dereksmalls5

10:51 am on Feb 20, 2015 (gmt 0)

10+ Year Member



I have a programmer who should be able to put the tag on the pages for me (it's his custom CMS), but I just wanted to make absolutely sure which tag I should use, because this discussion has confused me somewhat.

Is <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> the right tag to use on all duplicated & thin pages? Will this enable me to start getting that ratio you talked about, netmeg, of having 'good' pages far outweigh the 'bad'? And then I slowly build the site back up to its current index with all the duplicate stuff flushed out, removing this tag as I go?

I actually do have a few pages that rank, and that's for local searches because I created a local landing page for each of our branches (yes...all with different, unique copy) and we come up organically for various keywords related to our business + location.

If I de-indexed that huge number of problem pages, do you think it would affect the ones that already rank? De-indexing product pages is a fairly drastic solution, right? But I can't really see any other way of doing it, because I don't want the site to be viewed as some low-quality scraper thing, which unfortunately it technically is at the moment. I'm trying to fix it, but it seems like an insane amount of work!

How did other ecommerce sites recover from Panda when it first hit? Surely most places were taking manufacturer descriptions and pasting them on their site, because what Small-Medium business would have even considered re-writing a whole catalogue back then? They surely wouldn't have had the means in terms of manpower or time to do that.

Many thanks for your help.

netmeg

1:51 pm on Feb 20, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I wouldn't add the NOFOLLOW part. Just the NOINDEX.
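So the tag to use would just be this (lower-case works the same as the upper-case version quoted earlier):

```html
<meta name="robots" content="noindex">
```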

If I de-indexed that huge number of problem pages, do you think it would affect the ones that already rank? De-indexing product pages is a fairly drastic solution, right? But I can't really see any other way of doing it, because I don't want the site to be viewed as some low-quality scraper thing, which unfortunately it technically is at the moment. I'm trying to fix it, but it seems like an insane amount of work!


It's possible, because anything is possible with Google. The thing is, you might want to use a scalpel here instead of an ax. Just put the NOINDEX on the pages that aren't getting much or any traffic at all, and continue to work on the site, and watch what happens.

FranticFish

5:34 pm on Feb 20, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If I de-indexed that huge number of problem pages, do you think it would affect the ones that already rank?

I've not seen that myself; I use 'noindex' only (without 'nofollow') which means that PageRank flows throughout the pages in the website, whether they are indexed or not. So I wouldn't expect the rankings of any pages that have original content to get worse.

what Small-Medium business would have even considered re-writing a whole catalogue back then

Think of it from the point of view of a search engine looking to produce as diverse a set of results as possible. Why would they want to include more than one copy of any page? Or even more than one page from a group that appear to be very similar?

If they make a point of only including pages that are sufficiently distinct, then that's a simple machine test that is more often than not going to provide a better experience for the user.

Imagine going to the library, browsing the shelves, and pulling out the same book all the time, no matter what the cover looked like.

rmissey

1:28 am on Feb 21, 2015 (gmt 0)

10+ Year Member



I wouldn't recommend disallowing the duplicated content.


I would. Why take the chance of looking like a scraper?


If you're talking about manufacturer descriptions that are in use all over the web, the chances of being penalized as a scraper are practically non-existent.

rish3

1:34 am on Feb 21, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Think of it from the point of view of a search engine looking to produce as diverse a set of results as possible


That makes perfect sense for duplication across different domains, which seems to be the concern here.

I do find it odd though, all the hoopla about duplicate content within a single site/domain. The SEO community certainly seems to think it's something that can hurt your rankings, to the degree that there are a ton of tools, plugins, guides, articles, etc, to "fix it".

Certainly, though, the search engines understand the concept of category pages, tag pages, excerpts, url parameters for different sort orders, etc...the kind of things that create dup content within a site. You would imagine they could do a decent job themselves sorting out which "copy" is the right one, and selecting it without assigning some sort of penalty or filter.

lucy24

2:17 am on Feb 21, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



sorting out which "copy" is the right one

I wouldn't trust the search engine to make this decision on its own. That's what the "canonical" tag is for.
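For example (hypothetical URLs, reusing the widget paths from earlier in the thread), both versions of a product page would carry the same tag in their head, pointing at the one preferred URL:

```html
<!-- on both /product-category-1/green-widget-5/ and /catalog/green-widget-5/ -->
<link rel="canonical" href="https://www.example.com/product-category-1/green-widget-5/">
```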

rish3

2:41 am on Feb 21, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I wouldn't trust the search engine to make this decision on its own. That's what the "canonical" tag is for.

They had to make that decision before the canonical tag existed :) Anyhow, that doesn't change the idea that internal duplication shouldn't result in a filter/penalty for whichever copy is the right one. People still fret about tag and category pages, for example. I feel like a search engine should be able to just "do the right thing" for those, following direction if it's there, and taking a very educated guess if not.

FranticFish

10:18 am on Feb 21, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



the chances of being penalized as a scraper are practically non-existent

But why take even a small risk (assuming you're totally right about that) when, for the sake of ONE METATAG, you can take no risk at all?

Some things you have less control over; some things you have more control over.

If it's possible to be 100% sure on the stuff that you have 100% control over, then why chance it? To prove what point exactly?

netmeg

2:00 pm on Feb 21, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I do find it odd though, all the hoopla about duplicate content within a single site/domain.


I believe the OP is talking about duplicate content outside a single domain, and thin content to boot. It's not a question of reducing the internal duplicate content, it's an attempt to jump start the balance between good content and low quality content.

Shai

2:24 pm on Feb 21, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



They had to make that decision before the canonical tag existed :)


And in many cases, they made a right hash of it. Still do in some cases where the tag is not used.

rish3

3:18 pm on Feb 21, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I believe the OP is talking about duplicate content outside a single domain, and thin content to boot.


Yeah, I specifically qualified my remark with that exact thought, in the part you chose not to quote.

dereksmalls5

9:20 am on Feb 23, 2015 (gmt 0)

10+ Year Member



I believe the OP is talking about duplicate content outside a single domain, and thin content to boot. It's not a question of reducing the internal duplicate content, it's an attempt to jump start the balance between good content and low quality content.


That's exactly right. Internal duplicate content isn't a problem.

What I overwhelmingly have are thin pages (category pages with no text at all) and low-quality pages (product pages with scraped manufacturer descriptions), and I'm wondering how to 'fix' this most effectively.

I've started re-writing things, adding content (text about the products and buyer's guides on category pages etc) but this is going to take absolutely forever!