Forum Moderators: Robert Charlton & goodroi

Website experiment... removing keyword cannibalization

         

samwest

12:51 am on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Well, it finally happened. The site hit rock bottom and has flatlined. From six figures to zero.
Over the next few days and weeks I will begin moving backwards, slowly trimming the flesh from my site in one last desperate attempt to recover traffic and relevance.
I admit, over the years the site slowly became fatter and fatter and the likelihood of keyword cannibalization as I understand it has grown.
Most webmasters are under a false assumption that their work and content is flawless. It's hard to admit when you've gone too far with articles about red widgets.
This dissection may be a useful technical autopsy of a now dead, poorly converting site.
Some may say it's foolish and that Google will never allow it to spring back to life.
Others may say it's a brilliant idea. Either way it must be done.
I'll follow up with the gruesome details as the vivisection proceeds.
Should be fun!

Robert Charlton

3:16 am on Apr 15, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Mod's note. Great topic for a discussion. Since we generally only use the title line in the Google forum and we don't use the description line, as a mod I needed to combine the two lines submitted into one, short enough to work in Google serps pages.

Since one of the questions that's going to come up in this discussion is "what is keyword cannibalization" and how is this going to help with Google, I thought for the record I should post the original title and description here. I don't want to be changing the intent of samwest's topic...

Website Decommissioning Experiment
possible cure to keyword canibalization?



I'm guessing that this is a Panda issue, and I have a sense that this is likely going to be a fruitful experiment.

samwest

3:27 am on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This is a Wordpress site, so it should benefit a lot of webmasters.
The early site was started in 1997 using Front Page 97, then converted to all Expression Web 4 when css emerged. Those pages were then merged into an existing side blog just last August and life has been miserable ever since. Thinking back, many of my ranking issues started a year or two after launching the blog in 2007 or 2008.

Before I get started, the site consists of about 80 pages and 200 posts.
Ideally, for the product being offered it should be less than 50 pages and half the posts, especially since many are outdated.

Step 1 was to make a backup for posterity, then start by clearing out all affiliate related pages and posts since there's not much point in keeping those. I had an Amazon post in page plugin, that's gone. I've made enough money over the past few years with these affiliate links to buy maybe one carry out pizza. I was always of the mindset that these affiliate relationships were hand in hand with my product, so a benefit to users. In reality they weren't much benefit to anyone. Starting from the earliest posts, I also trimmed about 20 really badly outdated and irrelevant posts. There's no argument for keeping that stuff.

This is going to be a slow process. Before each page is removed, it is set to private. This essentially removes it from the site. Later I'll trash them. It's very hard to admit your work is junk, but you have to do it. I'm no longer standing by work that was done years ago.

Following round one, a surprising amount of traffic came in shortly thereafter and even a few conversions. (I just heard another come in!)
After examination (and also just seeing RC's note) I suspect the site is suffering more from a Panda related ding than Penguin. Ridding the site of affiliate links was liberating. The gut bucket is filling!

Like any experiment, I'll be making small changes of a particular type, then observing the changes, if any.
I'll be able to spot patterns. If tonight is any indication, I am already seeing less of the zero periods. Almost none in fact. More observation is needed.

aakk9999

5:58 am on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Very interesting experiment! And thank you for letting us follow it here. Will you be going as far back as moving it to your pre-WordPress environment? If I remember correctly from your past posting history, your traffic drop started when you moved from static HTML to WordPress.

Here is some history: Anatomy of a HTML to CMS Upgrade; How Google is responding to the new site... [webmasterworld.com]

Kratos

10:36 am on Apr 15, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I don't understand the concept of keyword cannibalization, much less in websites that are clearly authored by experts and not some UGC site. Are you talking of having lots of documents optimized for the same keyword? If so, the document with the most incoming PR will rank the highest, although this will also depend on the actual keyword that is being searched. That's why having several documents targeting the same keyword isn't something necessarily bad and can actually help to strengthen one document in particular (of your choice) through internal linking. I say this because that's exactly what I do.

Can you give us a clear example of keyword cannibalization occurring on your site and with what keywords? (use "widgets" if you don't want to give out your real keywords). I think of keyword cannibalization as more of a new age SEO buzzword than anything to worry about. It's simple, if you want a particular document to rank for a particular keyword, optimize it via on page and off page methods.

samwest

12:05 pm on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@aakk9999 - My end goal is to dissect it down to a point that it starts working again. If that means to one page, so be it.
I intend to keep it on WP. I suspect my problems may have started with an overly wordy blog that was working fine at first, but then got hit by Panda and has been ever since. Now I need to step backwards to find the point at which Panda releases its bite.
We've learned a lot in the past few years and things have changed big time. Essential articles will be re-written in the method of the Reverend McClain...once again and half as long.

@Kratos: I get a lot of testimonials from customers. I also have side articles I've written over the years basically blowing my own horn about our red widgets, blue widgets etc. Those kind of articles worked pre Panda. In keyword cannibalization, as I understand it, Google finds all these pages and then has a hard time deciding which should rank the best. In the process of deciding, you lose some keyword juice and other likely ranking factors. They can then either post the best one, or toast both (or any number of multiples) as they wind up -45 or in many cases -950. It's a bad practice and one I will admit I was guilty of. This is the cleansing part of this process and a tough thing for an old webmaster.
I need the mindset of someone who started building sites post Panda and respects the dangers of Panda. I learned a lot of bad habits in the early days. You can Google (or Bing if you like) "Keyword Cannibalization" and find many useful articles.

Today's early observation shows sporadic traffic, but no long zero periods like I've been seeing.
Time to sharpen the scalpel for round two.

adder

12:33 pm on Apr 15, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



This is going to be a slow process. Before each page is removed, it is set to private. This essentially removes it from the site.

Essentially you've now created 20 crawl errors on your GWT profile, which is not going to help you. There's a much quicker and beneficial way of dealing with it. The best practice would be to 301 these 20 pages to the next best relevant article or if that's not possible, to add a 410 status to these 20 urls using htaccess.

Have you got any pages that you'd consider "thin content" - e.g. just a few sentences or a paragraph with "blow your own trumpet" stuff or too much boilerplate content?

Have you used any product descriptions that you've borrowed from manufacturers or other sources?

Content consolidation on WP is a relatively easy task. I normally use a plugin called "Redirection 2.3.14" (it's free and I'm not affiliated with it)
Basically, if you've got 5 pages that were built to target "how to choose best red widgets", you'd pick the page with the best content, move all salvageable/useful paragraphs from the remaining 4 pages to the chosen page, then send the 4 pages to 'Trash' and set up redirects so that the 4 bad pages redirect to the one good page.

That way not only you consolidate your quality content, you also reuse any potential 'link juice' that the deleted pages used to have.
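If you'd rather not rely on a plugin for this, the same consolidation can be sketched directly in .htaccess. This is only an illustration: it assumes Apache with mod_rewrite enabled, and all five slugs below are made up for the "red widgets" example, not taken from anyone's real site.

```apache
# Sketch only: the four retired slugs and the target slug are hypothetical.
# Assumes Apache with mod_rewrite; place above the "# BEGIN WordPress" block.
<IfModule mod_rewrite.c>
RewriteEngine On
# Permanently redirect each retired page to the one consolidated page.
RewriteRule ^(red-widget-tips|best-red-widgets|choosing-red-widgets|red-widgets-guide)/?$ /how-to-choose-best-red-widgets/ [R=301,L]
</IfModule>
```

Keeping the rule above the WordPress block matters because WordPress's own catch-all rule would otherwise hand the request to index.php first. Check the result with a header checker before trusting it.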

Also make sure there's only one url to each page. I still keep seeing WP installations with 4 different ways of accessing the content:
http://www.example.com/red-widgets/
http://example.com/red-widgets/
http://www.example.com/?p=123
http://example.com/?p=123
In this scenario, it's no longer 80 pages and 200 posts but 320 pages and 800 posts. And then add wrongly formatted tag and category archives to the mix and you've got one big nightmare of a site.
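For the www versus non-www half of that problem, a minimal sketch of a host-level redirect follows. It assumes Apache with mod_rewrite and that www.example.com is the version you want to keep; both are assumptions for illustration. As far as I know WordPress normally redirects the ?p=123 form to the pretty permalink by itself once permalinks are enabled, so verify what your install already does before adding rules for that.

```apache
# Sketch only: assumes Apache with mod_rewrite enabled and that
# www.example.com is the host you want to keep.
# Place this above the "# BEGIN WordPress" block in .htaccess.
<IfModule mod_rewrite.c>
RewriteEngine On
# Send the bare host to the www host with a permanent redirect.
# The query string, if any, is carried over automatically.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
</IfModule>
```

The Site Address setting in WordPress should name the same host, otherwise the two redirects can fight each other.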

samwest

12:55 pm on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@adder - Yes, I have been using that exact plugin for quite some time. I am redirecting exactly as you prescribe.

What is really great and the impetus for this epiphany is that there are now many useful tools for analyzing the Panda penalty effects on your site. I currently rank about a 37.1% probability of being dinged. This is based on my GA data.
That's a YELLOW, which is not bad, but not good. Another resource was able to actually pinpoint my date of demise which was way back in 2011, on the first Panda "Farmer" iteration. These new resources make it so much easier to get a grip on what happened and when. I have my work cut out for me!

samwest

1:29 pm on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@adder - I know there is a plugin for 410, but it's very old. Can you enlighten me on how you would properly 410 a number of pages in .htaccess with WordPress installed? I think it's also possible to use the Redirection plugin to do this by creating a new 404 rule and simply changing it to 410. The plugin still records those hits as 404, but tells the browser 410.

samwest

2:22 pm on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pretty incredible results within 24 hours. One long lost search term that I've been ADDING content to recover has gone from #1 on page 2, back to #1 on page 1....its old home. Amazing what trimming the fat can do. May be a fluke, but proceeding with cautious optimism.

adder

5:56 pm on Apr 15, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I am redirecting exactly as you prescribe

That's great. Sorry, must have misunderstood something in your OP :)

how you would properly 410 a number of pages in .htaccess


If you've got an SEO plugin installed, you can simply go to SEO -> Edit Files
In your .htaccess field you'd add this within the mod_rewrite block (anywhere after this line):
RewriteEngine On

Redirect 410 /why-i-love-red-widgets/
Redirect 410 /how-to-look-after-red-widgets/
ErrorDocument 410 default


If you don't use an SEO plugin, you'd have to (carefully) edit the .htaccess file on your server

The downside is that I can almost guarantee a custom 410 error page won't work with your installation and I'll be surprised if it does. That's why I included the ErrorDocument directive. It won't work without it. Yeah, and you'd need to send the post to Trash first for the 410 to work.

In the ideal situation, you'd have something like this:
Redirect 410 /why-i-love-red-widgets/
Redirect 410 /how-to-look-after-red-widgets/
ErrorDocument 410 /custom-error-page.html


Otherwise the visitors will be hit with an unsightly error page which certainly won't amuse them. That's why it's best to use 301 whenever possible and only use 410 where you can't find a relevant/related page to 301-redirect to.

P.S. Unfortunately, the Redirection plugin doesn't work with 410s.

Martin Ice Web

6:51 pm on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Samwest, the problem will be knowing whether it is really Panda that makes your site suffer. As far as we know, the latest Panda run was probably last October, according to John Mueller. If Panda isn't updated, how will you measure the effect of removing parts of your site?

lucy24

8:11 pm on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In your .htaccess field you'd add this within the mod_rewrite block (anywhere after this line):
RewriteEngine On

Redirect 410 /why-i-love-red-widgets/
Redirect 410 /how-to-look-after-red-widgets/
ErrorDocument 410 default

Nooooooooo.

mod_rewrite-- the basis of WP and most other CMS-- will never see these rules, because it executes before mod_alias (Redirect by that name). The ErrorDocument directive is meaningless, because "default" is what the server would do anyway. And a generic server-generated message definitely isn't what you want your human visitors to see. You'll need to hard-code a custom 410 page.
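A mod_rewrite version of those two rules might look like the sketch below. It assumes Apache with mod_rewrite, and /410.html is a hypothetical static file you would have to create yourself; the two slugs are the example ones already used in this thread.

```apache
# Sketch only: assumes Apache with mod_rewrite; /410.html is a
# hypothetical static page that you create and upload yourself.
ErrorDocument 410 /410.html

<IfModule mod_rewrite.c>
RewriteEngine On
# [G] answers "410 Gone" and stops rewriting for this request.
# These rules must sit above the "# BEGIN WordPress" block so they
# run before WordPress's catch-all rule sends the request to index.php.
RewriteRule ^why-i-love-red-widgets/?$ - [G]
RewriteRule ^how-to-look-after-red-widgets/?$ - [G]
</IfModule>
```

Because the 410 page is a real file on disk, the usual WordPress rule set leaves it alone and the server can deliver it as the error body.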

What does "private" mean? To me it would imply two things: Remove all links pointing to the page, so you can only reach the URL if you already know about it, and add a noindex meta so no new visitors will come in. But if "private" results in a crawl error (implying a 404) then I guess that isn't what it means.

Kratos

8:25 pm on Apr 15, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I see what you mean now. I don't believe in keyword cannibalization though, but I'm not proactively looking to have all my titles targeting the same keyword. I do however intentionally seek very similar titles.

Also about the Panda filter I think there's way too much misinformation going on about it. The fact that you have 12 articles targeting the keyword "red widgets" need not be a sign of low quality content and thus fall under the Panda filter. Of course if you're just building doorway pages to intentionally mislead users then that's something else (and falls under the Panda spectrum), but if you're an expert in a niche and just love writing about "red widgets" I sincerely don't think you have absolutely anything to worry and you can actually take advantage of the situation to power up a chosen document to rank, that's what I do at least.

adder

8:27 pm on Apr 15, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



@lucy24, surprisingly, I've been using this on at least 3 different WP installations and it has worked for years (and i can prove it if necessary). Luck maybe? Or, shall I say triple-luck? ;) ErrorDocument depends on the server setup. I've found that I've had to use it every time, otherwise, as you mentioned, WP ignores the rules.

What does "private" mean

It's an option on WP editor. You can set a post to 'private' and it will return status 404 unless you're logged in as admin.

samwest

8:54 pm on Apr 15, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've chopped about all I need to right now. Don't want to get too crazy.
To recap, I removed:
1. Any old posts that are not appearing by title in the top 3 pages.
I'll whittle those down later if needed.
2. Any thin articles of 100 words or less, there were only a couple and they were of zero value.
3. All affiliate links and their related posts that have not paid off. I kept one that pays well and is highly relevant.
4. I shut down my internal affiliate program. Nobody was using it.
301'd what I could, will follow up with WMT to catch anything I missed.

So, I refreshed my site map and resubmitted. Almost immediately traffic hit the bricks. Almost seems like they said "whoa, somebody is trying to fix that site, let's sandbox it for a bit"...may just be coincidence, but it has dropped from pretty robust traffic today to zero for the past 30 minutes. That's an obvious OFF period for me. They sure like their games.

tangor

2:17 am on Apr 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



30 minutes? Sorry... I like 30 hours better as a metric. Give it time.

(mumbling about instant gratification and expectations...)

Robert Charlton

3:09 am on Apr 16, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I'd mumble about the speed with which this was done. I think I'd have spent a lot more time thinking about it, figuring out optimal ways to remove excess pages, maybe trying a few test pages, etc. I think I'd even check into what kinds of multiple pages Google doesn't like... maybe discuss it here. As tedster used to say frequently, "measure twice, cut once."

samwest - With regard to any ranking drops, whatever you do, don't react to them and make changes in reaction. See my recent comments on this thread [webmasterworld.com...] about Google's rank modifying patent for spam detection.

samwest

4:14 am on Apr 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Believe me folks, those pages HAD to go. They weren't ranking and much of it was stale or affiliate related. Those pages were only poisoning the site.
As far as the time I was reporting, I simply meant the site had gone dead shortly after I resubmitted the sitemap. It was likely a coincidence anyway because it happens every day. I certainly don't expect instant gratification. I've been at this for a while, I know how long updates take, if they EVER take. Although, a few months back I picked up malware in a WP plugin, G had the big red malware warning on my site in no time, but when I fixed it shortly thereafter, the malware warning cleared in a few hours, so we know they can and do react faster than years ago.

I'm happy with the cuts so far. I plan on waiting for the next crawl error report and addressing anything I might have missed in redirection. As a woodworker I always make the same recommendation RC, but this time I measured thousands of times and cut once.
I am also aware of that patented delay after changes. I'm not holding my breath that it will EVER improve. Whatever happens, happens....it's rock bottom, there's nothing to lose.

Kratos

10:59 am on Apr 16, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



I still don't agree with your solution, and I don't agree that it's as simple as labeling your site as being under the Panda filter. I don't agree with keyword cannibalization either and I think it's a buzzword and not anything significant a webmaster should worry about (unless one is trying to manipulate rankings and still it's very easy to take advantage of a subset of documents optimized for the same keyword).

One thing, since you mention the 'sandbox': there is absolutely no sandbox effect or thing going on with Google. Your site is either filtered or it isn't, and the thing is that there isn't just one page quality algorithm, there are many that can hit you at any given time. It's just that Panda is the most complete one. Google does look at a webmaster's reaction when they filter a site under one of its spam filters, this in reference to what Robert mentioned. They have patented it and it's a really interesting read. It's why people report massive drops in SERPs when they build dubious links only to regain the SERP positions or come back stronger (this is wrongly labeled by amateurs in spam forums as the Google dance). It's also why if you go to these amateur forums you will see the terms sandbox, google dance and other nonsense being thrown around like it's actually real.

We webmasters tend to think that the high rankings we once had are the rankings we DESERVE. Google could have easily decided there and then that your site was over inflated in its SERP position and that your current ranking is the one your site actually deserves. When this happens, there's no getting out of it with simple changes like you have done. I tend to think that most webmasters saying they're under Panda have actually had their sites re-assessed by Google and they've been ranked according to what their site really deserves and not because Google has decided your site is spammy. Then you have your competitors and new entrants to the SERPs and it becomes a mess to know what is going on with your site. Not to mention the false positives that occur when an automated filter is applied.

I think of my sites as great sites, but I can easily see an outsider telling me they are spammy or garbage. In fact I try to put that hat on myself when truly evaluating a site of mine, because the same deserved rankings issue is one I have encountered with a couple of our sites.

I have no doubts that you want to do whatever possible to convince Google your site deserves higher rankings but it sometimes really helps to get an outsider to tell you if your site is really as great as you think it is. Shame we can't evaluate others' sites in this forum but do sure keep us updated on how it goes and if you need any help from me let me know.

samwest

12:11 pm on Apr 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm not trying to convince Google of anything. This is simply an experiment on a dead site. It's an autopsy, a post mortem. I don't agree with much of what you said. Deserve? You gotta be kidding me. Put down the Kool-Aid. I never expected the site to go from within the top 50k to zero. Nobody deserves that unless they are clearly harmful in some way. "Deserve"...I gotta laugh.

Kratos

12:22 pm on Apr 16, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Well, it isn't my fault Google has decided your site deserves the rankings you have. And I'm going to take a wild guess that they're spot on in their decision to nuke your site. Ergo you certainly deserve your rankings. You can disagree all you want but Google isn't the number 1 search engine (by a huge margin) because they make mistakes with sites that have thin content (like you've confirmed). You are just aimlessly throwing out buzzwords to try to reason why your site has been nuked, and judging by your last reply I can see it really is not worth it any more to try and offer opinions. You've already convinced yourself of what you have, despite not having a clue what has hit you.

You're saying this and that and all you do in this forum is constantly moan about how bad Google is (see the April Google update thread), it gets tiresome after a while. If you ain't happy with them, block them on robots.txt, stop complaining about them on each and every thread and enjoy competing against spammers over at Bing and Yahoo, I'm sure you will enjoy that ride even more.

Anyway, good luck.

samwest

12:29 pm on Apr 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Now you're just being obtuse. You know nothing about the site. I know exactly what hit me. Let's agree to disagree or please shill elsewhere.

lucy24

6:30 pm on Apr 16, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



those pages HAD to go

Does there exist hard evidence that individual bad pages can drag down a whole site? We're not talking about pages filled with spammy links and over-the-top black-hatness, right? Just some pages that were less appealing than others. Every site's got those.

frankleeceo

10:28 pm on Apr 16, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



Isn't this panda by definition? Too high percentage of individual bad pages can drag down the whole site. I thought we are at a point where no evidence is needed anymore.

RedBar

12:18 am on Apr 17, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Too high percentage of individual bad pages can drag down the whole site.


What precisely, according to the Gorg, are "bad" pages?

I have low visited pages because 99.99% of people cannot afford the products on those pages, me included, however that doesn't make them "bad" pages, simply much less frequented.

The information supplied on those "bad" pages may be similar to the "good" pages ... surely G is not that dumb to consider low-trafficked pages are inferior to higher trafficked pages without comprehending "WHY"?

Oh, I forgot, we're dealing with Gorg children!