Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

Fixing a Supplemental Results - duplicate URL problem

         

bo0oost

2:58 am on May 3, 2007 (gmt 0)

10+ Year Member



I've been reading g1smd's information on supplementals, and I had a question.

Almost all of my 90,000 pages were deemed supplemental after a duplicate content issue with a script that Google somehow found out about.

I tried changing the URLs and 301-redirecting the supplemental pages to the new URLs. After about two weeks, the new URLs all became supplemental as well.

Today, desperate to get my ecommerce site back to "normal", I completely re-did all of my URLs, and actually deleted the old supplemental pages that used to 301 as well as the pages they 301'd to. So now all supplemental pages in Google return a custom generic 404 on my site. All products now have new URLs and directory structures.

Is this the right way to get my site back to normal?

[edited by: tedster at 6:35 am (utc) on May 4, 2007]

icedowl

10:23 pm on May 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry for all the questions, but I'm sure there are others on here who would like to know as well. Thanks for all your help, guys!

Yes, I am keenly interested. I have a very similar situation with one site. For years it had been entirely static HTML and ranked very well. Then I changed it to a CMS with all dynamic pages and most of the site went supplemental a bit over a year ago.

All pages of the site were rebuilt back to static HTML in mid-February, and it is still suffering with 95% of its pages in the supplementals. The dynamic pages are gone, blocked in robots.txt, and return 404s (I don't know how to make them return a 410). Also, they've been removed with the new tool in Webmaster Tools, which works like a charm.
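(For reference: if the site runs on Apache, mod_alias can send a "410 Gone" for the retired dynamic paths. The /olddir/ path below is only a placeholder, not the actual site's directory:)

```apache
# Return "410 Gone" for anything under the old dynamic path
# (replace /olddir/ with the real path to the retired pages)
RedirectMatch gone ^/olddir/.*$
```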

Most traffic is now coming from bookmarks and Yahoo, where searches for the site's content are frequently returned in positions 1-3, or at least within the first 3 pages. Searches for the content in Google will on occasion return the supplemental pages within the first few results, but that's a rare occurrence. Traffic from Yahoo has now grown to 30+ times the traffic from Google.

g1smd

10:28 pm on May 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The indexed (dynamic) URLs should really return a 301 redirect, so that you still get to keep any search engine traffic that is clicking those "wrong URLs".
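As a sketch of that redirect (Apache mod_rewrite shown for illustration; the thread mentions ISAPI Rewrite, whose syntax is similar, and both URL names here are hypothetical):

```apache
# 301-redirect one old dynamic URL to its static replacement.
# product.asp?id=42 and /widgets/blue-widget.html are examples only.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^id=42$
RewriteRule ^product\.asp$ /widgets/blue-widget.html? [R=301,L]
```

The trailing `?` in the substitution drops the old query string so it isn't appended to the new URL.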

icedowl

10:34 pm on May 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The indexed (dynamic) URLs should really return a 301 redirect, so that you still get to keep any search engine traffic that is clicking those "wrong URLs".

If I could do that I would, but they're all gone, and I'd have no way to tell which of the dynamic URLs belong to which of the static URLs even if they weren't. There are hundreds of them. My 404 page invites the user to enter the current site, and states that the content they were looking for is indeed still within. About half do enter the current site.

bo0oost

11:11 pm on May 6, 2007 (gmt 0)

10+ Year Member



>>> Yes. You should fix the Duplicate Content issues and then wait for Google to reindex things. <<<

So given that I have re-done all of my URLs and 404'd the old URLs, what do you suggest I do?

Just keep going with the new URLs, focus on increasing each URL's unique content, and so forth?

Or, since Google has only picked up about 20 new links, do you think it's still early enough for me to go back, redo it *once again* with the old URLs, and just make sure there is no more duplicate content?

Appreciate the help!

g1smd

11:16 pm on May 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



What sort of duplicate content issues were you fixing?

- multiple domains
- www and non-www
- multiple dynamic parameters
- session IDs in URLs
- thin content
- title tag and/or meta description duplication
- boilerplate text template problems

or something else?

bo0oost

11:25 pm on May 6, 2007 (gmt 0)

10+ Year Member



My dynamic issue was this.

My urls were in this structure due to ISAPI Rewrite:

[site...]

Google happened to find out what dynamic URL was creating it, which was:

[site...]

Once Google found out about this dynamic page, my whole site went supplemental, except for some pages that were not generated by these scripts.

So I used robots.txt to block access to directoryA, which is the directory that built everything.
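For anyone following along, a robots.txt rule like the one described would look something like this (directoryA being the directory named above):

```
User-agent: *
Disallow: /directoryA/
```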

I then started new URLs and 301-redirected the supplemental URLs to them. All of those new URLs still turned supplemental. Arg! I figured it was because a 301 passes everything Google knew about the old page on to the new page.

Due to this, I completely deleted all the scripts, and all of the old and new pages, and just started fresh.

What do you think?

bo0oost

11:28 pm on May 6, 2007 (gmt 0)

10+ Year Member



One more thing.

I noticed yesterday that Google had added new pages into the site: search results.

Two of those pages now do not exist in the results. I'm afraid when they come back, they will be marked as supplemental.

Is my only choice to start a whole new site? Arg! I just want to crawl under a rock!

Keniki

11:44 pm on May 6, 2007 (gmt 0)



Here's my opinion on Supplemental Results...

Definite Causes Of Supplemental Results:

Canonical issues
Duplicate Content
Orphaned Content

Theoretical Causes Of Supplemental:

Meta Tag
Inbound link text
Pagerank

[edited by: Keniki at 11:46 pm (utc) on May 6, 2007]

jdMorgan

11:45 pm on May 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> With me changing the URLs, is this hurting my credibility with Google? <<

You are signalling that you have a new site, and so you reset some of your age factors back to zero. That likely isn't a good thing. Cool URIs don't change.


This is starting to remind me of the old saying, "If you find yourself in a hole, stop digging!"

Fix canonical domain and URL issues, replace dynamic URLs having more than two query parameters with static URLs, make sure spiders don't get assigned session IDs, redirect the old URLs to the new ones, and then stop changing your site's architecture.
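As a sketch of the canonical-domain fix Jim mentions, assuming an Apache server and a placeholder example.com domain:

```apache
# Canonical-host fix: 301 any non-www request to the www hostname
# (example.com is a placeholder; substitute the real domain)
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```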

Remove your Google Toolbar so you won't see the infrequently-updated-and-often-wrong PageRank Display, ignore all Google Update threads, and go back to writing unique, authoritative, truly-useful articles for your site, making sure that each page on your site has an accurate and unique <title> and <meta name="description" content="great description"> on it. Then chase some relevant links from supplier sites, customer sites, discussion boards related to your products, etc.
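For a concrete picture of the unique-per-page head tags, here is a hypothetical product page (names and wording invented for illustration):

```html
<!-- Each page gets its own title and description; never share these -->
<head>
  <title>Blue Widget, 3-inch | Widgets Direct</title>
  <meta name="description"
        content="Our 3-inch blue widget ships same day and fits all
                 standard widget mounts. Specs, photos, and reviews.">
</head>
```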

Supplementals are still in the index, so don't worry so much about them. As stated several times above, they still rank for some searches. Just let your site settle, age, get its TrustRank back, and work on other things.

If your traffic is tanked, then sign up for Adwords to get some traffic while your pages recover.

A bit of cool, calculating, emotional distance is required in this Web-world, and will save you some ulcers, too.

Jim

Keniki

12:01 am on May 7, 2007 (gmt 0)



get its TrustRank back

TrustRank is the biggest load of bull#*$! I have ever heard of. Trust can be bought, and taken away by scrapers. You think people haven't noticed that non-profit sites like government or educational sites provide trust? Yes they have. I have people phoning me all day long selling links off these sites, from local government to schools and universities, to the Guardian and the BBC.

bo0oost

12:19 am on May 7, 2007 (gmt 0)

10+ Year Member



>>> making sure that each page on your site has an accurate and unique <title> and <meta name="description" content="great description"> <<<

If my Meta Description is dynamically taken from the Product Description, is this ok?

Keniki

12:24 am on May 7, 2007 (gmt 0)



yes

bo0oost

3:13 am on May 7, 2007 (gmt 0)

10+ Year Member



This would not be considered duplicate content, right?

www.site.com/myproduct.asp

www.site.com/myproduct.asp#details

Keniki

3:19 am on May 7, 2007 (gmt 0)



I can only guess with .asp, but I would say search engines ignore anything after the #.

g1smd

4:40 pm on May 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> www.site.com/myproduct.asp

>> www.site.com/myproduct.asp#details

Nope, that's not duplicate content. Everything after the # is ignored.
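The reason: the fragment is purely client-side. A browser never sends it to the server, so a crawler only ever requests the part before the #. A quick Python illustration of the split:

```python
from urllib.parse import urldefrag

# The fragment (#details) is a client-side anchor; servers and
# crawlers see only the URL portion before the '#'.
url, fragment = urldefrag("http://www.site.com/myproduct.asp#details")
print(url)       # http://www.site.com/myproduct.asp
print(fragment)  # details
```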

Read the multi-page threads on duplicate content that went on all through 2006. There are hundreds of posts with a lot of detail.

Mbwto

7:51 pm on May 7, 2007 (gmt 0)

10+ Year Member



I was working on a site that had almost all of its pages in Supplemental. All the title, description, and keywords tags were identical. I gave each page unique tags, obviously made relevant to the page. I added more content to the pages because it was sparse. I also started creating a directory structure to help better theme the site. I 301-redirected all pages that needed it, applied canonicalization, and the pages slowly but surely started coming out of supplemental. I think there's one left. When I started this project, it had a PR of 0; now, after the recent update, it is PR2. Link building helped as well. Supplemental isn't death, it's Google's way of saying: get your #*$! together!

Oh, a few other things... I created a sitemap (PHP and XML), applied a few other best practices, and it seems to be working out. The site ranks pretty well for its chosen keywords, where it never ranked before.

bo0oost

1:31 am on May 9, 2007 (gmt 0)

10+ Year Member



Mbwto, do you recall about how long until the pages started to come out of supplemental?

bo0oost

5:40 am on May 9, 2007 (gmt 0)

10+ Year Member



I also want to make a quick note.

Google discovered my duplicate pages because I had "Google Analytics" running on my site for a short period of time. During this time, I was testing the build script. It was *never* live, and was *never* exposed to anything but our test area.

I warn people: be careful which URLs you browse on your pages while Google Analytics is running!

bo0oost

12:05 am on May 12, 2007 (gmt 0)

10+ Year Member



Update:

Google has added back about 13 pages' worth of links, and traffic went back up. Today they trimmed 7 of those pages, so now I have 6 pages' worth of non-supplemental links, and traffic went back down.

They had a total of 72,000 pages indexed, and now they have 50,000.

I wonder if they are shuffling my pages around right now, and if the pages they added will come back into the index. Google visits my site every day.

One page that was added went into supplementals, so I'm wondering if low PR is hurting it.

Any input?

g1smd

12:14 am on May 12, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Raw page counts can be misleading.

I am looking at a site that shows 34 000 pages in one datacentre and 43 000 in another datacentre at the same time.

The numbers also fluctuate up and down by a few thousand over a period of a few weeks.

bo0oost

2:07 am on May 12, 2007 (gmt 0)

10+ Year Member



Weird, I just did a site search and they just added a few pages of new links. Some of the ones that were removed are now back, with fresh caches from 5/11.

bo0oost

10:30 pm on May 15, 2007 (gmt 0)

10+ Year Member



Update:

My site is slowly making a comeback.

I did notice a bunch of my URLs were added back into the index, and they were added as supplemental and also omitted. I searched and found this:

"If there is a link to click to see further "omitted results" then it is a warning about "duplicate content", often just title tags and/or meta descriptions that are the same or too similar. Make them all different.

Make sure that each piece of content on your site has only one canonical URL that is used to access it. Get all the others out of the index by using the meta robots noindex tag (preferred) on the other URL versions, or exclude them using robots.txt instead. Matt Cutts has mentioned it several times recently too. "

Every title and meta description is unique to its page. The only thing I can think of is that the meta description is the first 250 characters of the item description. I thought this did not matter.
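One caveat with that approach: cutting at a fixed character count can chop words in half and can produce near-identical descriptions for products that share boilerplate openings. A small Python sketch of a safer truncation (function name and sample text are hypothetical, not from the site):

```python
import html

def meta_description(product_desc: str, limit: int = 250) -> str:
    """Truncate a product description at a word boundary and
    HTML-escape it for a meta description attribute.
    Illustrative sketch only."""
    text = " ".join(product_desc.split())  # collapse runs of whitespace
    if len(text) > limit:
        # cut at the last full word inside the limit
        text = text[:limit].rsplit(" ", 1)[0] + "..."
    return html.escape(text, quote=True)

print(meta_description("A sturdy blue widget " * 20))
```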

All of my duplicate content issues have been fixed, and all 301s are in place.

Any ideas?

Thanks guys!

g1smd

11:39 pm on May 15, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I recognise that quote!

It takes quite a while for Google to sort those issues out once you fix them, so monitor progress over at least the next few weeks or more.

bo0oost

11:53 pm on May 15, 2007 (gmt 0)

10+ Year Member



g1smd,

Thank you for your input, it's very much appreciated. I will continue to wait.

Google puts them in supplemental because of the previous problem, right? Once they see the old pages are no longer returning a 200 OK status, they re-evaluate, correct?

Typically this process can take weeks?

EDIT: As you know, I changed all URLs so the pages are actual pages and not rewrites, just in case, as I'm paranoid of this happening again. The *new* URLs that I created were put into supplemental and omitted, so they are being referenced against Google's old "cache", right?

g1smd

12:19 am on May 16, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



URLs that were once Duplicate Content can get marked as Supplemental Results before, or after, the redirect is applied. Once the redirect is applied, the redirected URLs will get marked as Supplemental Results if that has not already happened, and those redirected URLs can continue to appear as Supplemental Results for a year or more!

Once that happens, there is nothing more you can do. Your measure of success is in seeing what happens to all of your URLs that return "200 OK" status.

Your redirected URLs, where they appear in the SERPs, will continue to feed visitors to your site. The redirect will deliver that visitor to the correct URL for that content.

Job Done.

bo0oost

12:31 am on May 16, 2007 (gmt 0)

10+ Year Member



g1smd, what I'm referring to is new URLs that were never duplicate content but are marked supplemental. The new URLs that I created are redirected to from the old supplemental URLs. As Google indexes them, it instantly marks them as supplemental and puts them into omitted results. My question is: is this normal, given that Google still has a memory of the duplicated content?

I hope I'm making sense :)

g1smd

1:09 am on May 16, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It is fairly common for that to happen, and it will take a few weeks to see what is really going on.

bo0oost

1:12 am on May 16, 2007 (gmt 0)

10+ Year Member



OK, thanks, just wanted to be sure.

So basically within a few weeks, they either show up in normal results, or stick around in supplemental?

I scoured my site for days on end, and sleepless nights to make sure this duplicate content never happens again!

This 58-message thread spans 2 pages.