Forum Moderators: Robert Charlton & goodroi
Almost all of my 90,000 pages were deemed supplemental after a duplicate content issue with a script that Google somehow found out about.
I tried changing the URLs and 301-redirecting the supplemental pages to the new URLs. After about 2 weeks, the new URLs all became supplemental as well.
Today, desperate to get my ecommerce site back to "normal", I completely redid all of my URLs, and actually deleted the old supplemental pages that used to 301, as well as the pages they 301'd to. So now all supplemental pages in Google return a custom generic 404 on my site. All products now have new URLs and directory structures.
Is this the right way to get my site back to normal?
[edited by: tedster at 6:35 am (utc) on May 4, 2007]
Sorry for all the questions, but I'm sure there are others on here who would like to know as well. Thanks for all your help, guys!
Yes, I am keenly interested. I have a very similar situation with one site. For years it had been entirely static HTML and ranked very well. Then I changed it to a CMS with all dynamic pages and most of the site went supplemental a bit over a year ago.
All pages of the site were rebuilt back to static HTML in mid-February, and it is still suffering with 95% of the pages in the supplementals. The dynamic pages are gone, blocked in robots.txt, and return 404s (I don't know how to make them return a 410). Also, they've been removed with the new tool in Webmaster Tools, which works like a charm.
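For anyone else wondering how to return a 410 rather than a 404: on an Apache server (an assumption here; the poster's server isn't stated, and ISAPI Rewrite users would need the equivalent directive), a couple of lines of .htaccess will do it. The `/old-dynamic/` path below is a hypothetical example, not the poster's real directory:

```apache
# .htaccess sketch — assumes Apache with mod_rewrite enabled.
# The [G] ("Gone") flag makes matching URLs return HTTP 410
# instead of 404, telling crawlers the pages are gone for good.
RewriteEngine On
RewriteRule ^old-dynamic/ - [G,L]
```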
Most traffic is now coming from bookmarks and Yahoo, where searches for the site's content are frequently returned in positions 1-3, or at least within the first 3 pages. Searches for the content in Google will on occasion return the supplemental pages within the first few results, but that's a rare occurrence. Traffic from Yahoo has now grown to 30+ times the traffic from Google.
The indexed (dynamic) URLs should really return a 301 redirect, so that you still get to keep any search engine traffic that is clicking those "wrong URLs".
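As a sketch of what that 301 could look like, assuming an Apache-style rewrite setup (ISAPI Rewrite uses very similar syntax) and hypothetical example paths:

```apache
# .htaccess sketch — Apache assumed; the paths and query string
# are made-up examples, not the poster's real URLs.
# Permanently redirect one old dynamic URL to its new static page.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^id=42$
RewriteRule ^directoryA/product\.asp$ /widgets/blue-widget.html? [R=301,L]
```

The trailing `?` in the target drops the old query string so the redirect lands on a clean URL.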
If I could do that I would, but they're all gone, and I've no way to say which of the dynamic URLs belong to which of the static URLs even if they weren't gone. There are hundreds of them. My 404 page invites the user to enter the current site, and states that the content they were looking for is indeed still within. About half do enter the current site.
So given that I have redone all of my URLs and 404'd the old URLs, what do you suggest I do?
Just keep going with the new URLs, and focus on increasing each URLs unique content, and so forth?
Or, since Google only picked up about 20 new links, do you think it's still early enough for me to go back, redo it *once again* with the old URLs, and just make sure there is no more duplicate content issue?
Appreciate the help!
My URLs were in this structure due to ISAPI Rewrite:
[site...]
Google happened to find out which dynamic URL was creating it, which was:
[site...]
Once Google found out about this dynamic page, my whole site went into supplemental, except for some pages that were not generated by these scripts.
So I used robots.txt to block crawlers from directoryA, which is the directory that built everything.
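For reference, blocking a directory that way takes just two lines in robots.txt at the site root ("directoryA" here is simply the name used in the post above):

```text
# robots.txt — blocks all compliant crawlers from the script directory.
User-agent: *
Disallow: /directoryA/
```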
I then started new URLs and 301-redirected the supplemental URLs to the new ones. All of those new URLs still turned supplemental. Argh! I figured it was due to the 301 passing along everything Google knew about the old page to the new one.
Due to this, I completely deleted all the scripts, and all of the old and new pages, and just started fresh.
What do you think?
Definite Causes Of Supplemental Results:
Canonical issues
Duplicate Content
Orphaned Content
Theoretical Causes Of Supplemental:
Meta Tag
Inbound link text
Pagerank
[edited by: Keniki at 11:46 pm (utc) on May 6, 2007]
>> With me changing the URLs, is this hurting my credibility with Google? <<
You are signalling that you have a new site, and so you reset some of your age factors back to zero. That likely isn't a good thing. Cool URIs don't change.
Fix canonical domain and URL issues, replace dynamic URLs having more than two query parameters with static URLs, make sure spiders don't get assigned session IDs, redirect the old URLs to the new ones, and then stop changing your site's architecture.
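A common canonical-domain fix, sketched for Apache with a made-up example domain (your server and hostname will differ), is forcing a single hostname so every page has exactly one URL:

```apache
# .htaccess sketch — assumes Apache; example.com is a placeholder.
# 301 all non-www requests to the www hostname so Google sees
# only one canonical version of each page.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```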
Remove your Google Toolbar so you won't see the infrequently-updated-and-often-wrong PageRank Display, ignore all Google Update threads, and go back to writing unique, authoritative, truly-useful articles for your site, making sure that each page on your site has an accurate and unique <title> and <meta name="description" content="great description"> on it. Then chase some relevant links from supplier sites, customer sites, discussion boards related to your products, etc.
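To audit whether every page really does have a unique title and meta description, a quick script along these lines can help. This is a sketch: the regex-based HTML parsing is deliberately minimal, and the page dictionary is whatever you load from your own site's files.

```python
import re
from collections import defaultdict

def find_duplicates(pages):
    """Report titles and meta descriptions shared by more than one page.

    `pages` maps URL -> raw HTML. Returns two dicts: duplicated
    titles and duplicated descriptions, each mapping the shared
    text to the list of URLs that use it.
    """
    titles = defaultdict(list)
    descriptions = defaultdict(list)
    for url, html in pages.items():
        m = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
        if m:
            titles[m.group(1).strip()].append(url)
        m = re.search(
            r'<meta\s+name="description"\s+content="(.*?)"', html, re.I | re.S
        )
        if m:
            descriptions[m.group(1).strip()].append(url)
    dup_titles = {t: u for t, u in titles.items() if len(u) > 1}
    dup_descs = {d: u for d, u in descriptions.items() if len(u) > 1}
    return dup_titles, dup_descs
```

Anything the script reports is a page pair worth rewriting before worrying about anything fancier.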
Supplementals are still in the index, so don't worry so much about them. As stated several times above, they still rank for some searches. Just let your site settle, age, get its TrustRank back, and work on other things.
If your traffic is tanked, then sign up for Adwords to get some traffic while your pages recover.
A bit of cool, calculating, emotional distance is required in this Web-world, and will save you some ulcers, too.
Jim
get its TrustRank back
TrustRank is the biggest load of bull#*$! I have ever heard of. Trust can be bought, and taken away by scrapers. You think people haven't noticed that non-profit sites like government or educational sites provide trust? Yes they have. I have people phoning me all day long selling links off these sites, from local government to schools and universities, to the Guardian, to the BBC.
The way google discovered my duplicate pages was because I had "Google Analytics" running on my site for a short period of time. During this time, I was testing the build script. It was *never* live, and was *never* exposed to anything but our test area.
I warn people to be careful what URLs you browse on your pages while Google Analytics is running!
Google has added back about 13 pages' worth of links, and traffic went back up. Today they trimmed 7 of those pages, so now I have 6 pages' worth of non-supplemental links, and traffic went back down.
They had a total of 72,000 pages indexed, and now they have 50,000.
I wonder if they are shuffling my pages around right now, and if the pages they added will come back into the index. Google visits my site every day.
One page that was added went into supplementals, so I'm wondering if low PR is hurting it.
Any input?
My site is slowly making a comeback.
I did notice a bunch of my URLs were added back into the index, and they were added as supplemental, and also omitted. I searched and found this:
"If there is a link to click to see further "omitted results" then it is a warning about "duplicate content", often just title tags and/or meta descriptions that are the same or too similar. Make them all different.
Make sure that each piece of content on your site has only one canonical URL that is used to access it. Get all the others out of the index by using the meta robots noindex tag (preferred) on the other URL versions, or exclude them using robots.txt instead. Matt Cutts has mentioned it several times recently too. "
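For anyone unsure what the meta robots noindex tag mentioned in that quote looks like, it's a single line in the page's head. This example assumes you still want link value to flow through the duplicate version, hence "follow":

```html
<!-- Placed in the <head> of each duplicate URL variant.
     The page can still be crawled, but stays out of the index. -->
<meta name="robots" content="noindex, follow">
```

Unlike a robots.txt block, this lets Google actually fetch the page and see the noindex instruction, which is why it's described above as the preferred option.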
Every title and meta description is unique to its page. The only thing I can think of is that the meta description is the first 250 characters of the item description. I thought this did not matter.
All of my duplicate content issues have been fixed, and all 301s are in place.
Any ideas?
Thanks guys!
Thank you for your input, it's very much appreciated. I will continue to wait.
Google puts them as supplemental because of the previous problem, right? Once they deem the pages are not returning a 200 OK status, they re-evaluate, correct?
Typically this process can take weeks?
EDIT: As you know, I changed all URLs, so the pages are actual pages and not rewrites, just in case, as I'm paranoid of this happening again. The *new* URLs that I created were put into supplemental and omitted, so they are being referenced against Google's old cache, right?
Once that happens, you can do nothing more. Your measure of success is seeing what happens to all of your URLs that return a "200 OK" status.
Your redirected URLs, where they appear in the SERPs, will continue to feed visitors to your site. The redirect will deliver that visitor to the correct URL for that content.
Job Done.
I hope I'm making sense :)