Forum Moderators: Robert Charlton & goodroi
I don't combine them into one because that would not be fair to affiliates who promote one page but then lose their commission because the customer paid through another method.
This system has worked great for me but now I'm wondering if I'll get penalized for this since I have 3 pretty much identical pages on one site.
For example I have mydomain.com, mydomain.com/affiliate1.htm, mydomain.com/affiliate2.htm
Am I going to get penalized for this and is there anything I can do?
All of the pages are indexed in Google. In fact, a couple of the affiliate pages are indexed with an affiliate's username (e.g. mydomain.com/affiliate1.htm?-affiliatedude)
So if I go excluding those pages I knock a potential revenue source and tick off an affiliate.
Should I just leave things be?
Just because they are indexed, does not mean you are not being hurt by it.
Different URLs -> same content = duplicate content.
You need to move on this fast, or you are going to lose your rankings. You've got 4 weeks tops before you lose your rankings.
Trust me - I speak from experience. I had a variation on what you have, it took me a long time to figure out what happened, and then I had to wait weeks after I fixed it before my rankings were restored. The net result is that 2006 will end up being one of our worst years. Everything is fine now, thank goodness...
"All of the pages are indexed in Google."
In my case, I sometimes had more than 30 (!) URLs all pointing to the same content, all of them happily indexed by google, and my rankings sank like a stone. Just because they are indexed does not mean you do not have a problem.
You have duplicate content issues. Fix it now. The preferred fix is to use your httpd.conf and .htaccess files to do the fix. robots.txt is a blunt tool for a fine-tune job, but you might get away with it in your case.
** And with all due respect, you need to look at the rest of your site/URLs and make sure you don't have other dup content issues. You seem to have the presence of mind to recognize this as dup content - but you may have inadvertently done this elsewhere and still not realize it. Leave no stone unturned in your search.
Trust me - fix this now.
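For the "leave no stone unturned" part, here is a rough sketch of one way to hunt for other duplicates: group URLs by a hash of their page body. It assumes you have already fetched each page into a dict of URL -> body; the function name and the example.com URLs are made up for illustration. It only catches bodies that are identical after whitespace is collapsed, so "pretty much identical" pages that differ in a payment link would still need a closer look.

```python
import hashlib
from collections import defaultdict


def find_duplicate_urls(pages):
    """Group URLs whose page bodies are identical after whitespace is collapsed.

    pages: dict mapping URL -> page body (str), fetched however you like.
    Returns a list of sorted URL lists, one per group of duplicates.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        # Collapse runs of whitespace so trivial formatting differences
        # do not hide an otherwise identical page.
        normalised = " ".join(body.split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical sample data, not real pages.
    sample = {
        "http://www.example.com/": "Buy now",
        "http://www.example.com/affiliate1.htm": "Buy  now",
        "http://www.example.com/about.htm": "About us",
    }
    for group in find_duplicate_urls(sample):
        print(group)
```

Any group it prints is a set of URLs serving the same content, which is the situation described above.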
I'm still trying to figure out why you have to have 3 different pages, 1 for each affiliate program. Can you explain this in more detail? I find it difficult to believe that there is not some way to get around that problem, and I'm sure we can provide several different solutions if we knew what the underlying block is.
Each page is the same sales page with a different payment processor.
As you may or may not know there are various affiliate programs out there for those that sell software. Clickbank, Paydotcom, Sharesale, among hundreds of others.
I have multiple programs set up because some of my best affiliates are in countries not supported by one or another. For example, some affiliates are in Singapore and an affiliate processor won't pay there, etc.
It exposes me to a lot of places and has worked well for me so far.
It's regarding homepages that read like this: "mydomain.com/index.html"
Question: Let's say you start on the homepage which initially reads "http://www.mydomain.com"
Then you go to a section page on that same site and later decide to return to the homepage. However this time the homepage reads: "http://www.mydomain.com/index.html"
Does google see this as duplicate content because there are two URLs, "http://www.mydomain.com" and "http://www.mydomain.com/index.html", now having the same content?
Does google look at this as dup content? Will I be penalized for this?
Should I change it to all read "http://www.domain.com"? If I should change it, what would be the best way?
There appear to be different opinions on this subject and I would like to hear what is the best way to handle this to ensure no penalties, no supplemental pages and better rankings on Google.
You'll need a 301 redirect from the index file name to "/" for each folder and the root.
You should update all internal links to no longer include the actual index file name in the link.
Make sure that when you link to a folder-based URL that you always end with a trailing / on the very end of the URL.
URLs that you redirect will show up as Supplemental for a while. You can safely ignore those.
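To make the index-file rule above concrete, here is a small sketch of the decision the redirect has to make: given a requested URL, work out where the 301 should point, or return nothing if no redirect is needed. It only illustrates the logic and is not a server config. The list of index file names and the function name are my own assumptions, so adjust them for your site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed index file names; extend this if your site uses others.
INDEX_NAMES = ("index.html", "index.htm")


def index_redirect_target(url):
    """Return the URL a 301 should point to, or None if no redirect is needed.

    Any URL whose last path segment is an index file name is mapped to its
    folder URL, which always ends in a trailing "/".
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    # Split "/folder/index.html" into "/folder" and "index.html".
    folder, _, filename = path.rpartition("/")
    if filename.lower() in INDEX_NAMES:
        return urlunsplit(
            (parts.scheme, parts.netloc, folder + "/", parts.query, parts.fragment)
        )
    return None


if __name__ == "__main__":
    for candidate in (
        "http://www.mydomain.com/index.html",
        "http://www.mydomain.com/widgets/index.html",
        "http://www.mydomain.com/",
    ):
        print(candidate, "->", index_redirect_target(candidate))
```

The same check is handy for scanning your internal links for any that still mention the index file name.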
Among other things, can this be the cause of all my pages (except the homepage) being supplemental?
Also if I correct everything that is causing supplemental issues, is there any hope I will get back into the main index any time soon?
How long does getting from supplemental back to main index process take?
How about sites that have one page and then a "printable" version of the same page. Aren't they essentially doing the same thing?
Don't worry about supplementals. Those in and of themselves are not a problem. Some pages that are "supp" are not necessarily dup content. Supplementals are simply supplemental content, for "whatever" reason. Maybe dupes, maybe just because they are of no extra value and google doesn't want them in the main index. Whatever. Doesn't matter. Do not worry too much about how many pages are or are not in supplemental. Yes, some dup content pages can end up in the supp index, which makes sense, because they are supplemental pages now... But really, do not get too caught up in what you see there.
The possibility that any URLs you have put 'out there' are dup content is your big problem. You need to use httpd.conf to rewrite any damage that has been done, i.e. you need to permanently move, read: rewrite, all duplicate URLs (except for 1, keep reading) to the 1 stable URL you are going to go with. So, take the set of URLs that point to one page of content. Pick 1 that you want google to keep. Then, all the rest of the URLs, you need to use rewrite in httpd.conf to permanently change those URLs to point to the 1 URL you want google to know about.
Then, when google crawls your site, when they hit one of those URLs you are now trying to kill, your rewrites will intercept it, and change the URL to the 1 URL you want to keep, so that google will understand that there are, say, 3 URLs that have permanently moved into one URL now. So theoretically, the 1 URL will stay in the main index, and the others will get moved to supplemental.
So what will happen is, slowly, depending on how often google crawls your site, they will begin hitting those "bad" URLs, run into your rewrite, and be given the 1 URL you want google to have. So, over time google will begin to understand those bad URLs are no good and which URL you say is OK. Then, when they do their next data refresh, your new set of URLs, with the dupes now dropped out, will come in, and your rankings will / should come back.
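Here is a bare-bones sketch of the "pick 1, point the rest at it" idea, written as plain logic and not as an actual httpd.conf. I am assuming the rewrite is issued as a 301 (permanent) redirect, which is how I read "permanently move" above. The URLs and function names are hypothetical.

```python
def build_redirect_map(canonical, duplicates):
    """Map every duplicate URL to the one URL you decided to keep."""
    return {dup: canonical for dup in duplicates if dup != canonical}


def respond(url, redirect_map):
    """Return (status, location) the way the server would answer a crawler.

    A duplicate URL gets a 301 pointing at the canonical URL; anything
    else is served normally with a 200.
    """
    target = redirect_map.get(url)
    if target is not None:
        return 301, target
    return 200, url


if __name__ == "__main__":
    # Hypothetical set of URLs that all serve the same page.
    keep = "http://www.example.com/product.htm"
    dupes = [
        "http://www.example.com/product.htm?ref=a",
        "http://www.example.com/Product.htm",
    ]
    redirects = build_redirect_map(keep, dupes)
    for requested in dupes + [keep]:
        print(requested, "->", respond(requested, redirects))
```

Note the kept URL never redirects, so a crawler following any of the dupes lands on a page that answers 200 in a single hop.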
It will take time, again, a few weeks at least. So, patience is needed at that point. Make the fixes, then leave it alone.
Meanwhile, from what I understand, those dup URLs may actually remain in the supp index for 1 year, and then after that, they will finally drop out. This is why it doesn't matter if they are in the supp index, that won't affect your rankings.
Hope this helps more...
That's great advice. Seems like a ton of work. My question is this. Has anybody gotten out of supplemental and into main index without having to rewrite all URLs?
Unless I am mistaken, I do believe many have gotten out without rewriting URLs. Can anybody confirm this?
My major problem was dup content from manufacturer's product descriptions which I have now corrected.
Since the removal of dup content google has picked up a couple thousand new pages but has placed them all into supplemental except for the homepage, which is in the main index.
It appears google has recognized my efforts and is now preparing to let me out into the main index.
Traffic has picked up a lot. In fact google loves to display my results in the highly visible "One Box" results.
I could be wrong but I think you can get out without an entire rewrite of URLs.
P.S. In my opinion, and I may be wrong ; ), noindex is great if you do that _before_ the damage is done. But now that the damage has been done, and google knows about the bad dupe URLs, you need to do more than noindex -> you need to rewrite those bad URLs into the good URLs using rewrite rules and your httpd.conf. Kinda crappy, but you have to. Your httpd.conf is the road map you now need to provide google with so they know what the hell is going on and how to navigate their way, on your behalf, out of this duplicate content tar baby you got yourself into.
Couldn't a person use Google's "remove a page from the index" feature to remove an offending page once it has been incorrectly indexed?
I did this before when a members-only area of a site was crawled and it was delisted in about 48 hours.
The problem with that is, it isn't just about google. What about Yahoo? About.com? etc. And, even worse, what about other webmasters who may have linked to you, some you may be aware of, some you never dreamed of? What if they are using a URL you want eradicated? These other sources will subsequently get crawled and the URL you are trying to get rid of will once again resurface and you'll be back to Square 1. It'll be deja vu all over again.<grin>