Forum Moderators: Robert Charlton & goodroi
I modified my robots.txt to "disallow" crawling of URLs containing "osCsid" and "cPath". I thought this was a good thing.
I just noticed that my PR dropped.
I then did a Link Popularity check and saw that there are a few thousand fewer links pointing to my site. That's about the same number of URLs that robots.txt is now disallowing.
It seems that I gave myself more incoming links and higher PR by having duplicate content. Is this correct? And which is worse: reduced link popularity, or multiple URLs serving up the same content?
:)
You mean you had your PR updated for existing URLs just recently?
Are you sure?
Could it be that you chose the wrong set of URLs to disallow? I mean, you had duplicates of pages... so there was more than one choice of what to drop. What did you base your decision on when picking which ones to keep?
It could be that the ones you disallowed actually had a higher PR, or incoming links or... whatever.
When I read your post I kinda thought that you had actually made a third kind of URL for each page and disallowed the previous two. In that case I wouldn't be surprised that the PR is gone. (Why not make a sitewide redirect for the URLs if the pattern is this simple?)
I think a domain voting for itself was the second thing G fixed in 2001. Unless, of course, your main page has no incoming links and only the subpages do, and they pass their PR up to the root level, but... I thought this was impossible. Okay, at least I've never seen a site that worked like this.
I'm just guessing though :P
So what I meant was: if there were NO incoming links to these URLs, then the idea that removing them (and their links, i.e. their votes for the home page) could leave the home page with a lower PR than before just sounds silly. Reversing this logic, having umpteen million pages with no PR pointing to their own domain root could "raise the importance" of the home page, and stopping that was, I believe, one of the first filters G implemented when they saw people actually doing this back in the early days...
And basically that's why I asked whether any of these dumped URLs could have had incoming links... and whether the right version of the URLs has been disallowed.
I should learn to make my point better, I know :P
Or to make a point at all.
Yes. I used Google Webmaster Tools to test BEFORE I made my robots.txt change. Here's what I did...
Disallow: /*?osCsid
Disallow: /*?cPath=2*&
Disallow: /*&sort=
By doing this I removed 1000s of duplicate content URL variations. (Yes, I removed "bad urls" and tested that "good urls" could still be crawled.) It seemed like a good thing to do re: Dupe Content. In some cases the osCsid variable was producing 10+ different URLs for the same page.
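For anyone wanting to double-check rules like these offline, here is a minimal Python sketch. It assumes Google-style wildcard matching ("*" matches any run of characters, a trailing "$" anchors the end, and rules match from the start of the path). Python's built-in urllib.robotparser does not handle these wildcards, so the sketch translates each rule into a regex. The function names and the sample paths are made up for illustration; they are not taken from the actual site.

```python
import re

# The three Disallow rules quoted above.
DISALLOW_RULES = ["/*?osCsid", "/*?cPath=2*&", "/*&sort="]

def rule_to_regex(rule):
    """Translate a wildcard Disallow rule into a compiled regex.

    Assumption: '*' means "any run of characters" and a trailing '$'
    anchors the end of the URL, as in Google-style matching.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path_and_query, rules=DISALLOW_RULES):
    """Return True if any rule matches from the start of path + query."""
    return any(rule_to_regex(r).match(path_and_query) for r in rules)

# Hypothetical sample URLs (path + query string only).
samples = [
    "/product_info.php?osCsid=abc123",
    "/product_info.php?products_id=5",
    "/product_info.php?products_id=5&osCsid=abc123",
    "/index.php?cPath=21&page=2",
    "/index.php?cPath=3&sort=2a",
]
for s in samples:
    print("BLOCKED" if is_blocked(s) else "allowed", s)
```

Under these assumptions, the first rule only catches osCsid when it is the first parameter (right after the "?"). A URL where osCsid comes later, after an "&", would slip through, so it may be worth testing that case too.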
In some cases people had posted a link on their site back to my product page, and the link included one of the variables I listed above (i.e. a "bad url").
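Since some inbound links point at those "bad urls", the sitewide redirect suggested earlier in the thread is one way to keep them counting. Below is a rough Python sketch of just the URL-cleaning step: it strips the session and sort parameters so the server could 301 to the result. The parameter set, the function name and the example.com URLs are assumptions for illustration, and whether cPath should also be dropped depends on the site, so it is left alone here.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameters never change what the page shows.
STRIP_PARAMS = {"osCsid", "sort"}

def canonical_url(url):
    """Return the URL with the unwanted query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in STRIP_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )

def redirect_target(url):
    """Return the URL to 301 to, or None if the URL is already clean."""
    clean = canonical_url(url)
    return clean if clean != url else None

# Hypothetical inbound link carrying a session id.
print(redirect_target(
    "http://www.example.com/product_info.php?products_id=5&osCsid=abc123"
))
```

With a redirect in place instead of a Disallow, a crawler can still follow the inbound link and arrive at the single "good url", rather than being stopped at a blocked one.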
Anyway, my results are so hammered lately...I'm doing KW searches that previously listed my site in the top 5 results. My search KWs are in my Title, Desc, and Page Content. Now, I can't find myself unless I include my domain name in the search. New results that are appearing on page 1 don't even have these KWs in the Title, Desc, etc and only some of the KWs are in the page text.
Also, there is a site appearing on page 1 results for my previous stellar KWs that is loaded with hidden text and URLs (Ctrl-A produces a TON of junior-grade black hat methods).
Sorry to go off-topic above, but it's frustrating because it seems Google is broken or, perhaps, there is a random rotation of suppressing ecommerce sites to increase their adwords revenue. (?)
I notice that MSN's cached copies of my pages are dated AFTER I made the "Disallow" change to my robots.txt file.
On the positive side, when searching in MSN for my KWs, I'm not deluged with eBay pages containing expired auction results and spammy portal pages. I.e., MSN is delivering more useful results for my KWs than Google. I wish the NY Times or Wall Street Journal would test the SEs and write a story about Google's diminished SERP quality. They are probably shareholders so that won't happen!