It should straighten out in one or two spidering cycles, however long that is for your site. I would not do anything more for two weeks or so, at least.
Ted, thanks for confirming. It's a little surprising how sensitive Google was to this change.
If this were to happen in the future, are we better off 301ing to the correct URL, or using the robots.txt to prevent it from getting crawled?
That's very hard to answer without knowing how query strings work on your site. As a general rule, stop the server from resolving dupe URLs. Then a 404 cures many ills. If there are important backlinks to the dupe URL, then 301 if you can't get them changed. If you don't need any query-string URLs in the search engines, block them all with robots.txt. Depending on your scheme for generating query strings, you may be able to block just some of them with robots.txt and allow others.
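To make that rule of thumb concrete, here is a rough Python sketch of the 404-versus-301 decision. It assumes a site that serves no query-string URLs of its own, and the names `REAL_PATHS`, `BACKLINKED_DUPES` and `decide` are made up for illustration; your own server or framework will have its own way of wiring this in.

```python
from urllib.parse import urlsplit

# Hypothetical data for illustration only.
REAL_PATHS = {"/", "/widgets", "/about"}         # pages the site really serves
BACKLINKED_DUPES = {"/widgets?ref=oldpartner"}   # dupe URLs with backlinks worth keeping

def decide(url):
    """Return (status, location) for a requested URL.

    Assumes the site uses no query strings, so any query string is a dupe.
    """
    parts = urlsplit(url)
    if parts.path not in REAL_PATHS:
        return 404, None                 # page does not exist at all
    if not parts.query:
        return 200, None                 # the one true URL for this page
    if parts.path + "?" + parts.query in BACKLINKED_DUPES:
        return 301, parts.path           # keep the link credit, point it at the real URL
    return 404, None                     # refuse to resolve any other dupe

print(decide("/widgets?ref=oldpartner"))
print(decide("/widgets?junk=1"))
```

If the site does need some query strings, the last two branches would have to check parameter names instead of rejecting everything.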
Whatever you do, create a workflow discipline that prevents future issues.
I once had someone do me the "favor" of linking to a site of mine with spurious query strings, intending to cause just this kind of dupe problem. So I temporarily lost the ranking of a few already-low-ranked pages, but in response, I quickly 301-redirected all of those bogus URLs to the correct URLs. Thanks for the PageRank!
It didn't last long of course, but keeping the redirect in place prevented any further such silliness.
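For anyone who wants to do the same thing, a blanket redirect like that might look roughly like the following as Python WSGI middleware. This is only a sketch under the assumption that the site never uses query strings legitimately; `strip_query_strings` is a made-up name, and the same effect is usually achievable in the web server's own rewrite configuration.

```python
def strip_query_strings(app):
    """Wrap a WSGI app so that any request carrying a query string
    gets a 301 to the same path without it.

    Assumption: no URL on this site legitimately needs a query string.
    """
    def middleware(environ, start_response):
        if environ.get("QUERY_STRING"):
            # Rebuild the path without the query string; fall back to "/".
            location = (environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")) or "/"
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # Clean URL: hand the request to the real application untouched.
        return app(environ, start_response)
    return middleware
```

Leaving it in place permanently means any later bogus links resolve to the real page as well.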
Query strings causing duplicate content is a MAJOR issue. What if you were running ad campaigns with legit sources which pass you a query string to show affiliate information, or campaign information?
Or better yet, what if you have a way for customers to sort information on the page and you're passing a string to define the sort option?
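One way to reason about campaign and sort parameters is to work out which URL you actually want indexed for any incoming request. The Python sketch below does that with an allowlist; it assumes that only an `id` parameter changes what the page is about, and `CONTENT_PARAMS` and `canonical_url` are hypothetical names rather than anything a search engine defines.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: only these parameters select different content.
# Sort options, affiliate codes and campaign tags are all dropped.
CONTENT_PARAMS = {"id"}

def canonical_url(url):
    """Return the URL with only content-defining parameters kept, in sorted order."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in CONTENT_PARAMS)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

print(canonical_url("http://example.com/list?sort=price&id=7&aff=123"))
```

Whenever the requested URL differs from its canonical form, you know it is a duplicate, and you can then choose how to handle it (redirect, block in robots.txt, or serve it anyway) without losing the campaign data on your side.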
Google really has to do something about this.
"Google really has to do something about this."
I'm not disagreeing, but I think Google is generally handling inbound query strings much better than Yahoo and MSN are. I can routinely expect problems on both of them, whereas problems I've seen on Google have been rare.
Jd has a good point. It's almost always better to 301 redirect as opposed to using robots.txt in this case to 'guide' Google as to what credit belongs to which page.