Msg#: 3858711 posted 10:48 pm on Feb 26, 2009 (gmt 0)
We recently made some changes to our website, mainly URL rewriting. It has been done in such a way that after a certain point, the rest of the URL does not matter.
For example, the true page is "/glow_categories/41/glow_sticks.html", but our server only needs the category "41" to find the page, so even "/glow_categories/41/webmasterworldqwerty.html" would still go to the same page. Make sense?
Anyway, we had written some links on our website incorrectly, and they appeared as "/glow_categories/41/glowsticks.html". They still go to the same page as the correct URL given above.
The links have been updated; however, Google has already indexed the incorrect URLs. If I exclude these URLs through my robots.txt file to get them removed from the index, will Google re-index them even though they are no longer linked from any of our pages?
Msg#: 3858711 posted 11:12 pm on Feb 26, 2009 (gmt 0)
Consider 301-redirecting the incorrect URLs to the correct ones. This will explicitly tell the search engines that those URLs are wrong, and that the corrected ones should be indexed instead. I suspect you might see a faster disappearance of the bad URLs using this method, and any PageRank accrued by the bad URLs will pass to the correct ones through the 301.
"It has been done in such a way that after a certain point, the rest of the URL does not matter."
This is essentially creating a duplicate-content problem for yourself, and one that could be exploited by competitors. You should add logic to your page-generation script that tests the URL PathInfo against your database, and issues a 301 redirect to the correct URL if the PathInfo does not contain the canonical URL-path for that "page."
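As a rough illustration of that check, here is a minimal Python sketch. It is not the poster's actual script: the CANONICAL_SLUGS table stands in for the real database lookup, and the resolve() function and its return shape are hypothetical names chosen for the example. It assumes URLs of the form /glow_categories/<id>/<slug>.html, as in the thread.

```python
import re

# Hypothetical stand-in for the site's database: category id -> canonical slug.
CANONICAL_SLUGS = {41: "glow_sticks"}

# Assumed URL shape, taken from the examples in the thread.
CATEGORY_PATH = re.compile(r"^/glow_categories/(\d+)/([^/]*)$")


def resolve(path):
    """Return (status, location) for a requested URL path.

    200 -> the path is already canonical, serve the page.
    301 -> redirect to the canonical path given in location.
    404 -> the path shape or the category id is unknown.
    """
    match = CATEGORY_PATH.match(path)
    if match is None:
        return 404, None

    category_id = int(match.group(1))
    slug = CANONICAL_SLUGS.get(category_id)
    if slug is None:
        return 404, None

    canonical = f"/glow_categories/{category_id}/{slug}.html"
    if path == canonical:
        return 200, None

    # Any other trailing text (typos, junk added by third parties)
    # is redirected permanently to the single canonical URL.
    return 301, canonical
```

With this in place, a request for "/glow_categories/41/glowsticks.html" or "/glow_categories/41/webmasterworldqwerty.html" would get a 301 to "/glow_categories/41/glow_sticks.html", and only the canonical URL would be served with a 200. In a real site the same comparison would sit at the top of the page-generation script, before any content is produced.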