| Website Disallowed through robots.txt, will google re index |
jackgordon

msg:3858713 | 10:48 pm on Feb 26, 2009 (gmt 0) | We recently made some changes to our website, mainly URL rewriting. It has been done in such a way that after a certain point, the rest of the URL does not matter. For example, the true page is "/glow_categories/41/glow_sticks.html"; however, our server knows to go to category "41", so even "/glow_categories/41/webmasterworldqwerty.html" would still go to the same page. Make sense? Anyway, we had written some links on our website incorrectly, and they appeared as /glow_categories/41/glowsticks.html. They still go to the same page as the original link above. The links have since been corrected; however, Google has already indexed the incorrect URLs. If I exclude these through my robots.txt file to get them removed from the index, will Google re-index them even though they are no longer linked to from any of our pages? Thanks
|
jdMorgan

msg:3858740 | 11:12 pm on Feb 26, 2009 (gmt 0) | Consider 301-redirecting the incorrect URLs to the correct ones. This will explicitly tell the search engines that those URLs are wrong, and that the corrected ones should be indexed instead. I suspect you might see a faster disappearance of the bad URLs using this method, and any PageRank accrued by the bad links will pass to the correct ones through the 301. | Quote: "It has been done in such a way that after a certain point, the rest of the URL does not matter." |
| This is essentially creating a duplicate-content problem for yourself, and one that could be exploited by competitors. You should add logic to your page-generation script that tests the URL PathInfo against your database, and issues a 301 redirect to the correct URL if the PathInfo does not contain the canonical URL-path for that "page." Jim
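A rough sketch of the canonical-URL check described above, written in Python only for illustration (the thread does not say what the site actually runs on). The `CANONICAL_SLUGS` table stands in for the database lookup, and the `resolve` function and its return shape are hypothetical names, not part of any real framework:

```python
import re

# Hypothetical stand-in for the database: category id -> canonical slug.
CANONICAL_SLUGS = {41: "glow_sticks"}

# Matches /glow_categories/<id>/<anything>, capturing the numeric id.
CATEGORY_URL = re.compile(r"^/glow_categories/(\d+)/([^/]*)$")


def resolve(path):
    """Return (status, location) for a requested URL path.

    200 -> the path is already canonical, serve the page.
    301 -> redirect to the canonical path given in location.
    404 -> the path does not map to a known category.
    """
    match = CATEGORY_URL.match(path)
    if match is None:
        return 404, None

    category_id = int(match.group(1))
    slug = CANONICAL_SLUGS.get(category_id)
    if slug is None:
        return 404, None

    canonical = f"/glow_categories/{category_id}/{slug}.html"
    if path != canonical:
        # Any other trailing text is a duplicate URL: send it to the real one.
        return 301, canonical
    return 200, None
```

With this in place, a request for /glow_categories/41/glowsticks.html or /glow_categories/41/webmasterworldqwerty.html would be answered with a 301 to /glow_categories/41/glow_sticks.html, and only the canonical URL would return the page itself.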
|
schumi

msg:3859729 | 9:41 am on Feb 28, 2009 (gmt 0) | Jackgordon, this will help the search engines decide which URL is the best match for the same content, so just do it.
|