In these examples my domain comes before the "c" portion. The rule should disallow all links in this format. All of them appear to come from the category "C", but that page should not exist in that format. For example, these should also be disallowed:
All I can figure is that Google picked these up when I was playing with pretty links a very long time ago... Thanks!
Msg#: 4334375 posted 11:22 am on Jul 3, 2011 (gmt 0)
Go into Google Webmaster Tools (WMT) and have the C and A directories removed from the index permanently, which should stop the crawling. Blocking them in robots.txt is just as easy: block the paths '/c/' and '/a/', unless you use those for anything else.
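For anyone wanting to check rules like these before uploading them, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt text, the example.com host and the is_blocked helper are my own illustration, not something taken from the original poster's site. Note that Disallow paths start with a slash.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; adjust the paths to your own site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /c/
Disallow: /a/
"""


def is_blocked(path, user_agent="Googlebot"):
    """Return True if the rules above disallow `path` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # www.example.com is a placeholder host; only the path is matched.
    return not parser.can_fetch(user_agent, "http://www.example.com" + path)


print(is_blocked("/c/old-page"))  # under /c/, so disallowed
print(is_blocked("/contact"))     # only shares the letter, so still allowed
```

The trailing slash matters: '/c/' matches everything inside that directory without also catching unrelated pages such as '/contact'.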
Msg#: 4334375 posted 7:18 pm on Oct 31, 2011 (gmt 0)
Hi, blocking URLs or directories that are not present on your server is not going to work. If they are not there, what are you going to block? Googlebot is picking them up from somewhere else, not from your root directory. You can control your own server but not other people's. You may have submitted an article that contains those URLs as anchor links (no doubt through a programming error in the article directory); you now need to find the source and correct the links so they point to the valid URL. In the meantime, you can 301-redirect them to a page on your website to avoid the 404 errors.
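As a rough illustration of the 301 idea, here is a minimal sketch using Python's built-in http.server. The '/c/' and '/a/' prefixes come from this thread; the redirect target '/', the redirect_target helper and the handler class are assumptions for the example, and on a real site you would normally set this up in your web server's own configuration instead.

```python
from http.server import BaseHTTPRequestHandler

# Prefixes of the stale URLs discussed in the thread.
STALE_PREFIXES = ("/c/", "/a/")
# Assumed destination; pick whichever valid page suits your site.
REDIRECT_TO = "/"


def redirect_target(path):
    """Return the redirect destination for a stale path, else None."""
    if path.startswith(STALE_PREFIXES):
        return REDIRECT_TO
    return None


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers stale URLs with a permanent redirect instead of a 404."""

    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            self.send_response(301)  # Moved Permanently
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"normal page\n")
```

To try it locally, pass RedirectHandler to http.server.HTTPServer and request a path under /c/ to see the 301 response.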