Forum Moderators: goodroi
inside js. Now that Google is indexing these JS links, mysite.com is quickly filling Google's index with pages I don't want indexed. The same 'stuff' for each client means duplicate content to Google.
My question is: if I set robots.txt to disallow /clients/, will Google respect this and deindex the files? Or will it ignore robots.txt because there are links pointing at those specific pages?
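For what it's worth, you can sanity-check a disallow rule before deploying it with Python's stdlib robots.txt parser. This is just a sketch of the rule being discussed (the mysite.com URLs are made up for illustration):

```python
# Check that a "Disallow: /clients/" rule blocks crawlers for /clients/
# URLs while leaving the rest of the site fetchable. Stdlib only.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /clients/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# /clients/ URLs are blocked for every user-agent, including Googlebot.
print(rp.can_fetch("Googlebot", "https://mysite.com/clients/demo/page.html"))
# The rest of the site is still crawlable.
print(rp.can_fetch("Googlebot", "https://mysite.com/index.html"))
```

Note this only tells you what a well-behaved crawler is *allowed* to fetch; it says nothing about whether already-indexed URLs get dropped.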
I've also read cases of Google ignoring noindex on the pages themselves, but that may be a better option...?
If there are incoming links then you should expect to see 'URL-only' listings: no title, snippet, size, or cache.
As far as I know, using noindex in a robots META tag or returning HTTP status 404 will get the result removed, provided you lift the /robots.txt exclusion so the bot can actually fetch the URL and see it.
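For reference, the robots META tag mentioned above goes in each page's <head>; something like this (a sketch, not the poster's actual markup):

```html
<!-- Tells crawlers not to index this page; the URL must NOT be
     blocked in robots.txt, or the bot never sees this tag. -->
<head>
  <meta name="robots" content="noindex">
</head>
```

That last point is the catch: a robots.txt Disallow and an on-page noindex work against each other, since a blocked bot can't fetch the page to read the tag.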