Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

Robots and duplicate files

         

fsmobilez

10:39 pm on Dec 19, 2008 (gmt 0)

10+ Year Member



I want to know what happens if I block URLs (using robots.txt) that Google has already crawled. There are more than 10,000 of these URLs.

Will Google permanently remove them from its search results at some point, or will they remain there forever?

I ask because I tried this on one of my sites, but Google only removed the cached pages; the URLs themselves still appeared in Google.

If the URLs are blocked by robots.txt but can still be found in Google as URL-only results, will Google consider them duplicates?

Let's say the URL is

www.example.com/anything/mypost&mainlink=

I block URLs like the one above and add this URL instead:

www.example.com/anything/mypost

How will Google treat this?

Also, is there any way to remove the URLs from Google using only the robots.txt file?

For the record, it's a dynamic site, so I can't use nofollow on links in the pages that are blocked by robots.txt, because all pages of the site use the same header.

Thanks for your time.

tedster

7:11 am on Dec 20, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If a URL becomes searchable as URL-only because it is disallowed in robots.txt, then that URL is no longer spidered. Because Google no longer has current content for it, the URL can only rank for keyword indicators that occur in backlinks.

You can use the URL removal request in your Webmaster Tools account to remove any URL that is disallowed in your robots.txt file. Then it won't even appear as a URL-only result.
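To cover all 10,000 of those parameterized URLs with one rule, something along these lines should work - just a sketch, assuming the extra parameter always shows up as &mainlink= in the path, and relying on Googlebot's support for the * wildcard in Disallow lines:

User-agent: Googlebot
# sketch only: blocks any URL whose path contains &mainlink=
Disallow: /*&mainlink=

Any URL matched by that Disallow line is then eligible for the removal request, while the clean www.example.com/anything/mypost URL stays crawlable.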