Blocked Shot - blocking numerous duplicate pages using robots.txt

underglass msg:4012488 12:32 am on Oct 24, 2009 (gmt 0)
Can someone answer this for me? I need help. I have duplicate-title problems and I need to block the offending pages using robots.txt.
and there are up to 142 other titles just like it.
How can I block or disallow these in robots.txt without blocking the directory/files themselves and without writing out a rule for every number up to 142?
phranque msg:4012538 3:05 am on Oct 24, 2009 (gmt 0)
the robots.txt matching is left-to-right prefix matching, so Disallow: /blue-widgets/ will exclude anything in the blue-widgets directory. note that you don't specify the domain in robots.txt, only the path. the pattern matching is limited, so i'm not sure how you could specify a range of numbers.
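to make the prefix behaviour concrete, here's a quick sketch using python's stdlib robots.txt parser (the /blue-widgets/ paths are just this thread's hypothetical examples, not a real site):

```python
# sketch: checking robots.txt prefix matching with python's stdlib parser.
# the /blue-widgets/ paths are hypothetical examples from this thread.
from urllib import robotparser

rules = [
    "User-agent: *",
    "Disallow: /blue-widgets/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "/blue-widgets/2/"))    # False (blocked: prefix match)
print(rp.can_fetch("*", "/blue-widgets/page"))  # False (blocked)
print(rp.can_fetch("*", "/other-page"))         # True (allowed)
```

anything whose path starts with the Disallow value is excluded; there's no need to list individual files under the directory.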
if those urls are already indexed, blocking them with robots.txt may not really solve your duplicate content problem, nor will it necessarily prevent the urls from being indexed in the future.
the proper way to solve this is probably to use 301 redirects to the canonical url.

underglass msg:4012709 2:00 pm on Oct 24, 2009 (gmt 0)
Thank you, phranque.
What if you want to keep /bluewidgets/ itself but block /bluewidgets/2/, /bluewidgets/3/, /bluewidgets/4/, and so on?
They have not been indexed yet.
phranque msg:4013037 2:41 pm on Oct 25, 2009 (gmt 0)
Disallow: /blue-widgets/0
Disallow: /blue-widgets/1
Disallow: /blue-widgets/2
Disallow: /blue-widgets/3
Disallow: /blue-widgets/4
Disallow: /blue-widgets/5
Disallow: /blue-widgets/6
Disallow: /blue-widgets/7
Disallow: /blue-widgets/8
Disallow: /blue-widgets/9

underglass msg:4013082 4:30 pm on Oct 25, 2009 (gmt 0)
My Worst Fears! Yikes. Good Halloween Scare! Just have to take my lumps with Google.
Thank you, phranque!
phranque msg:4014148 10:50 am on Oct 27, 2009 (gmt 0)
it might not be so bad.
since matching is left-to-right, those 10 rules are enough to exclude every subdirectory of blue-widgets that begins with a digit, along with its contents.
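those ten rules can be sanity-checked with python's stdlib robots.txt parser. a sketch, assuming the crawler does plain prefix matching and using this thread's hypothetical /blue-widgets/ paths:

```python
# sketch: verifying that ten digit-prefix rules block the numbered
# subdirectories while leaving /blue-widgets/ itself crawlable.
# paths are the hypothetical examples from this thread.
from urllib import robotparser

rules = ["User-agent: *"] + [f"Disallow: /blue-widgets/{d}" for d in range(10)]

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "/blue-widgets/"))      # True (no rule matches)
print(rp.can_fetch("*", "/blue-widgets/2/"))    # False (blocked by /blue-widgets/2)
print(rp.can_fetch("*", "/blue-widgets/142/"))  # False (blocked by /blue-widgets/1)
```

note that /blue-widgets/142/ is caught by the /blue-widgets/1 rule, which is why ten single-digit prefixes cover every number up to 142 and beyond.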