
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Blocked Shot - Robots.txt
blocking numerous duplicate pages using robots.txt
underglass




msg:4012488
 12:32 am on Oct 24, 2009 (gmt 0)

Can someone help me with this? I have duplicate title problems and I need to block the duplicate pages using robots.txt.

www.mystore.com/blue-widgets/2/

and then there are up to 142 other titles just like it.

How can I block or disallow these in robots.txt without blocking the whole directory and its files, and without having to list every number up to 142?

Thank you!

 

phranque




msg:4012538
 3:05 am on Oct 24, 2009 (gmt 0)

robots.txt matching is left-to-right (prefix matching), so if you use
Disallow: /blue-widgets/
that will exclude everything in the blue-widgets directory, including the directory page itself.
note that you don't specify the domain in robots.txt, only the path.
the pattern matching is limited, so i'm not sure how you could specify a range of numbers.
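
A minimal sketch of the left-to-right matching described above, assuming plain prefix comparison with no wildcard support. The helper name is_disallowed and the /red-widgets/ path are made up for illustration and are not part of any robots.txt library or of the site in question.

```python
def is_disallowed(path, disallow_rules):
    """Return True if any non-empty Disallow value is a prefix of path.

    Hypothetical helper: models only simple prefix matching, no wildcards.
    """
    return any(bool(rule) and path.startswith(rule) for rule in disallow_rules)


# Rules hold the path only, never the domain.
rules = ["/blue-widgets/"]

# A numbered page, the directory page, and an unrelated (hypothetical) path.
for path in ["/blue-widgets/2/", "/blue-widgets/", "/red-widgets/"]:
    print(path, is_disallowed(path, rules))
```

Under this model the single rule matches /blue-widgets/ as well as every numbered page beneath it, which is why it does not fit the case where the directory page should stay crawlable.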

if those urls are already indexed, blocking with robots.txt may not really solve your duplicate content problem, nor will it prevent your urls from being indexed in the future, since robots.txt controls crawling rather than indexing.
the proper way to solve this is probably to use 301 redirects to the canonical url.
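
A rough sketch of the redirect idea, assuming the numbered urls really are duplicates of the /blue-widgets/ page and that /blue-widgets/ is the canonical url. Both points are assumptions, not something the thread confirms. The function name canonical_redirect is hypothetical, and a real site would usually do this in the web server or application configuration instead.

```python
import re

# Assumption: duplicates look like /blue-widgets/<digits>/ (trailing slash optional).
NUMBERED_PAGE = re.compile(r"^(/blue-widgets/)\d+/?$")


def canonical_redirect(path):
    """Return (301, canonical_path) for a numbered duplicate, else None.

    Hypothetical helper for illustration only.
    """
    match = NUMBERED_PAGE.match(path)
    if match:
        return 301, match.group(1)
    return None


for path in ["/blue-widgets/2/", "/blue-widgets/142", "/blue-widgets/"]:
    print(path, canonical_redirect(path))
```

If the numbered urls carry distinct content, such as later pages of a listing, redirecting them all to one url would hide that content, so this only applies to true duplicates.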

underglass




msg:4012709
 2:00 pm on Oct 24, 2009 (gmt 0)

Thank you, phranque.

What if you want to keep /blue-widgets/ but block /blue-widgets/2/, /blue-widgets/3/, /blue-widgets/4/ and so on?

They have not been indexed so far.

phranque




msg:4013037
 2:41 pm on Oct 25, 2009 (gmt 0)

Disallow: /blue-widgets/0
Disallow: /blue-widgets/1
Disallow: /blue-widgets/2
Disallow: /blue-widgets/3
Disallow: /blue-widgets/4
Disallow: /blue-widgets/5
Disallow: /blue-widgets/6
Disallow: /blue-widgets/7
Disallow: /blue-widgets/8
Disallow: /blue-widgets/9

underglass




msg:4013082
 4:30 pm on Oct 25, 2009 (gmt 0)

My Worst Fears! Yikes. Good Halloween Scare! Just have to take my lumps with Google.

Thank you, phranque!

phranque




msg:4014148
 10:50 am on Oct 27, 2009 (gmt 0)

it might not be so bad.

since it matches left-to-right, you only need those 10 rules to exclude every subdirectory of blue-widgets whose name begins with a digit, along with its contents.
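
One way to check the ten rules is with Python's standard urllib.robotparser module. This is a sketch under two assumptions: the rules sit under a "User-agent: *" group (the rules as posted show no User-agent line), and the /blue-widgets/2-pack/ path is invented purely to show a side effect.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the ten Disallow lines belong to a "User-agent: *" group.
lines = ["User-agent: *"] + [f"Disallow: /blue-widgets/{d}" for d in range(10)]

parser = RobotFileParser()
parser.parse(lines)


def allowed(path):
    """Ask the parser whether any crawler may fetch this path on the site."""
    return parser.can_fetch("*", "http://www.mystore.com" + path)


# The last path is hypothetical: a non-numeric page whose name starts with a digit.
for path in ["/blue-widgets/", "/blue-widgets/2/", "/blue-widgets/142/",
             "/blue-widgets/2-pack/"]:
    print(path, allowed(path))
```

The directory page stays fetchable while /blue-widgets/2/ and /blue-widgets/142/ are blocked. Anything else under blue-widgets whose name starts with a digit is blocked too, so it is worth checking for such urls before relying on these rules.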


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved