

Sitemaps, Meta Data, and robots.txt Forum

Pattern Matching in Robots.txt
Can someone please clarify a few points for me?

Msg#: 4156501 posted 4:07 am on Jun 22, 2010 (gmt 0)

I created the following Robots.txt file and want to block access to URLs ending in a certain pattern.

The URLs I want to block look like this:

http: //www.XYZ.com/-white/car.html
http: //www.XYZ.com/-blue/car.html

I don't want to block any URL that looks like this:

http: //www.XYZ.com/white-car.html
http: //www.XYZ.com/blue-car.html

So I have my robots.txt file as follows:

User-agent: *
Disallow: /*/car.html$

So this should only block URLs ending with "/car.html".
I want to make sure it only blocks URLs that have a "/" right before "car.html", and not ones ending in "-car.html".

Can you guys let me know if I have this concept correct?
I would appreciate all feedback.
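
One way to sanity-check the rule locally is a rough Python sketch like the one below. It assumes the crawler honors the wildcard extensions, where `*` matches any run of characters and a trailing `$` anchors the end of the URL (the major search engines do, but not every crawler does). The helper names `rule_to_regex` and `is_blocked` are made up for this illustration and are not part of any robots.txt library.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing
    '$' anchors the match at the end of the URL.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url: str, rule: str) -> bool:
    """Return True if the URL's path (plus query string) matches the rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return rule_to_regex(rule).match(target) is not None


RULE = "/*/car.html$"
URLS = [
    "http://www.XYZ.com/-white/car.html",
    "http://www.XYZ.com/-blue/car.html",
    "http://www.XYZ.com/white-car.html",
    "http://www.XYZ.com/blue-car.html",
]

for url in URLS:
    print(url, "blocked" if is_blocked(url, RULE) else "allowed")
```

Under those assumptions the first two URLs come back blocked and the last two allowed, because the pattern requires a second "/" immediately before "car.html". Note that the same rule would also match deeper paths such as /a/b/car.html.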




phranque (WebmasterWorld Administrator)

Msg#: 4156501 posted 1:28 am on Jun 23, 2010 (gmt 0)

welcome to WebmasterWorld [webmasterworld.com], James!

there is a robots.txt test function available in GWT (Google Webmaster Tools), described at the bottom of this page:


Msg#: 4156501 posted 6:22 am on Aug 16, 2010 (gmt 0)

Hey, James -

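If you want your pages kept out of the index completely, you may want to go with noindex. Disallow only stops compliant spiders from crawling a page; the URL itself can still show up in search results, albeit in a very limited capacity (usually with no snippet), if other pages link to it.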
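
For what it's worth, one common way to apply noindex without editing each page is the `X-Robots-Tag` response header. Here is a minimal Python sketch of the idea, assuming you can hook into wherever your server builds its response headers. The function name `extra_headers` and the path rule are hypothetical, chosen only to mirror the URLs in this thread.

```python
import re

# Hypothetical rule mirroring the thread: any path that ends in "/car.html"
# below at least one directory, e.g. "/-white/car.html".
NOINDEX_PATHS = re.compile(r"^/.*/car\.html$")


def extra_headers(path: str) -> dict[str, str]:
    """Return extra response headers to send for the given URL path."""
    if NOINDEX_PATHS.match(path):
        # A crawler has to fetch the page to see this header, so the same
        # URL must not also be disallowed in robots.txt.
        return {"X-Robots-Tag": "noindex"}
    return {}


print(extra_headers("/-white/car.html"))
print(extra_headers("/white-car.html"))
```

The per-page alternative is a robots meta tag with a noindex value in the HTML head of each page.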
