

Sitemaps, Meta Data, and robots.txt Forum

robots.txt disallow searching reoccurring folders

 8:10 pm on Aug 29, 2011 (gmt 0)

I'm searching my heart out, but must not be using the correct terms to get to the following answer.

We have a large site that has plenty of subdirectories of similar construction.

Within each subdir, there are specific folders that we want to disallow searches of.

I'm trying to ascertain how little I can get by with in the robots.txt file to achieve the desired results.

If the server has the four following structures:


what disallow language would tell the search engines not to search any "Images" folder?

Would */Images/ do it?

or /Images/
to catch the root, and then all deeper buried folders?

or, must I specify entire subdir paths to the folders I want blocked?



 8:30 pm on Aug 29, 2011 (gmt 0)

The pattern matching is "from the left".

So /images disallows any URL path beginning with /images

And /*images disallows any URL path containing images


 8:41 pm on Aug 29, 2011 (gmt 0)


And I need to include the trailing / ( /*images/ ) if I want to be sure I match only "images" folders and not "imageofsomething" folders?
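
For anyone who wants to try the patterns discussed above before deploying them, here is a minimal Python sketch of Google-style matching. It is not any search engine's actual implementation. The function names and the example paths (site1, images-old, myimages) are made up for illustration. It assumes `*` matches any run of characters, a trailing `$` anchors the end of the URL, and matching starts at the left of the path and is case-sensitive.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Google-style robots.txt path pattern into a regex (sketch)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    # re.match anchors at the start of the path: matching is "from the left".
    return robots_pattern_to_regex(pattern).match(path) is not None


# "/images" only blocks paths that begin with /images.
print(is_blocked("/images", "/images/logo.png"))          # root folder
print(is_blocked("/images", "/site1/images/logo.png"))    # nested folder

# "/*images" blocks any path containing "images", including similar names.
print(is_blocked("/*images", "/site1/images-old/a.png"))

# "/*images/" requires the slash, so "images-old" is no longer caught,
# but a folder whose name merely ends in "images" still is.
print(is_blocked("/*images/", "/site1/images-old/a.png"))
print(is_blocked("/*images/", "/site1/myimages/a.png"))

# Matching is case-sensitive: "images" does not cover an "Images" folder.
print(is_blocked("/*images/", "/site1/Images/a.png"))
```

If the folders on the server are really named "Images" with a capital I, the rule would need to use that exact spelling.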


 8:55 pm on Aug 29, 2011 (gmt 0)



 10:34 pm on Aug 29, 2011 (gmt 0)

Also note that the wildcard/file-globbing patterns are not specified in the original robots exclusion protocol; they are extensions supported by most of the big players, including Google.
In other words, don't expect EVERY "well-behaved" bot to understand and respect your exclusions.
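
To illustrate the point, here is a rough Python sketch of how a bot that implements only plain prefix matching might evaluate a rule. The function name and paths are hypothetical, and real crawlers differ in the details. The `*` is treated as an ordinary character, so a wildcard rule silently matches nothing.

```python
def legacy_is_blocked(rule: str, path: str) -> bool:
    """Prefix-only matching: no wildcard support, '*' is a literal character."""
    # An empty Disallow value conventionally means "nothing is disallowed".
    return rule != "" and path.startswith(rule)


# The wildcard rule is not understood, so the nested folder stays crawlable.
print(legacy_is_blocked("/*images/", "/site1/images/a.png"))

# A plain prefix rule still works for such a bot.
print(legacy_is_blocked("/images/", "/images/a.png"))
```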


 2:14 am on Aug 30, 2011 (gmt 0)

I understand it's not honored by everything.

Basically, we have a SharePoint deployment, and every site listed below the domain has a similar structure. I do not want to spell out every disallow for every site.

If this gets me through the big dogs... that will be enough for me.
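
If explicit rules are ever needed for bots without wildcard support, the per-site lines could be generated rather than typed by hand. This is only a sketch: the site names, the folder name, and the `build_robots_txt` helper are all hypothetical, and it assumes each site sits one level below the root.

```python
def build_robots_txt(sites, folder="Images", use_wildcards=True):
    """Return robots.txt text blocking `folder` at the root and under each site."""
    lines = ["User-agent: *"]
    if use_wildcards:
        # One rule for crawlers that understand '*' (an extension, not universal).
        lines.append(f"Disallow: /*{folder}/")
    else:
        # Explicit prefix rules for crawlers that only do prefix matching.
        lines.append(f"Disallow: /{folder}/")
        lines.extend(f"Disallow: /{site}/{folder}/" for site in sites)
    return "\n".join(lines) + "\n"


# Hypothetical site names, for illustration only.
print(build_robots_txt(["site1", "site2"], use_wildcards=True))
print(build_robots_txt(["site1", "site2"], use_wildcards=False))
```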


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved