Forum Moderators: goodroi

robots.txt disallow searching recurring folders

     
8:10 pm on Aug 29, 2011 (gmt 0)

New User

5+ Year Member

joined:Aug 29, 2011
posts: 3
votes: 0


I'm searching my heart out, but must not be using the correct terms to get to the following answer.

We have a large site that has plenty of subdirectories of similar construction.

Within each subdir, there are specific folders that we want to disallow searches of.

I'm trying to ascertain how little I can get by with in the robots.txt file to achieve the desired results.

If the server has the following four structures:

www.server.com/Images/picture.gif
www.server.com/subdirectory1/Images/picture.gif
www.server.com/subdirectory2/Images/picture.gif
www.server.com/subdirectory3/andevenmorefolderstructure/Images/picture.gif

what disallow language would tell the search engines not to search any "Images" folder?

Would */Images/ do it?

or /Images/
*/Images/
to catch the root, and then all deeper buried folders?


or, must I specify entire subdir paths to the folders I want blocked?
/Images/
/subdirectory1/Images/
/subdirectory2/Images/
/subdirectory3/andevenmorefolderstructure/Images/
8:30 pm on Aug 29, 2011 (gmt 0)

Senior Member

g1smd (Top Contributor of All Time, 10+ Year Member)

joined:July 3, 2002
posts:18903
votes: 0


The pattern matching is "from the left".

So /images disallows any URL path "beginning" with /images

And /*images disallows any URL path "containing" images
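To make the "from the left" rule concrete, here is a minimal Python sketch of wildcard-aware matching as the major engines document it: `*` stands for any run of characters and a trailing `$` anchors the end of the path. The helper names `rule_to_regex` and `is_blocked` are made up for this illustration and are not part of any crawler. Note that this matching is case-sensitive, so a rule written as `images` would not cover the `Images` folders in the question.

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path rule into a regex (illustrative helper).

    '*' matches any run of characters, including '/'. A trailing '$'
    anchors the end of the URL path. Everything else is literal.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def is_blocked(path, rule):
    # re.match only matches at the start of the string,
    # which is what "from the left" means here.
    return rule_to_regex(rule).match(path) is not None

# The four example paths from the original question.
paths = [
    "/Images/picture.gif",
    "/subdirectory1/Images/picture.gif",
    "/subdirectory2/Images/picture.gif",
    "/subdirectory3/andevenmorefolderstructure/Images/picture.gif",
]

for rule in ("/Images/", "/*/Images/", "/*Images/"):
    blocked = [p for p in paths if is_blocked(p, rule)]
    print(f"{rule:12} blocks {len(blocked)} of {len(paths)} paths")
```

Under these assumptions, `/Images/` blocks only the root folder, `/*/Images/` blocks the three nested folders but not the root one, and `/*Images/` blocks all four.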
8:41 pm on Aug 29, 2011 (gmt 0)

New User

5+ Year Member

joined:Aug 29, 2011
posts: 3
votes: 0


Thanks...

And I need to include the trailing / ( /*images/ ) if I want to be sure I match only "images" folders and not "imageofsomething" folders?
8:55 pm on Aug 29, 2011 (gmt 0)

Senior Member

g1smd (Top Contributor of All Time, 10+ Year Member)

joined:July 3, 2002
posts:18903
votes: 0


Yes.
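A small sketch of what the trailing slash does and does not buy, assuming the same wildcard semantics (`*` matches any characters, matching starts at the left). The helper `blocked` and the sample paths are hypothetical. Strictly speaking, "imageofsomething" never contains the string "images", so neither form would match it; the slash matters for names such as "images-archive". It does not stop a match on folders that merely end in "images", such as "oldimages".

```python
import re

def blocked(path, rule):
    # Illustrative helper: '*' matches any characters, the rest of the
    # rule is literal, and the rule must match from the start of the path.
    regex = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.match(regex, path) is not None

# Hypothetical sample paths, not taken from a real site.
samples = [
    "/site/images/a.gif",
    "/site/images-archive/a.gif",
    "/site/oldimages/a.gif",
    "/site/imageofsomething/a.gif",
]

for rule in ("/*images", "/*images/", "/*/images/"):
    print(rule, [p for p in samples if blocked(p, rule)])
```

In this sketch `/*images/` drops "images-archive" but still catches "oldimages", while `/*/images/` matches only a folder named exactly "images". The latter does not cover a root-level `/images/` folder, so that one would need its own `/images/` rule.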
10:34 pm on Aug 29, 2011 (gmt 0)

Administrator

phranque (Top Contributor of All Time, 10+ Year Member)

joined:Aug 10, 2004
posts:10563
votes: 15


Also note that the wildcard/file-globbing patterns are not specified in the Robots Exclusion Protocol; they are extensions supported by most of the big players, including Google.
In other words, don't expect EVERY "well-behaved" bot to understand and respect your exclusions.
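For bots that only do plain prefix matching, a wildcard rule is just literal text and will not match anything useful. The sketch below models such a bot with a hypothetical `prefix_only_blocked` helper and shows a fallback: generating the spelled-out rules from a list of site roots. The `sites` list is an assumption built from the example paths in the question; a real deployment would supply its own list.

```python
def prefix_only_blocked(path, rules):
    # Behaviour assumed for a bot without wildcard support: a rule applies
    # only when the URL path literally starts with the rule text.
    # An empty Disallow value blocks nothing, so empty rules are skipped.
    return any(path.startswith(rule) for rule in rules if rule)

def explicit_rules(site_prefixes, folder="Images"):
    # Build one literal Disallow path per site, e.g. "/subdirectory1/Images/".
    return [f"{prefix.rstrip('/')}/{folder}/" for prefix in site_prefixes]

# Hypothetical list of site roots, taken from the example paths in the question.
sites = [
    "",
    "/subdirectory1",
    "/subdirectory2",
    "/subdirectory3/andevenmorefolderstructure",
]

url = "/subdirectory1/Images/picture.gif"
print(prefix_only_blocked(url, ["/*/Images/"]))         # wildcard rule read literally
print(prefix_only_blocked(url, explicit_rules(sites)))  # spelled-out rules

# Lines that could be pasted under a User-agent group in robots.txt.
for rule in explicit_rules(sites):
    print(f"Disallow: {rule}")
```

The first check comes back False and the second True, which is the gap described above. Whether the extra lines are worth maintaining depends on how much the non-wildcard bots matter for the site.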
2:14 am on Aug 30, 2011 (gmt 0)

New User

5+ Year Member

joined:Aug 29, 2011
posts: 3
votes: 0


I understand it's not honored by everything.

Basically, we have a SharePoint deployment, and every site listed below the domain has a similar structure. I do not want to spell out every disallow for every site.

If this gets me through the big dogs... that will be enough for me.
 
