robots.txt disallow searching reoccurring folders
jchristi msg:4356402 8:10 pm on Aug 29, 2011 (gmt 0)
I'm searching my heart out, but I must not be using the correct terms to find the following answer. We have a large site with plenty of subdirectories of similar construction. Within each subdirectory, there are specific folders that we want to keep search engines out of. I'm trying to work out how little I can get by with in the robots.txt file to achieve the desired result.
If the server has the following four structures:
www.server.com/Images/picture.gif
www.server.com/subdirectory1/Images/picture.gif
www.server.com/subdirectory2/Images/picture.gif
www.server.com/subdirectory3/andevenmorefolderstructure/Images/picture.gif
what disallow language would tell the search engines not to search any "Images" folder?
Would
*/Images/
do it? Or
/Images/
*/Images/
to catch the root, and then all deeper buried folders? Or must I specify the entire subdirectory path to each folder I want blocked?
/Images/
/subdirectory1/Images/
/subdirectory2/Images/
/subdirectory3/andevenmorefolderstructure/Images/
g1smd msg:4356408 8:30 pm on Aug 29, 2011 (gmt 0)
The pattern matching is "from the left".
So /images disallows any URL path "beginning" with /images
And /*images disallows any URL path "containing" images
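The "from the left" rule above can be sketched in a few lines of Python. This is only a rough model of the wildcard extension as the large search engines document it, not any crawler's actual code, and the function name rule_matches is made up for the illustration. It assumes * stands for any run of characters and a trailing $ anchors the end of the path. Google documents rule paths as case-sensitive, so the sketch uses "Images" as spelled in the original question rather than "images".

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a Disallow pattern matches a URL path.

    Hypothetical helper: models prefix matching plus the "*" and
    trailing "$" wildcard extensions. Matching is case-sensitive.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then let every "*" match any characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match only anchors at the start, so the rule works "from the left".
    return re.match(regex, path, flags=re.DOTALL) is not None


if __name__ == "__main__":
    paths = [
        "/Images/picture.gif",
        "/subdirectory1/Images/picture.gif",
        "/subdirectory3/andevenmorefolderstructure/Images/picture.gif",
        "/Imagesofsomething/picture.gif",
    ]
    for pattern in ("/Images", "/*Images", "/*Images/"):
        for path in paths:
            print(pattern, path, rule_matches(pattern, path))
```

Under these assumptions /*Images/ covers the root folder and every deeper "Images" folder with a single line, while leaving a folder such as /Imagesofsomething/ alone. It would still match a folder whose name merely ends in "Images" (for example a hypothetical /OldImages/). If that matters, the pair /Images/ and /*/Images/ is stricter.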
jchristi msg:4356414 8:41 pm on Aug 29, 2011 (gmt 0)
Thanks... and I need to include the trailing / ( /*images/ ) if I want to be sure I match only "images" folders and not "imageofsomething" folders?
g1smd msg:4356417 8:55 pm on Aug 29, 2011 (gmt 0)
Yes.
phranque msg:4356446 10:34 pm on Aug 29, 2011 (gmt 0)
Also note that the wildcarding/file globbing patterns are not specified in the robots exclusion protocol but rather are extensions supported by most of the big players, including Google. In other words, don't expect EVERY "well-behaved" bot to necessarily understand and respect your exclusions.
jchristi msg:4356510 2:14 am on Aug 30, 2011 (gmt 0)
I understand it's not honored by everything. Basically, we have a SharePoint deployment, and every site listed below the domain has a similar structure. I do not want to spell out every disallow for every site. If this gets me through the big dogs, that will be enough for me.