
What's the Proper Way to Block Sub Folders

Looking for the best way to block 2nd level sub folders

     
12:12 pm on Mar 28, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 28, 2002
posts:1324
votes: 0


I'm looking at a new site where I need to block the 2nd level of sub folders but still keep the first level indexed. I've never actually tried it this way, so I'm looking for some guidance. Here's the structure:

example.com/username1/foo/
example.com/username2/foo/
example.com/username3/foo/
...
example.com/username5000/foo/

There are going to be thousands of "username" folders, and I want them indexed. However, I don't want any of the "foo" folders indexed. Which is the best way to block those folders?

User-agent: *
Disallow: /foo/

or

User-agent: *
Disallow: /*/foo/

We'll be using meta tags as well; I just want to keep the robots file in order.

12:52 pm on Mar 28, 2007 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


According to the Web Server Administrator's Guide to the Robots Exclusion Protocol [robotstxt.org]:
Note also that regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif".

in other words, they so don't want to support wildcarding that they mistakenly refer to it as a regular expression, which they also don't want to support.

note also that grammar are not important to robots...
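
For what it's worth, without wildcard support the only strictly standard-compliant way to express this is to list every directory explicitly, which clearly doesn't scale to thousands of usernames (the paths below are just placeholders taken from the example at the top of the thread):

User-agent: *
Disallow: /username1/foo/
Disallow: /username2/foo/
Disallow: /username3/foo/
# ...one line per username folder, all 5000+ of them

Hence the appeal of the wildcard extensions the big engines have added on top of the standard.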

6:07 pm on Mar 28, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 28, 2002
posts:1324
votes: 0


heh, well I'm not so much concerned about the robots protocol as I am about Google, which seems to be supporting it

[google.com...]

6:43 pm on Mar 28, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member jimbeetle is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Oct 26, 2002
posts:3292
votes: 6


G and Y! support wildcards; I don't think Ask does as yet, nor do most of those other pesky little critters that flit around all over the net. So anything blocked only with wildcards is going to wind up in the wild eventually anyway. I think the best bet is to use the robots meta. Just be sure not to block the subfolders in robots.txt, so bots will be able to read and obey the instruction.

<added>
And of course, if there are links to the pages in the subfolders, the meta robots will ensure that those pages don't wind up as a URL-only listing in the SERPs.
</added>
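
For anyone following along, the meta tag in question would go in the <head> of every page under the /foo/ directories, something along these lines (a minimal sketch; whether you want "noindex" alone or "noindex, nofollow" is your call):

<meta name="robots" content="noindex">

And, per the point above, those URLs have to stay crawlable in robots.txt or the bots will never see the tag.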

7:43 pm on Mar 28, 2007 (gmt 0)

Administrator from US 

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:June 21, 2004
posts:3080
votes: 67


Google Pattern Matching Instructions [google.com]
Yahoo Wildcards Instructions [ysearchblog.com]
MSN Instructions [search.msn.com]<cough> careful with msn

also you may want to use htaccess
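
One way to read the htaccess suggestion (a sketch only, assuming Apache with mod_rewrite enabled, and bot names you'd want to adjust) is to refuse known crawlers at the server level for any second-level /foo/ path, so robots.txt wildcard support stops mattering:

RewriteEngine On
# match Googlebot, Yahoo Slurp or msnbot requesting /anything/foo/...
RewriteCond %{HTTP_USER_AGENT} (Googlebot|Slurp|msnbot) [NC]
# ...and answer with a 403 Forbidden
RewriteRule ^[^/]+/foo(/|$) - [F,L]

Keep in mind a 403 also stops the bot from reading any meta robots tag on those pages, so treat it as an alternative to the meta approach above, not a supplement.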

 
