Forum Moderators: goodroi

Banning all subdirectories in a directory without the files?

zver0

11:14 pm on Feb 13, 2006 (gmt 0)

10+ Year Member



I have a directory with a lot of subdirectories that I don't want search engines to crawl. So I want to ban all the subdirs, but not the parent dir or the files in it.

For example:

dir/file1, dir/file2, dir/file3 must be crawled
but
dir/dir1/, dir/dir2/, dir/dir3/ to be disallowed.

Is that possible without disallowing individually every subdir?

Dijkgraaf

11:24 pm on Feb 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Only if the file names and the directory names are sufficiently different.
For your example you would have
Disallow: /dir/dir
or
Disallow: /dir/d
which would disallow dir1, dir2, dir3 etc. without disallowing file1, file2 etc., since Disallow rules match by prefix.

But I'd suspect that your directories have varying names.

In that case you would have to disallow the directories individually, or in groups if you can match the starting part of the directory names without also matching a file.
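To illustrate the prefix matching, a robots.txt along these lines (using the directory names from the original question) would block the subdirectories while leaving the files crawlable:

```
User-agent: *
Disallow: /dir/dir
```

Because Disallow rules match any URL path starting with the given string, this blocks /dir/dir1/, /dir/dir2/ and /dir/dir3/, but not /dir/file1 or /dir/file2.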

Little_G

11:29 pm on Feb 13, 2006 (gmt 0)

10+ Year Member



Hi,

The following script should help you to auto generate your robots.txt:
It's in PHP, hope that's not a problem.


<?php
header("Content-Type: text/plain");

// Scan the document root for subdirectories
// (DOCUMENT_ROOT has no trailing slash, so add one)
$dir = $_SERVER['DOCUMENT_ROOT'] . "/";
if (!($dh = opendir($dir))) {
    die("error - could not open directory");
}

$dirs = array();
while (false !== ($filename = readdir($dh))) {
    if ($filename != "." && $filename != "..") {
        if (is_dir($dir . $filename)) {
            $dirs[] = $filename;
        }
    }
}
closedir($dh);

echo "User-agent: *\n";
foreach ($dirs as $d) {
    // robots.txt rules take URL paths, not filesystem paths
    echo "Disallow: /" . $d . "/\n";
}
?>
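For the directory layout in the original question, the robots.txt you're aiming for would look something like this (listing each subdirectory explicitly):

```
User-agent: *
Disallow: /dir/dir1/
Disallow: /dir/dir2/
Disallow: /dir/dir3/
```

Note that the paths in the Disallow lines must be URL paths relative to the site root, not filesystem paths on the server.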

Andrew

zver0

1:17 pm on Feb 16, 2006 (gmt 0)

10+ Year Member



Thank you Andrew, the script did the job perfectly :)

Little_G

1:28 pm on Feb 16, 2006 (gmt 0)

10+ Year Member



Happy to help :)

Andrew