| Count files in a directory and each subdirectory: enhanced script from popular 2006 thread |
TimD

msg:3816108 | 8:04 am on Dec 30, 2008 (gmt 0) | The following script is based on one written by jjc_mn, posted in Sept 2006 (http://www.webmasterworld.com//linux/3087663.htm). Judging by its high ranking in a Google search, I'm guessing that it has been quite a popular question.

My version gets passed a directory name as a parameter. It then outputs a line for each sub-directory that contains files, showing the directory name (excluding the portion passed as a parameter), along with the number of files (tracks) directly within that directory.

Hopefully someone will find it useful. Many thanks to jjc_mn for providing the starting point.

#!/bin/bash
# counts files in subdirectories
#set -x
top_dir=$1
if [ -z "$top_dir" ]
then
  echo "Directory path must be given"
  exit -1
fi
if [ ! -d "$top_dir" ]
then
  echo "Directory path given is not valid"
  exit -2
fi
echo "Analysing files in $top_dir..."
date_stamp=`date +"%m%d%Y.%H%M%S"`
temp_file="temp.$date_stamp.txt"
done_loop=""
# list every directory under top_dir, sorted, into a temporary file
find "$top_dir" -type d | sort > "$temp_file"
exec 3< "$temp_file"
until [ $done_loop ]
do
  read <&3 myline
  if [ $? != 0 ]
  then
    done_loop=1
    continue
  fi
  # count only the files directly inside this directory (deeper levels are pruned)
  dir_count=`find "$myline/." \( -name . -o -prune \) -type f | wc -l`
  if [ $dir_count != 0 ]
  then
    short_name=${myline:${#top_dir}}
    short_name=${short_name#/}
    short_name=${short_name:=$top_dir}
    count_desc="tracks"
    if [ $dir_count = 1 ]
    then
      count_desc="track"
    fi
    echo "$short_name ($dir_count $count_desc)"
  fi
done
rm "$temp_file"
#EOF
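For comparison, here is one way the same per-directory count might be written without the temporary file, by piping find straight into the loop. This is only a sketch: count_tracks is a made-up function name, and it assumes that directory names contain no newline characters.

```shell
#!/bin/bash
# Sketch only: count_tracks is a hypothetical name, not part of the original script.
# Prints "<subdirectory> (<n> track/tracks)" for every directory that directly
# contains at least one regular file. Assumes no newlines in directory names.
count_tracks() {
  local top_dir=$1
  find "$top_dir" -type d | sort | while IFS= read -r dir
  do
    # Same trick as the original: start at "$dir/." and prune everything below
    # it, so only files directly inside $dir are counted.
    count=$(find "$dir/." \( -name . -o -prune \) -type f | wc -l)
    count=$((count))              # normalise padding that some wc versions add
    [ "$count" -eq 0 ] && continue
    short_name=${dir#"$top_dir"}  # drop the portion passed as a parameter
    short_name=${short_name#/}
    desc="tracks"
    [ "$count" -eq 1 ] && desc="track"
    echo "${short_name:-$top_dir} ($count $desc)"
  done
}
```

Called as count_tracks "/path/to/music", it should print the same kind of lines as the script above, and quoting every expansion keeps directory names with spaces intact.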
|
maxchk

msg:3824992 | 1:49 am on Jan 13, 2009 (gmt 0) | Wouldn't it be simpler to run:

find . -type f | awk '{dir=gensub(/(.+\/).+/,"\\1","g",$0); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",d,dir_list[d]}' | sort

or, to exclude hidden dirs/files from the search:

find . -type f | awk '!/\/\..*/ {dir=gensub(/(.+\/).+/,"\\1","g",$0); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",d,dir_list[d]}' | sort
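One caveat: gensub is a GNU awk (gawk) extension, so the one-liners above may fail where awk is mawk or another implementation. A rough equivalent using only the standard sub() function could look like the sketch below; count_by_dir is a hypothetical wrapper name, and the starting directory defaults to "." as an assumption.

```shell
# Sketch only: count_by_dir is a hypothetical name.
# Prints "<directory>/ <number of files>" for each directory that holds files.
count_by_dir() {
  find "${1:-.}" -type f | awk '
    {
      dir = $0
      sub(/\/[^\/]*$/, "/", dir)   # cut the file name, keep the trailing slash
      dir_list[dir]++              # one more file seen in this directory
    }
    END {
      for (d in dir_list) printf "%s %s\n", d, dir_list[d]
    }' | sort
}
```

Running count_by_dir . from the directory of interest should give the same lines as the gensub version.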
|
coopster

msg:3826419 | 8:57 pm on Jan 14, 2009 (gmt 0) | I'd like to welcome you both to WebmasterWorld ... maxchk, would you mind breaking that string down for us in a quick explanation of each part?
|
maxchk

msg:3826897 | 10:02 am on Jan 15, 2009 (gmt 0) | OK, easy.

"find . -type f" makes sure that we are looking ONLY for files, not directories, and gives a list of filenames with a relative path to each, including subdirectories.

awk '{
  # $0 will contain relative path to the file
  # we do not care about file name, so will strip it
  # for details on gensub check awk man page
  # so, variable "dir" will hold the directory name
  dir=gensub(/(.+\/).+/,"\\1","g",$0);
  # add directory name to the hash "dir_list"
  # at the same time increase the counter by one
  # for that directory name
  dir_list[dir]++
}
END {
  # read the hash and print values
  # "d" will be directory name
  # hash element "dir_list[d]" will be a number
  # showing how many files are in that directory
  for (d in dir_list) printf "%s %s\n",d,dir_list[d]
}'

"sort" will sort the output :)

The extra part in the second line (the one to exclude hidden files) says:

!/\/\..*/

which means: treat anything that has a slash followed by a dot in its path as a hidden file/dir and ignore it.

Thanks you for welcoming us :)
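To see what the hidden-file pattern does on its own, it can be tried against a few sample paths. The sketch below uses a hypothetical filter_hidden wrapper and invented paths; the pattern itself is the one from the post.

```shell
# Sketch only: filter_hidden is a hypothetical name; the sample paths are invented.
# Lines matching "slash followed by dot" are dropped, everything else is printed.
filter_hidden() {
  awk '!/\/\..*/'
}

printf '%s\n' ./a/file ./.git/config ./a/.hidden ./b.c/d | filter_hidden
```

Of the four sample paths, only ./a/file and ./b.c/d come through: a dot inside a name is fine, and the leading "./" is a dot followed by a slash, not the other way round. Note that the test applies to the whole path, so starting find from a path that itself contains a hidden component (for example one under ~/.config) would filter out every file.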
|
phranque

msg:3826994 | 12:45 pm on Jan 15, 2009 (gmt 0) | welcome to WebmasterWorld [webmasterworld.com], TimD and maxchk! that is really sweet stuff! i haven't used awk enough recently. and i never thought about using find -type f that way. i've always wondered what the ls option was to include the path and now i know... =8)
|
coopster

msg:3827159 | 3:46 pm on Jan 15, 2009 (gmt 0) |
| Thanks you for welcoming us |
And thank you for breaking it down.
| that is really sweet stuff! |
Indeed. I'm always impressed by the power of the command line in the hands of an artist.
|
TimD

msg:3827323 | 7:06 pm on Jan 15, 2009 (gmt 0) | Thanks all, for the words of welcome! Maxchk, I've gotta agree with Coopster - amazing what can be achieved at the command line, in the right hands! And thanks for breaking it down for the rest of us.
|
phranque

msg:3827529 | 12:17 am on Jan 16, 2009 (gmt 0) | this is a good time to point out the "flag it" thread option to the newbies because i am using it on this thread! =8)
|
coopster

msg:3827853 | 2:47 pm on Jan 16, 2009 (gmt 0) | I just read up a little on gensub (general substitution), and the reason it looks so familiar is that it is exactly what it appears to be: a regular-expression substitution. The first argument is the regex, the second is the replacement (in this case a matched subpattern from the regex itself), the third argument here denotes "global" replacement, and the last is the target. If no target is specified, the entire input record ($0) is used. So, if you really wanted to, you could shorten the command by dropping the last argument in your gensub:

gensub(/(.+\/).+/,"\\1","g");

I learned something new today! Thanks.
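A quick way to check that the three-argument and four-argument forms behave the same is to wrap each in a small function and feed both the same path. This sketch assumes GNU awk is installed under the name gawk; the two function names and the sample path are made up for illustration.

```shell
# Sketch only: requires GNU awk (gawk); gensub does not exist in POSIX awk or mawk.
# strip_name_explicit and strip_name_default are hypothetical names.

# Four-argument form: the target ($0) is named explicitly.
strip_name_explicit() {
  gawk '{ print gensub(/(.+\/).+/, "\\1", "g", $0) }'
}

# Three-argument form: the target is omitted, so gensub works on $0.
strip_name_default() {
  gawk '{ print gensub(/(.+\/).+/, "\\1", "g") }'
}
```

Piping the line ./music/album/track01.mp3 into either function should print ./music/album/ in both cases. gensub returns the changed string and leaves $0 itself untouched, which is why the result has to be printed or assigned.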
|