
Linux, Unix, and *nix like Operating Systems Forum

    
Count files in a directory and each subdirectory
Enhanced script from popular 2006 thread
TimD




msg:3816108
 8:04 am on Dec 30, 2008 (gmt 0)

The following script is based on one written by jjc_mn, posted in Sept 2006 (http://www.webmasterworld.com//linux/3087663.htm). Judging by its high ranking in a Google search, I'm guessing that it has been quite a popular question.

My version gets passed a directory name as a parameter. It then outputs a line for each sub-directory, showing the directory name (excluding the portion passed as a parameter), along with the number of files (tracks) within the directory.

Hopefully someone will find it useful. Many thanks to jjc_mn for providing the starting point.

#!/bin/bash
# Counts the files directly inside a directory and each of its subdirectories
#set -x

top_dir=$1
if [ -z "$top_dir" ]
then
    echo "Directory path must be given" >&2
    exit 1
fi
if [ ! -d "$top_dir" ]
then
    echo "Directory path given is not valid" >&2
    exit 2
fi
echo "Analysing files in $top_dir..."

# Keep the directory list in a temporary file created by mktemp, so that it
# is not counted as a file when the script is run on the current directory
temp_file=$(mktemp)
done_loop=""
find "$top_dir" -type d | sort > "$temp_file"

# Read the list one line at a time through file descriptor 3
exec 3< "$temp_file"
until [ "$done_loop" ]
do
    if ! IFS= read -r myline <&3
    then
        done_loop=1
        continue
    fi

    # Count the regular files in this directory only, without descending further
    dir_count=$(find "$myline/." \( -name . -o -prune \) -type f | wc -l)
    if [ "$dir_count" -ne 0 ]
    then
        # Strip the top-level path; fall back to it for the top directory itself
        short_name=${myline:${#top_dir}}
        short_name=${short_name#/}
        short_name=${short_name:=$top_dir}
        count_desc="tracks"
        if [ "$dir_count" -eq 1 ]
        then
            count_desc="track"
        fi
        echo "$short_name ($dir_count $count_desc)"
    fi
done
exec 3<&-
rm "$temp_file"
#EOF
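
The part of the script that does the real work is the find call with -prune, which counts only the files sitting directly in one directory. Here is a minimal sketch of that idiom on its own. The helper name count_direct_files and the throwaway tree built with mktemp are assumptions made for illustration, not part of the script above.

```shell
#!/bin/bash
# count_direct_files is a hypothetical helper name, not part of the original script.
# It prints the number of regular files directly inside the directory $1,
# without descending into subdirectories.
count_direct_files() {
    # Only the starting point "$1/." is named ".", so it is the only path
    # find descends into; every entry below it is pruned after being tested.
    find "$1/." \( -name . -o -prune \) -type f | wc -l | tr -d ' '
}

# Build a small throwaway tree to try it on.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/album"
touch "$demo_dir/a.mp3" "$demo_dir/album/b.mp3" "$demo_dir/album/c.mp3"

top_count=$(count_direct_files "$demo_dir")
album_count=$(count_direct_files "$demo_dir/album")
echo "top: $top_count, album: $album_count"

rm -r "$demo_dir"
```

The tr call only strips the padding that some wc implementations put in front of the number. File names containing newlines would be miscounted, as in the original script.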

 

maxchk




msg:3824992
 1:49 am on Jan 13, 2009 (gmt 0)

Wouldn't it be simpler to run:
find . -type f | awk '{dir=gensub(/(.+\/).+/,"\\1","g",$0); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",d,dir_list[d]}' | sort

or, to exclude hidden dirs/files from the search:
find . -type f | awk '!/\/\..*/ {dir=gensub(/(.+\/).+/,"\\1","g",$0); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",d,dir_list[d]}' | sort
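
One caveat: gensub is a GNU awk extension, so the commands above need gawk. A rough equivalent for other awks is sketched below. It is not a drop-in replacement: it strips the file name with the standard sub function instead. The function name count_files_per_dir is made up for illustration.

```shell
#!/bin/bash
# count_files_per_dir is a hypothetical name used only for this sketch.
# Prints "directory count" for every directory under $1 (default: .)
# that contains at least one regular file.
count_files_per_dir() {
    find "${1:-.}" -type f | awk '
        {
            dir = $0
            # Drop the file name but keep the trailing slash.
            sub("/[^/]*$", "/", dir)
            dir_list[dir]++
        }
        END {
            for (d in dir_list) printf "%s %s\n", d, dir_list[d]
        }' | sort
}
```

Called as count_files_per_dir . it should give the same "directory count" lines as the gensub version, provided no file name contains a newline.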

coopster




msg:3826419
 8:57 pm on Jan 14, 2009 (gmt 0)

I'd like to welcome you both to WebmasterWorld ...
maxchk, would you mind breaking that string down for us in a quick explanation of each part?

maxchk




msg:3826897
 10:02 am on Jan 15, 2009 (gmt 0)

OK, easy.

"find . -type f" makes sure that we are looking ONLY for files, not directories, and gives a list of file names with their relative paths, including those in subdirectories.

awk '{
    # $0 contains the relative path to the file
    # we do not care about the file name, so strip it
    # (for details on gensub, check the awk man page)
    # the variable "dir" will then hold the directory name
    dir=gensub(/(.+\/).+/,"\\1","g",$0);
    # add the directory name to the hash "dir_list"
    # and at the same time increase the counter
    # for that directory name by one
    dir_list[dir]++
}
END {
    # read the hash and print the values
    # "d" will be the directory name
    # hash element "dir_list[d]" will be a number
    # showing how many files are in that directory
    for (d in dir_list)
        printf "%s %s\n",d,dir_list[d]
}'

"sort" will sort the output :)

The extra part in the second line (the one to exclude hidden files) says:
!/\/\..*/
which means: treat any path that has a slash followed by a dot as a hidden file/dir and ignore it.
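
To see the filter on its own, here is a small sketch that feeds it a few made-up paths. The function name drop_hidden_paths and the sample paths are assumptions for illustration. The trailing .* in the pattern is left out because it does not change which lines match.

```shell
#!/bin/bash
# drop_hidden_paths is a hypothetical name used only for this sketch.
# Reads paths on stdin and prints only those that contain no "/." sequence,
# i.e. no file or directory name starting with a dot.
drop_hidden_paths() {
    awk '!/\/\./'
}

# Sample paths in the form that "find . -type f" prints them.
visible=$(printf '%s\n' ./a.txt ./.git/config ./docs/.hidden ./docs/b.txt | drop_hidden_paths)
echo "$visible"
```

Only ./a.txt and ./docs/b.txt get through. The leading "./" that find puts on every path is not affected, because there the dot comes before the slash rather than after it.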

Thank you for welcoming us :)

phranque




msg:3826994
 12:45 pm on Jan 15, 2009 (gmt 0)

welcome to WebmasterWorld, TimD and maxchk!

that is really sweet stuff!
i haven't used awk enough recently.
and i never thought about using find -type f that way.
i've always wondered what the ls option was to include the path and now i know...
=8)

coopster




msg:3827159
 3:46 pm on Jan 15, 2009 (gmt 0)

Thank you for welcoming us

And thank you for breaking it down.

that is really sweet stuff!

Indeed. I'm always impressed by the power of the command line in the hands of an artist.

TimD




msg:3827323
 7:06 pm on Jan 15, 2009 (gmt 0)

Thanks all, for the words of welcome!

Maxchk, I've gotta agree with Coopster - amazing what can be achieved at the command line, in the right hands! And thanks for breaking it down for the rest of us.

phranque




msg:3827529
 12:17 am on Jan 16, 2009 (gmt 0)

this is a good time to point out the "flag it" thread option to the newbies because i am using it on this thread!
=8)

coopster




msg:3827853
 2:47 pm on Jan 16, 2009 (gmt 0)

I just read up a little on gensub (general substitution) and the reason it looks so familiar is that it is exactly what it appears to be, a regular expression. The first argument is the regex, the second the replacement (in this case a matched subpattern in the regex itself), the third argument here denotes "global" replacement and the last is the target. If no target is specified then the entire input record ($0) is used. So, if you really wanted to you could shorten the command by dropping the last argument in your gensub:

gensub(/(.+\/).+/,"\\1","g");

I learned something new today! Thanks.
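
A quick way to try the three-argument form, sketched under the assumption that GNU awk is installed as gawk (other awks do not provide gensub). The helper name strip_file_name and the sample path are made up for illustration.

```shell
#!/bin/bash
# Assumes GNU awk is installed as "gawk"; gensub is a gawk extension.
# strip_file_name is a hypothetical helper name used only for this sketch.
strip_file_name() {
    # No fourth argument: gensub works on $0 and returns the result,
    # leaving $0 itself unchanged.
    gawk '{ print gensub(/(.+\/).+/, "\\1", "g") }'
}

# Run the demo only when gawk is actually available.
if command -v gawk >/dev/null 2>&1; then
    stripped=$(printf '%s\n' ./music/album/track.mp3 | strip_file_name)
    echo "$stripped"
fi
```

Because .+ is greedy, the group captures everything up to and including the last slash, so only the file name is dropped.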

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved