Sum the sizes of all files of a given type in all subdirectories (in bytes)

SUM=0; for FILESIZE in `find /tmp -type f -iname \*pdf -exec du -b {} \; 2>/dev/null | cut -f1` ; do (( SUM += $FILESIZE )) ; done ; echo "sum=$SUM"
This example sums the sizes (in bytes) of all PDF files in the /tmp directory and its subdirectories. Replace "/tmp" with a directory path of your choice, and "\*pdf" (or even the whole "-iname \*pdf" test) with your own pattern to match a specific type of file. You can also change the du option to count in kilobytes or megabytes, but because du rounds each file's size the sum will not be exact (especially with a lot of small files counted in megabytes). In some cases you could probably use something like this: du -cb `find /tmp -type f -iname \*pdf`|tail -n 1 But be aware that this second command CANNOT handle files with spaces in their names, and it will mislead you if some files matching the pattern are ones you don't have permission to read. The first one-liner is resistant to such problems (it skips the files you can't read but still gives the correct sum for the rest).
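As a rough sketch of the same idea without the shell loop, the sizes can be taken straight from find and added up by awk. This is a different technique from the one-liner above, not the author's method: it assumes GNU find (for -printf), and sum_bytes is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper: sum the sizes (in bytes) of files matching a pattern.
# Assumes GNU find for -printf; the awk part is plain POSIX awk.
sum_bytes() {
    dir=$1
    pattern=$2
    # %s prints each file's size in bytes, one per line, so file names
    # (with or without spaces) never reach awk at all.
    # printf "%.0f" prints 0 when nothing matched and avoids exponent
    # notation for large totals in some awk implementations.
    find "$dir" -type f -iname "$pattern" -printf '%s\n' 2>/dev/null |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

For example, sum_bytes /tmp '*pdf' would print a single number for the PDF files under /tmp. Quoting the pattern keeps the shell from expanding it before find sees it.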
By: alcik
2009-03-05 17:16:52


What Others Think

This can be done with less typing if you have awk/gawk, and some of the find options may not be necessary (how many directories are named "dirname.pdf"?). This is what I use: find /tmp -iname "*.pdf" -exec du -b {} \; | awk '{t=t+$1} END {print t}' A common solution to the "filenames with spaces" problem is find -name "whatever" -print0 | xargs -0 du, which NUL-terminates the filenames so that spaces are no longer a problem. xargs is frequently used in concert with find, and in some cases your command will probably complete much sooner: instead of invoking du once per file (as in this example), xargs invokes du on many arguments at once. Each file is still examined once, but far fewer processes are created. xargs can also place the piped-in arguments at any position in a command using the -I (capital i) option.
bwoodacre · 654 weeks and 5 days ago
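The two xargs modes mentioned in this comment can be tried on a scratch directory. The sketch below assumes GNU du (for -b) and an xargs that supports -0; the directory and file names are made up for the demonstration.

```shell
# Scratch directory with an awkward file name, so the demo does not
# depend on whatever happens to be in /tmp.
demo=$(mktemp -d)
printf 'abc' > "$demo/plain.pdf"
printf 'defgh' > "$demo/with space.pdf"

# Batched: xargs hands many NUL-terminated names to each du process,
# and awk adds up the first column (the size in bytes).
batched=$(find "$demo" -type f -iname '*.pdf' -print0 |
    xargs -0 du -b | awk '{ t += $1 } END { print t }')

# -I {} puts each name at a chosen position in the command,
# at the cost of starting one process per file again.
listed=$(find "$demo" -type f -iname '*.pdf' -print0 |
    xargs -0 -I {} echo "found: {}" | sort)

echo "$batched"
echo "$listed"
rm -rf "$demo"
```

The batched form is the one that saves process creations; -I is mainly useful when the file name has to appear somewhere other than the end of the command.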
1) > how many directories are named "dirname.pdf" ? Well, I want it to be more universal. Without "-type f", counting with a "\*java" pattern on a source tree would get me into trouble, because I also have some directories called "java" here. The same goes for "\*mp3" across a whole system. 2) You are perfectly right. I was looking into "-print" but hadn't got as far as "-print0". And I don't know "xargs" well enough; I have to change that ;-). So the command could look like this: find /tmp -type f -name \*pdf -print0 2>/dev/null | xargs -0 du -bc | tail -n 1 It is MUCH, MUCH faster. Thanks. :-D
alcik · 654 weeks and 5 days ago
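One caveat with the "du -bc | tail -n 1" form: when the file list is too long for a single command line, xargs starts du more than once, each run prints its own "total" line, and tail keeps only the last one. The sketch below forces that situation with -n 2 on a scratch directory and sums every "total" line instead. It assumes GNU du and sets LC_ALL=C so that the word "total" is not translated; the file names are invented for the demonstration.

```shell
demo=$(mktemp -d)
printf 'abc' > "$demo/one.pdf"
printf 'defgh' > "$demo/two.pdf"
printf 'ijklmno' > "$demo/three.pdf"

# -n 2 makes xargs run du twice, imitating a very long file list.
# tail -n 1 then sees only the total of the final batch.
last_only=$(find "$demo" -type f -name '*pdf' -print0 |
    LC_ALL=C xargs -0 -n 2 du -bc | tail -n 1 | cut -f1)

# Adding up every "total" line covers all batches.
all_batches=$(find "$demo" -type f -name '*pdf' -print0 |
    LC_ALL=C xargs -0 -n 2 du -bc |
    awk -F'\t' '$2 == "total" { t += $1 } END { printf "%.0f\n", t }')

echo "last batch only: $last_only"
echo "all batches:     $all_batches"
rm -rf "$demo"
```

Because find prints full paths here, a regular file that happens to be named "total" shows up as ".../total" and is not mistaken for a summary line.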
