find -name *.bak -print0 | du -hc --files0-from=-

Show sizes and calculate sum of all files found by find


0
By: muffl0n
2013-08-23 14:15:33

These Might Interest You

  • * Find all file sizes and file names from the current directory down (replace "." with a target directory as needed). * sort the file sizes in numeric order * List only the duplicated file sizes * drop the file sizes so there are simply a list of files (retain order) * calculate md5sums on all of the files * replace the first instance of two spaces (md5sum output) with a \0 * drop the unique md5sums so only duplicate files remain listed * Use AWK to aggregate identical files on one line. * Remove the blank line from the beginning (This was done more efficiently by putting another "IF" into the AWK command, but then the whole line exceeded the 255 char limit). >>>> Each output line contains the md5sum and then all of the files that have that identical md5sum. All fields are \0 delimited. All records are \n delimited.


    0
    find . -type f -not -empty -printf "%-25s%p\n"|sort -n|uniq -D -w25|cut -b26-|xargs -d"\n" -n1 md5sum|sed "s/ /\x0/"|uniq -D -w32|awk -F"\0" 'BEGIN{l="";}{if(l!=$1||l==""){printf "\n%s\0",$1}printf "\0%s",$2;l=$1}END{printf "\n"}'|sed "/^$/d"
    alafrosty · 2013-10-22 13:34:19 0
  • Here's an annotated version of the command, using full-names instead of aliases. It is exactly equivalent to the short-hand version. # Recursively list all the files in the current directory. Get-ChildItem -Recurse | # Filter out the sub-directories themselves. Where-Object { return -not $_.PsIsContainer; } | # Group the resulting files by their extensions. Group-Object Extension | # Pluck the Name and Count properties of each group and define # a custom expression that calculates the average of the sizes # of the files in that group. # The back-tick is a line-continuation character. Select-Object ` Name, Count, @{ Name = 'Average'; Expression = { # Average the Length (sizes) of the files in the current group. return ($_.Group | Measure-Object -Average Length).Average; } } | # Format the results in a tabular view, automatically adjusted to # widths of the values in the columns. Format-Table -AutoSize ` @{ # Rename the Name property to something more sensible. Name = 'Extension'; Expression = { return $_.Name; } }, Count, @{ # Format the Average property to display KB instead of bytes # and use a formatting string to show it rounded to two decimals. Name = 'Average Size (KB)'; # The "1KB" is a built-in constant which is equal to 1024. Expression = { return $_.Average / 1KB }; FormatString = '{0:N2}' } Show Sample Output


    0
    ls -r | ?{-not $_.psiscontainer} | group extension | select name, count, @{n='average'; e={($_.group | measure -a length).average}} | ft -a @{n='Extension'; e={$_.name}}, count, @{n='Average Size (KB)'; e={$_.average/1kb}; f='{0:N2}'}
    brianpeiris · 2012-03-13 17:58:10 2

  • 1
    ls -l | awk '$5 > 1000000' | sort -k5n
    cfajohnson · 2009-11-20 21:18:45 3
  • no fancy grep stuff here.


    2
    du | sort -nr | cut -f2- | xargs du -hs
    unixmonkey7118 · 2009-11-20 17:14:40 1
  • List files and sizes


    0
    find / -type f -exec wc -c {} \; | sort -nr | head -100
    hemanth · 2009-08-22 09:38:44 0
  • Depending on your Apache access log configuration you may have to change the sum+=$11 to previous or next awk token. Beware, usually in access log last token is time of response in microseconds, penultimate token is size of response in bytes. You may use this command line to calculate sum and average of responses sizes. You can also refine the egrep regexp to match specific HTTP requests. Show Sample Output


    0
    egrep '.*(("STATUS)|("HEAD)).*' http_access.2012.07.18.log | awk '{sum+=$11; ++n} END {print "Tot="sum"("n")";print "Avg="sum/n}'
    fanchok · 2012-07-27 12:18:29 0

What do you think?

Any thoughts on this command? Does it work on your machine? Can you do the same thing with only 14 characters?

You must be signed in to comment.

What's this?

commandlinefu.com is the place to record those command-line gems that you return to again and again. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.

Share Your Commands



Stay in the loop…

Follow the Tweets.

Every new command is wrapped in a tweet and posted to Twitter. Following the stream is a great way of staying abreast of the latest commands. For the more discerning, there are Twitter accounts for commands that get a minimum of 3 and 10 votes - that way only the great commands get tweeted.

» http://twitter.com/commandlinefu
» http://twitter.com/commandlinefu3
» http://twitter.com/commandlinefu10

Subscribe to the feeds.

Use your favourite RSS aggregator to stay in touch with the latest commands. There are feeds mirroring the 3 Twitter streams as well as for virtually every other subset (users, tags, functions,…):

Subscribe to the feed for: