This pipeline finds duplicate files under a directory. Step by step, it does the following:
* Find all file sizes and file names from the current directory down (replace "." with a target directory as needed).
* Sort the file sizes in numeric order.
* List only the duplicated file sizes.
* Drop the file sizes so that only a list of files remains (order is retained).
* Calculate md5sums for all of those files.
* Replace the first instance of two spaces (the separator in md5sum output) with a \0.
* Drop the unique md5sums so that only duplicate files remain listed.
* Use AWK to aggregate identical files on one line.
* Remove the blank line from the beginning. (Putting another "IF" into the AWK command does this more efficiently, but the whole line then exceeded the 255-character limit.)
>>>> Each output line contains the md5sum followed by all of the files that share that md5sum. All fields are \0 delimited. All records are \n delimited.