This dup finder saves time by comparing size first, then md5sum, it doesn't delete anything, just lists them.
Why remember? Generate! Up to 48 chars, works on any unix-like system (NB: BSD use md5 instead of md5sum) Show Sample Output
Calculates md5 sum of files. sort (required for uniq to work). uniq based on only the hash. use cut ro remove the hash from the result.
Original author unknown (I believe off of a wifi hacking forum). Used in conjuction with ifconfig and cron.. can be handy (especially spoofing AP's) Show Sample Output
This can be much faster than downloading one or both trees to a common servers and comparing the files there. After, only those files could be copied down for deeper comparison if needed. Show Sample Output
usage: sitepass MaStErPaSsWoRd example.com description: An admittedly excessive amount of hashing, but this will give you a pretty secure password, It also eliminates repeated characters and deletes itself from your command history. tr '!-~' 'P-~!-O' # this bit is rot47, kinda like rot13 but more nerdy rev # this avoids the first few bytes of gzip payload, and the magic bytes. Show Sample Output
Command makes use of the Malware Hash Registry (http://www.team-cymru.org/Services/MHR/). It parses the current directory and subdirectories and calculates the md5 hash of the files, then prints the name and sends the hash to the MHR for a lookup in their database. The 3rd value in the result is the detection percentage across a mix of AV packages. Show Sample Output
[re]verify those burned CD's early and often - better safe than sorry - at a bare minimum you need the good old `dd` and `md5sum` commands, but why not throw in a super "user-friendly" progress gauge with the `pv` command - adjust the ``-s'' "size" argument to your needs - 700 MB in this case, and capture that checksum in a "test.md5" file with `tee` - just in-case for near-future reference. *uber-bonus* ability - positively identify those unlabeled mystery discs - for extra credit, what disc was used for this sample output? Show Sample Output
Found this little gem here: http://info.michael-simons.eu/2008/10/25/recursively-md5sum-all-files-in-a-directory-tree/
This is usefull to diff 2 paths in branches of software, or in different versions of a same zip file. So you can get the real file diff. Show Sample Output
All valid files are withheld so only failures show up. No output, all checks good.
This was useful to generate random passwords to some webpage users, using the sample code, inside a bash script Show Sample Output
Improvement of the command "Find Duplicate Files (based on size first, then MD5 hash)" when searching for duplicate files in a directory containing a subversion working copy. This way the (multiple dupicates) in the meta-information directories are ignored.
Can easily be adopted for other VCS as well. For CVS i.e. change ".svn" into ".csv":
find -type d -name ".csv" -prune -o -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type d -name ".csv" -prune -o -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
Show Sample Output
Finds duplicates based on MD5 sum. Compares only files with the same size. Performance improvements on:
find -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
The new version takes around 3 seconds where the old version took around 17 minutes. The bottle neck in the old command was the second find. It searches for the files with the specified file size. The new version keeps the file path and size from the beginning.
This is a beginning script. You can create a file with > filename. You can also use diff to compare output run at different times to verify no change in your files. I apologize in advance if this is too simple. For some it should be a start.
I had the problem that the Md5 Sum of a file changed after copying it to my external disk. This unhandy command helped me to fix the problem. Show Sample Output
Generates the md5 hash, without the trailing " -" and with the output "broken" into pairs of hexs. Show Sample Output
commandlinefu.com is the place to record those command-line gems that you return to again and again. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.
Every new command is wrapped in a tweet and posted to Twitter. Following the stream is a great way of staying abreast of the latest commands. For the more discerning, there are Twitter accounts for commands that get a minimum of 3 and 10 votes - that way only the great commands get tweeted.
» http://twitter.com/commandlinefu
» http://twitter.com/commandlinefu3
» http://twitter.com/commandlinefu10
Use your favourite RSS aggregator to stay in touch with the latest commands. There are feeds mirroring the 3 Twitter streams as well as for virtually every other subset (users, tags, functions,…):
Subscribe to the feed for: