This dup finder saves time by comparing size first, then md5sum, it doesn't delete anything, just lists them.
Why remember? Generate! Up to 48 chars, works on any unix-like system (NB: BSD use md5 instead of md5sum) Show Sample Output
This one-liner will the *delete* without any further confirmation all 100% duplicates but one based on their md5 hash in the current directory tree (i.e including files in its subdirectories).
Good for cleaning up collections of mp3 files or pictures of your dog|cat|kids|wife being present in gazillion incarnations on hd.
md5sum can be substituted with sha1sum without problems.
The actual filename is not taken into account-just the hash is used.
Whatever sort thinks is the first filename is kept.
It is assumed that the filename does not contain 0x00.
As per the good suggestion in the first comment, this one does a hard link instead:
find . -xdev -type f -print0 | xargs -0 md5sum | sort | perl -ne 'chomp; $ph=$h; ($h,$f)=split(/\s+/,$_,2); if ($h ne $ph) { $k = $f; } else { unlink($f); link($k, $f); }'
Show Sample Output
Thanks to OpenSSL, you can quickly and easily generate MD5 hashes for your passwords.
Alternative (thanks to linuxrawkstar and atoponce):
echo -n 'text to be encrypted' | md5sum -
Note that the above method does not utlise OpenSSL.
Show Sample Output
Calculates md5 sum of files. sort (required for uniq to work). uniq based on only the hash. use cut ro remove the hash from the result.
Original author unknown (I believe off of a wifi hacking forum). Used in conjuction with ifconfig and cron.. can be handy (especially spoofing AP's) Show Sample Output
This can be much faster than downloading one or both trees to a common servers and comparing the files there. After, only those files could be copied down for deeper comparison if needed. Show Sample Output
usage: sitepass MaStErPaSsWoRd example.com description: An admittedly excessive amount of hashing, but this will give you a pretty secure password, It also eliminates repeated characters and deletes itself from your command history. tr '!-~' 'P-~!-O' # this bit is rot47, kinda like rot13 but more nerdy rev # this avoids the first few bytes of gzip payload, and the magic bytes. Show Sample Output
Command makes use of the Malware Hash Registry (http://www.team-cymru.org/Services/MHR/). It parses the current directory and subdirectories and calculates the md5 hash of the files, then prints the name and sends the hash to the MHR for a lookup in their database. The 3rd value in the result is the detection percentage across a mix of AV packages. Show Sample Output
[re]verify those burned CD's early and often - better safe than sorry - at a bare minimum you need the good old `dd` and `md5sum` commands, but why not throw in a super "user-friendly" progress gauge with the `pv` command - adjust the ``-s'' "size" argument to your needs - 700 MB in this case, and capture that checksum in a "test.md5" file with `tee` - just in-case for near-future reference. *uber-bonus* ability - positively identify those unlabeled mystery discs - for extra credit, what disc was used for this sample output? Show Sample Output
Found this little gem here: http://info.michael-simons.eu/2008/10/25/recursively-md5sum-all-files-in-a-directory-tree/
this string of commands will release your dhcp address, change your mac address, generate a new random hostname and then get a new dhcp lease.
Next time you are leaching off of someone else's wifi use this command before you start your bittorrent ...for legitimate files only of course. It creates a hexidecimal string using md5sum from the first few lines of /dev/urandom and splices it into the proper MAC address format. Then it changes your MAC and resets your wireless (wlan0:0). Show Sample Output
This is usefull to diff 2 paths in branches of software, or in different versions of a same zip file. So you can get the real file diff. Show Sample Output
All valid files are withheld so only failures show up. No output, all checks good.
Useful if you want get all the md5sum of files but you want exclude some directories. If your list of files is short you can make in one command as follow:
find . -type d \( -name DIR1 -o -name DIR2 \) -prune -o -type f -exec md5sum {} \;
Alternatively you can specify a different command to be executed on the resulting files.
Note: Replace 200000 with drive bytes/512, and /dev/sdx with the destination drive/partition. ;)
Note: You may need to install pipebench, this is easy with "sudo apt-get install pipebench" on Ubuntu.
The reason I hunted around for the pieces to make up this command is that I wanted to specifically flip all of the bits on a new HDD, before running an Extended SMART Self-Test (actually, the second pass, as I've already done one while factory-zeroed) to ensure there are no physical faults waiting to compromise my valuable data. There were several sites that came up in a Google search which had a zero-fill command with progress indicator, and one or two with a fill-with-ones command, but none that I could find with these two things combined (I had to shuffle around the dd command(s) to get this to happen without wasting speed on an md5sum as well).
For reference, these are the other useful-looking commands I found in my search:
Zero-fill drive "/dev/sdx", with progress indicator and md5 verification (run sudo fdisk -l to get total disk bytes, then divide by 512 and enter the resulting value into this command for a full wipe)
dd if=/dev/zero bs=512 count=<size/512> | pipebench | sudo tee /dev/sdx | md5sum
And this command for creating a file filled with ones is my other main source (besides the above command and man pages, that is - I may be a Linux newbie but I do read!):
tr '\000' '\377' < /dev/zero | dd of=allones bs=1024 count=2k
Hope someone finds this useful! :)
Cheers,
- Gliktch
Show Sample Output
This is mostly for my own notes but this command will compute a md5 message digest from the command line. You can also replace md5sum with other checksum commands (e.g., sha1sum) Show Sample Output
This was useful to generate random passwords to some webpage users, using the sample code, inside a bash script Show Sample Output
Improvement of the command "Find Duplicate Files (based on size first, then MD5 hash)" when searching for duplicate files in a directory containing a subversion working copy. This way the (multiple dupicates) in the meta-information directories are ignored.
Can easily be adopted for other VCS as well. For CVS i.e. change ".svn" into ".csv":
find -type d -name ".csv" -prune -o -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find -type d -name ".csv" -prune -o -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
Show Sample Output
commandlinefu.com is the place to record those command-line gems that you return to again and again. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.
Every new command is wrapped in a tweet and posted to Twitter. Following the stream is a great way of staying abreast of the latest commands. For the more discerning, there are Twitter accounts for commands that get a minimum of 3 and 10 votes - that way only the great commands get tweeted.
» http://twitter.com/commandlinefu
» http://twitter.com/commandlinefu3
» http://twitter.com/commandlinefu10
Use your favourite RSS aggregator to stay in touch with the latest commands. There are feeds mirroring the 3 Twitter streams as well as for virtually every other subset (users, tags, functions,…):
Subscribe to the feed for: