The command above removes duplicate rows from an unsorted file based on the FIRST column of data, keeping the first occurrence of each value.
The '$1' refers to the first field (column) of each line. You can change both instances of '$1' in the command to remove duplicates based on a different column, for instance, the third:
awk '{ if ($3 in stored_lines) x=1; else print; stored_lines[$3]=1 }' infile.txt > outfile.txt
Or you can change it to '$0' to base the removal on the whole row:
awk '{ if ($0 in stored_lines) x=1; else print; stored_lines[$0]=1 }' infile.txt > outfile.txt
** Note: awk keeps every distinct key in memory, so I wouldn't use this on a MASSIVE file, unless you're RAM-rich ;) **
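As an aside, the same filter is often written in a more compact form. The sketch below is not the original command: the array name `seen`, the temporary file, and the sample rows are made up for illustration.

```shell
# Hypothetical sample data written to a temporary file.
demo_file=$(mktemp)
printf '%s\n' 'a 1 x' 'b 2 y' 'a 3 z' 'c 4 x' > "$demo_file"

# seen[$1]++ evaluates to 0 (false) the first time a key appears,
# so !seen[$1]++ is true exactly once per key and awk prints that row.
dedupe_first=$(awk '!seen[$1]++' "$demo_file")

# Same idea, keyed on the third column instead of the first.
dedupe_third=$(awk '!seen[$3]++' "$demo_file")

printf '%s\n' "$dedupe_first"
printf '%s\n' '--'
printf '%s\n' "$dedupe_third"

rm -f "$demo_file"
```

Like the longer version, this keeps the first row for each key and holds every distinct key in memory.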
A shorter version of command #3014, using awk instead of sed. Useful when scraping websites with a script.
This one-liner will output installed packages sorted by size in kilobytes.
Make sure to run this command in your git toplevel directory. Modify `-j4` as you like. You can also run any arbitrary command instead of `git pull` in parallel on all of your git submodules.
When working with jailed environments you need to copy all the shared libraries to your jail environment. This is done by running ldd on a binary which needs to run inside the jail. This command will use the output from ldd to automatically copy the shared libraries to a folder of your choice.
Note that the -i will not help in a script. Proper error checking is required.
pyt 'Stairway to heaven - Led Zeppelin'
pyt 'brain damage - Pink Floyd'
No web browser or even X needed. Just a CLI and an internet connection! mplayer can be paused and can skip ahead. This may break if YouTube changes its search HTML.
This command will add up the RAM usage of all processes whose name contains "java" and output the sum of percentages in human-readable format. Also, unlike the original #15430, it won't fail on processes with a usage of >9.9%. Please note that this command won't work reliably in use cases where a significant portion of the processes involved are using less than 0.1% of RAM, because they will be counted as "0", even though a great number of them could add up to a significant amount.
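The command itself is not reproduced here, so the following is only a minimal sketch of the idea, not the original one-liner. It assumes two-column input of the shape produced by `ps -eo pmem,comm` (memory percentage, then command name); the function name `sum_mem_percent` and the sample rows are hypothetical.

```shell
# Sum the first column (memory %) for rows whose second column matches a pattern.
# Reads `ps -eo pmem,comm`-style lines on stdin; the first line is the header.
sum_mem_percent() {
  LC_ALL=C awk -v pat="$1" '
    NR > 1 && $2 ~ pat { sum += $1 }      # skip the header, add matching rows
    END { printf "%.1f%%\n", sum }        # one decimal place plus a percent sign
  '
}

# Canned sample so the sketch runs without depending on live processes.
java_mem=$(printf '%s\n' '%MEM COMMAND' '12.5 java' ' 0.3 bash' '10.2 java' |
  sum_mem_percent java)
printf '%s\n' "$java_mem"
```

On a live system the input would come from something like `ps -eo pmem,comm | sum_mem_percent java`. Processes that ps reports as 0.0 still contribute nothing to the total, as described above.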
Reads all lines, treating the point as the decimal marker, then adds them all up and outputs the result.
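The original command is not shown here; one common way to do this, offered as a sketch under that caveat, is an awk accumulator. Forcing `LC_ALL=C` is an assumption made so that the point is always read as the decimal marker, and the input values are made up.

```shell
# Add up one number per line; LC_ALL=C makes awk treat "." as the decimal marker.
total=$(printf '%s\n' 1.5 2.25 3 |
  LC_ALL=C awk '{ sum += $1 } END { print sum }')
printf '%s\n' "$total"
```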
Lists the top 20 IPs that have TCP connections in the SYN_RECV state. Useful on web servers to detect a SYN flood attack. Replace SYN_ with ESTA to find established connections.
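Since the command is not reproduced here, the following is a hedged sketch of the counting step only, not the original one-liner. It assumes Linux `netstat -tn` style lines (field 5 is the remote address:port, field 6 is the state) and IPv4 addresses; the function name `top_syn_ips` and the sample lines are hypothetical.

```shell
# Count remote IPv4 addresses in SYN_RECV state from netstat-style input on stdin.
top_syn_ips() {
  awk '$6 == "SYN_RECV" { split($5, addr, ":"); print addr[1] }' |
    sort | uniq -c | sort -rn | head -n 20
}

# Canned sample (documentation-range addresses) so the sketch runs without a live server.
sample='tcp 0 0 192.0.2.10:80 203.0.113.5:51000 SYN_RECV
tcp 0 0 192.0.2.10:80 203.0.113.5:51001 SYN_RECV
tcp 0 0 192.0.2.10:80 198.51.100.7:40000 SYN_RECV
tcp 0 0 192.0.2.10:80 198.51.100.9:40001 ESTABLISHED'

syn_report=$(printf '%s\n' "$sample" | top_syn_ips)
printf '%s\n' "$syn_report"
```

On a real host the input would be piped from `netstat -tn`. Splitting on ":" would mangle IPv6 addresses, so this only holds for IPv4.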
This will tell you who has the most Apache connections by IP (replace IPHERE with the actual IP you wish to check). Or if you wish, remove | grep -c IPHERE for the full list.
This is a 'killall' equivalent for systems where that command is not available. Before executing it, set the environment variable USERNAME to the name of the user whose processes you want to kill, or replace $USERNAME in the command above with that username. Side effect: if any processes from other users are running with $USERNAME as a parameter, they will be killed as well (assuming you are running this as root). The [-9] in square brackets at the end of the command is optional and should be your last resort. I do not like to use it, as the killed process leaves a lot of mess behind.
Displays six rows and five columns of random numbers between 0 and 1. If you need only one column, you can dispense with the "for" loop.
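The command is not reproduced here, so this is only a sketch of one way to get such a grid, assuming awk's built-in `rand()` and `srand()`; the four-decimal formatting and the variable name `grid` are choices made for illustration.

```shell
# Print 6 rows of 5 random numbers in [0, 1), space-separated.
grid=$(awk 'BEGIN {
  srand()                                  # seed from the current time
  for (r = 1; r <= 6; r++)
    for (c = 1; c <= 5; c++)
      printf "%.4f%s", rand(), (c < 5 ? " " : "\n")
}')
printf '%s\n' "$grid"
```

For a single column, the inner loop can be dropped and each row printed with one `rand()` call.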