What's this?

commandlinefu.com is the place to record those command-line gems that you return to again and again.

Delete that bloated snippets file you've been using and share your personal repository with the world. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.


If you have a new feature suggestion or find a bug, please get in touch via http://commandlinefu.uservoice.com/

Stay in the loop…

Follow the Tweets.

Every new command is wrapped in a tweet and posted to Twitter. Following the stream is a great way of staying abreast of the latest commands. For the more discerning, there are Twitter accounts for commands that get a minimum of 3 and 10 votes - that way only the great commands get tweeted.

» http://twitter.com/commandlinefu
» http://twitter.com/commandlinefu3
» http://twitter.com/commandlinefu10

Subscribe to the feeds.

Use your favourite RSS aggregator to stay in touch with the latest commands. There are feeds mirroring the 3 Twitter streams as well as for virtually every other subset (users, tags, functions,…).

News

2011-03-12 - Confoo 2011 presentation
Slides are available from the commandlinefu presentation at Confoo 2011: http://presentations.codeinthehole.com/confoo2011/
2011-01-04 - Moderation now required for new commands
To try to put an end to the spamming, new commands require moderation before they will appear on the site.
2010-12-27 - Apologies for not banning the trolls sooner
Have been away from the interwebs over Christmas. Will be more vigilant henceforth.
2010-09-24 - OAuth and pagination problems fixed
Apologies for the delay in getting Twitter's OAuth supported. Annoying pagination gremlin also fixed.
Terminal - Commands using uniq - 191 results
awk '{ print $9 }' access.log | sort | uniq -c | sort -nr | head -n 10
awk '/Dec\/2012/ {print $1,$8}' logfile | grep -ivE '\.(gif|jpg|png|css|js)|favicon|robots\.txt|wp-l|wp-term' | sort | uniq -c | sort -rn | head -n 20
uniq -c | sed -r 's/([0-9]+)\s(.*)/"\2": \1,/;$s/,/\n}/;1i{'
parallel -j 50 ssh {} "ls" ::: host1 host2 hostn | sort | uniq -c
2013-04-12 11:56:41
User: macoda
Functions: sort ssh uniq
1

parallel can be installed on your central node and used to run a command many times over.

In this example, multiple ssh connections are used to run commands (-j sets the number of jobs to run at the same time). The output can then be piped to further commands to perform the "reduce" stage (sort, then uniq -c, in this example).

This example assumes "keyless ssh login" has been set up between the central node and all machines in the cluster.

bashreduce may also do what you want.

svn ls -R | egrep -v -e "\/$" | tr '\n' '\0' | xargs -0 svn blame | awk '{print $2}' | sort | uniq -c | sort -nr
2013-04-10 19:37:53
User: rymo
Functions: awk egrep ls sort tr uniq xargs
Tags: svn count
1

Makes the command usable on OS X with filenames containing spaces. Note: it will still break if filenames contain newlines... possible, but who does that?!

netstat -antu | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -n
2013-04-08 19:46:41
User: wejn
Functions: awk netstat sort uniq
-1

The output also contains some garbage (text parts from netstat's output), but it's good enough for a quick check of who's overloading your server.
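
To see what each stage of that pipeline contributes, here is a minimal sketch run against canned input instead of live netstat output. The three sample lines and the variable names are made up for illustration; real netstat output also has header lines, which is where the "garbage" comes from.

```shell
# Hypothetical netstat-style lines; field 5 is the remote address:port.
sample='tcp 0 0 10.0.0.1:80 203.0.113.5:51000 ESTABLISHED
tcp 0 0 10.0.0.1:80 203.0.113.5:51001 ESTABLISHED
tcp 0 0 10.0.0.1:80 198.51.100.7:40000 TIME_WAIT'

per_ip=$(printf '%s\n' "$sample" |
  awk '{print $5}' |        # keep the remote address:port column
  awk -F: '{print $1}' |    # drop the port
  sort | uniq -c | sort -n) # count per IP, busiest last
printf '%s\n' "$per_ip"
```

The address with the most connections ends up on the last line, which is handy when the list is long.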

git log | grep Date | awk '{print " : "$4" "$3" "$6}' | uniq -c
find /some/path -type f -printf '%f\n' | grep -o '\..\+$' | sort | uniq -c | sort -rn
2013-03-18 14:42:29
User: skkzsh
Functions: find grep sort uniq
2

Gets the longest matching file extension (e.g. for 'foo.tar.gz', you get '.tar.gz' instead of '.gz').
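
A small sketch of the grep step on its own, assuming GNU grep (the \+ operator in a basic regex is a GNU extension). The file names are hypothetical and stand in for what find -printf '%f\n' would print.

```shell
# Hypothetical file names, one per line.
names='foo.tar.gz
bar.gz
notes.txt
README'

# -o prints only the matched part; the match starts at the first dot
# and runs to the end of the name, so multi-part extensions stay whole.
longest=$(printf '%s\n' "$names" | grep -o '\..\+$')
printf '%s\n' "$longest"
```

Names without a dot, such as README here, produce no output line at all.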

find /some/path -type f | gawk -F/ '{print $NF}' | gawk -F. '/\./{print $NF}' | sort | uniq -c | sort -rn
2013-03-18 14:40:26
User: skkzsh
Functions: find gawk sort uniq
0

If you have GNU findutils, you can get only the file name with

find /some/path -type f -printf '%f\n'

instead of

find /some/path -type f | gawk -F/ '{print $NF}'
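
As a quick check that the two forms give the same file names, here is a sketch on a throwaway directory. It assumes GNU find for -printf and uses plain awk in place of gawk; the directory layout is made up for illustration.

```shell
# Build a small scratch tree with one nested file.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub"
: > "$tmpdir/a.txt"
: > "$tmpdir/sub/b.tar.gz"

# Base names straight from find, and base names via awk on the full path.
with_printf=$(find "$tmpdir" -type f -printf '%f\n' | sort)
with_awk=$(find "$tmpdir" -type f | awk -F/ '{print $NF}' | sort)
printf '%s\n' "$with_printf"

rm -rf "$tmpdir"
```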
DATE=`date +"%H:%M" --date '-1 min'`; egrep "\ $DATE\:..\ " /var/log/dhcpd.log |awk '/DHCPREQUEST/ {split($3,t,":"); printf("%02d:%02d\n",t[1],t[2]);}' |uniq -c;
while (true); do date --utc; done | uniq -c
find . -type f -size +0 -printf "%-25s%p\n" | sort -n | uniq -D -w 25 | sed 's/^\w* *\(.*\)/md5sum "\1"/' | sh | sort | uniq -w32 --all-repeated=separate
2013-02-23 20:44:20
User: jimetc
Functions: find sed sh sort uniq
0

Avoids the nested 'find' commands but doesn't seem to run any faster than syssyphus's solution.

sort file.txt | uniq -c | sort -k1nr -k2d
2013-01-28 22:21:05
User: westonruter
Functions: sort uniq
Tags: bash sorting
0

I used to do this sorting with:

sort file.txt | uniq -c | sort -nr

But this would cause the line (2nd column) to be sorted in descending (reverse) order as well as the 1st column. So this will ensure the 2nd column is in ascending alphabetical order.
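
The difference only shows up when several lines share the same count, so here is a small sketch with ties. The word list and variable names are made up, and LC_ALL=C is set only to keep the ordering predictable across systems.

```shell
# Hypothetical input: banana and cherry appear twice, apple and date once.
words='banana
apple
cherry
banana
date
cherry'

counts=$(printf '%s\n' "$words" | LC_ALL=C sort | LC_ALL=C uniq -c)

# Reverse numeric sort: ties on the count come out in reverse order too.
by_count_only=$(printf '%s\n' "$counts" | LC_ALL=C sort -nr)

# Count descending, then the text ascending within equal counts.
by_count_then_name=$(printf '%s\n' "$counts" | LC_ALL=C sort -k1nr -k2d)

printf '%s\n\n%s\n' "$by_count_only" "$by_count_then_name"
```

In the first listing cherry comes before banana; in the second, banana comes before cherry.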

find-duplicates () { find "$@" -not -empty -type f -printf "%s\0" | sort -rnz | uniq -dz | xargs -0 -I{} -n1 find "$@" -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate; }
2013-01-23 23:20:26
User: mpeschke
Functions: find md5sum sort uniq xargs
-1

This is a modified version of the OP, wrapped into a bash function.

This version handles newlines and other whitespace correctly; the original has problems with the thankfully rare case of newlines in the file names.

It also allows checking an arbitrary number of directories against each other, which is nice when the directories that you think might have duplicates don't have a convenient common ancestor directory.
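
The last stage of the function (md5sum, sort, then uniq comparing only the first 32 characters) can be tried in isolation. This is only a sketch of that stage on a scratch directory, assuming GNU coreutils (md5sum, uniq -w and --all-repeated); the file names and contents are made up, and the size-based pre-filter is left out.

```shell
# Scratch directory with two identical files and one different file.
tmpdir=$(mktemp -d)
printf 'same\n' > "$tmpdir/a.txt"
printf 'same\n' > "$tmpdir/b.txt"
printf 'different\n' > "$tmpdir/c.txt"

# Hash every file, sort so equal hashes are adjacent, then keep only
# lines whose first 32 characters (the MD5 hash) repeat.
dupes=$(cd "$tmpdir" && md5sum -- * | sort | uniq -w32 --all-repeated=separate)
printf '%s\n' "$dupes"

rm -rf "$tmpdir"
```

Only a.txt and b.txt are listed, because c.txt has a hash of its own.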

find $folder -name "[1-9]*" -type f -print|while read file; do echo $file $(sed -e '/^$/Q;:a;$!N;s/\n //;ta;s/ /_/g;P;D' $file|awk '/^Received:/&&!r{r=$0}/^From:/&&!f{f=$0}r&&f{printf "%s%s",r,f;exit(0)}');done|sort -k 2|uniq -d -f 1
2013-01-21 22:50:51
User: lpb612
Functions: awk echo find read sed sort uniq
1

# find assumes email file names start with a digit 1-9

# sed joins lines starting with " " to the previous line

# awk prints the first Received: and From: lines

# sort orders by the second field (received+from)

# uniq prints the duplicated filename

# a message is treated as a duplicate if it was received at the same time as another message and from the same person.

The command was intended to be run under cron. If run in a terminal, mutt can be used:

mutt -e "push otD~=xq" -f $folder
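
The final sort and uniq steps of the duplicate-mail command can be tried without any mail folder. In this sketch the three lines are hypothetical: the first field plays the role of the message file name and the second is a stand-in for the joined Received/From key.

```shell
# Hypothetical "filename key" lines, as the while loop would emit them.
lines='103 Received:_t2From:_bob
101 Received:_t1From:_alice
102 Received:_t1From:_alice'

# Sort on the key, then let uniq skip the first field (-f 1) when
# comparing and print one line per repeated key (-d).
dupes=$(printf '%s\n' "$lines" | LC_ALL=C sort -k 2 | uniq -d -f 1)
printf '%s\n' "$dupes"
```

Only the first line of each group of equal keys is printed, so a message with two copies is reported once.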

tail -1000 `ls -ltr /var/log/CF* |tail -1|awk '{print $9}'`|cut -d "," -f 17|sort|uniq -c |sort -k2
2012-11-30 16:30:41
User: raindylong
Functions: awk cut sort tail uniq
0

Count and sort one field of a log file, such as an nginx/Apache access log.

find . -type f -print | awk -F'.' '{print $NF}' | sort | uniq -c
tshark -qr [cap] -z conv,tcp | awk '{printf("%s:%s:%s\n",$1,$3,$10)}' | awk -F: '{printf("%s %s %s\n",$1,$3,substr($5,1,length($5)-10))}' | sort | uniq -c | sort -nr
tcpdump -ntr NAME_OF_CAPTURED_FILE.pcap 'tcp[13] = 0x02 and dst port 80' | awk '{print $4}' | tr . ' ' | awk '{print $1"."$2"."$3"."$4}' | sort | uniq -c | awk ' {print $2 "\t" $1 }'
sort namesd.txt | uniq -cd
2012-06-26 19:23:58
User: ankush108
Functions: sort uniq
0

The following displays only the entries that are duplicates.
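
A minimal sketch of what that looks like on sample data, assuming the intended flags are -c (prefix each line with its count) and -d (print only repeated lines). The names are made up for illustration.

```shell
# Hypothetical list of names with some repeats.
names='alice
bob
alice
carol
bob
alice'

# carol appears once, so -d leaves her out; -c adds the counts.
repeated=$(printf '%s\n' "$names" | sort | uniq -cd)
printf '%s\n' "$repeated"
```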

netstat -tn | grep :80 | awk '{print $5}'| grep -v ':80' | cut -f1 -d: |cut -f1,2,3 -d. | sort | uniq -c| sort -n
2012-06-26 08:29:37
User: krishnan
Functions: awk cut grep netstat sort uniq
0

cut -f1,2 - IP range /16

cut -f1,2,3 - IP range /24

cut -f1,2,3,4 - full IP address (/32)
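
To make the effect of the field lists concrete, here is a short sketch applying each cut variant to one address. The address is from a documentation range and the variable names are hypothetical.

```shell
ip=203.0.113.25

# Keep the first two, three, or all four dot-separated octets.
net16=$(printf '%s\n' "$ip" | cut -d. -f1,2)
net24=$(printf '%s\n' "$ip" | cut -d. -f1,2,3)
full=$(printf '%s\n' "$ip" | cut -d. -f1,2,3,4)

printf '%s\n' "$net16" "$net24" "$full"
```

Fewer octets means more clients are grouped under the same prefix before counting.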

git ls-files | xargs -n1 git blame --line-porcelain | sed -n 's/^author //p' | sort -f | uniq -ic | sort -nr
2012-06-02 22:04:36
User: hugopeixoto
Functions: sed sort uniq xargs
Tags: statistics git
1

Uses line-porcelain in git blame, which makes it easier to parse the output.
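
The parsing half of that pipeline can be sketched without a repository. The text below is a heavily trimmed, hypothetical imitation of line-porcelain output (the real output carries many more header lines per source line); the point is that only lines beginning with "author " survive the sed step.

```shell
# Hypothetical, trimmed porcelain-style output.
porcelain='author Alice
author-mail <alice@example.com>
summary first commit
author alice
author-mail <alice@example.com>
author Bob
author-mail <bob@example.com>'

# Keep the author names, group them case-insensitively, count, and rank.
per_author=$(printf '%s\n' "$porcelain" |
  sed -n 's/^author //p' |
  LC_ALL=C sort -f | LC_ALL=C uniq -ic | LC_ALL=C sort -nr)
printf '%s\n' "$per_author"
```

The author-mail lines are not matched because the pattern requires a space right after "author".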

mysqlbinlog <logfiles> | grep exec | grep end_log_pos | cut -d' ' -f2- | cut -d: -f-2 | uniq -c
2012-05-30 09:42:21
User: theist
Functions: cut exec grep uniq
1

Shows the number of MySQL binary log events (which are MySQL server events) per minute; useful for checking stress times post-mortem.

lsof +c 15 | awk '{print $1}' | sort | uniq -c | sort -rn | head
cat /var/log/nginx/access.log | grep -oe '^[0-9.]\+' | perl -ne 'system("geoiplookup $_")' | grep -v found | grep -oe ', [A-Za-z ]\+$' | sort | uniq -c | sort -n
2012-05-08 13:28:25
User: theist
Functions: cat grep perl sort uniq
Tags: sort uniq geoip
-1

Per-country GET report, based on the access log. Easy to adapt to count unique IPs.