Remove duplicate entries in a file without sorting.

awk '!x[$0]++' <file>
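Using awk, remove duplicate lines from a file without sorting it first, since sorting reorders the contents. awk keeps the lines in their original order and prints only the first occurrence of each one. The file itself is not modified; redirect the output into another file to keep the result.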
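
How the one-liner works, as a rough sketch: `x` is an associative array keyed by the whole line (`$0`). `x[$0]++` yields the count from before the increment, so `!x[$0]++` is true only the first time a line is seen, and awk's default action prints that line. The sample input below is made up for illustration, and the array name `seen` is just a more readable stand-in for `x`.

```shell
# Hypothetical sample input: "aaa" and "bbb" each appear twice.
input=$(printf 'bbb\naaa\nbbb\nccc\naaa\n')

# Short form from the post: print a line only while its count is still zero.
short=$(printf '%s\n' "$input" | awk '!x[$0]++')

# Equivalent long form with the implicit steps written out.
long=$(printf '%s\n' "$input" | awk '{
    if (seen[$0] == 0) {   # first time this exact line appears
        print $0
    }
    seen[$0]++             # remember it for later lines
}')

printf '%s\n' "$short"
```

Both forms print `bbb`, `aaa`, `ccc` in that order: the first occurrence of each line, in input order.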

By: din7
2009-12-20 02:33:21


What Others Think

sort | uniq OR sort -u
KevinM · 631 weeks and 4 days ago
Yes, but that sorts all the rest of the data in as well. awk will leave the rest of the data alone.
din7 · 631 weeks and 4 days ago
It's very clever, din7, but you need to describe it better. It doesn't FIND the duplicates in a file, it REMOVES them.
flatcap · 631 weeks and 3 days ago
I generally pass stdout to this command then redirect into another file so I can just see duplicates. The command above is in its original context. Even so, having used this several times in its original context I haven't seen where it actually removes duplicates without further modification. It seems to me that it just prints the duplicates.
din7 · 631 weeks and 3 days ago
It prints the lines that aren't duplicated, too. That's why what it's doing is removing the duplicates. echo -e "aaa\nbbb\naaa"|awk \!'x[$0]++' outputs "aaa" and "bbb", not just "aaa".
dennisw · 631 weeks and 2 days ago
I see what you mean now.
din7 · 631 weeks and 2 days ago
I thought flatcap was saying that it modifies the file when the command is executed.
din7 · 631 weeks and 2 days ago
Both solutions are very elegant and easily replicated in Unix. Thanks.
csj565 · 554 weeks and 2 days ago
This works great for cleaning up a large .bash_history
Sepero · 179 weeks and 1 day ago
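
For the file-cleanup case raised in the comments, keep in mind that awk only writes to stdout, so the input file stays untouched unless the result is redirected and moved back. A minimal sketch using temporary files follows; the file contents are hypothetical stand-ins for something like a history file, and `mktemp` is assumed to be available.

```shell
# Hypothetical working file standing in for something like a history file.
file=$(mktemp)
printf 'ls\ncd /tmp\nls\nmake\ncd /tmp\n' > "$file"

# sort -u also removes duplicates, but it reorders the lines.
sorted=$(sort -u "$file")

# awk keeps first occurrences in their original order. Write to a temp file,
# then replace the original only if awk succeeded.
tmp=$(mktemp)
awk '!x[$0]++' "$file" > "$tmp" && mv "$tmp" "$file"

deduped=$(cat "$file")
printf '%s\n' "$deduped"
```

Redirecting straight back onto the input file (`awk ... file > file`) would truncate it before awk reads it, which is why the sketch goes through a second file. Note that `mv` leaves the result with the temporary file's permissions, so adjust them afterwards if that matters.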
