shell function to find duplicate lines in a series of files or in stdin

dups() { sort "$@" | uniq -d; }

By: bartonski
2011-10-04 16:46:00
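
A brief usage sketch, assuming a POSIX-style shell with the standard sort and uniq utilities, plus mktemp for the scratch directory. The file names are made up for illustration. Because uniq -d prints one copy of each repeated line, a line that occurs in two different files is reported as well.

```shell
# The function from above.
dups() { sort "$@" | uniq -d; }

# Reading from stdin: "a" occurs twice, so it is the only line reported.
printf 'a\nb\na\nc\n' | dups

# Reading from several files (hypothetical names, one containing a space).
tmp=$(mktemp -d)
printf 'x\ny\n' > "$tmp/first.txt"
printf 'y\nz\n' > "$tmp/second file.txt"
# "y" appears once in each file, so it counts as a duplicate.
dups "$tmp/first.txt" "$tmp/second file.txt"
rm -r "$tmp"
```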

These Might Interest You

  • Thanks to knoppix5 for the idea :-) Print selected lines from a file or from the output of a command. Usage: every NTH MAX [FILE]. Prints every NTH line (from the first MAX lines) of FILE; if FILE is omitted, stdin is used. The command simply passes the input to a sed script: sed -n -e "${2}q" -e "0~${1}p" "${3:-/dev/stdin}". Here sed -n prints no output by default; -e "${2}q" quits after this many lines (controlled by the second parameter); -e "0~${1}p" prints every NTH line (controlled by the first parameter; the first~step address form is a GNU sed extension); and "${3:-/dev/stdin}" takes input from $3 if it exists, otherwise from /dev/stdin.

    function every() { sed -n -e "${2}q" -e "0~${1}p" "${3:-/dev/stdin}"; }
    flatcap · 2015-04-03 01:30:36 4
  • Some commands (such as sed and perl) have options to support in-place editing of files, but many commands do not. This shell function enables any command to change files in place; the file to change must be the last argument. See the sample output for many examples. The function uses plain sh syntax and works with any POSIX shell or derivative, including zsh and bash.

    inplace() { eval F=\"\$$#\"; "$@" > "$F".new && mv -f "$F".new "$F"; }
    inof · 2010-04-09 11:36:31 8
  • Identify movies, but not TV series, using find and a regex. While it's easy to find video files, it's not easy to check whether they are movies or episodes of a TV series; this can matter if you need to move files before cataloguing them. A regex makes this possible. TV series normally carry season and episode numbers in the file name, for example "X-Files S01E12 - Gna gna gna.avi" or "3x04.Falling.Skies.-.The.Revenge.mkv". This regex matches episodes whose names have the structure "S00E00", "0E00", "S00x00" or "0x00". Inverting the match finds the movies.

    find . -type f -regextype posix-extended ! -regex '^.*[Ss. ]{0,1}[0-9]{1,2}[eExX][0-9][0-9].*\.(avi|mkv|srt)$' \( -iname "*.mkv" -or -iname "*.avi" -or -iname "*.srt" \)
    marcolino · 2017-05-08 10:56:21 0
  • I wanted to delete all duplicate lines from .bash_history while keeping the order of the remaining lines. The command numbers the lines with cat -n, then sorts from the second column onward. uniq then omits repeated lines while skipping the first field (the line number). Finally, the output is sorted by line number again and the numbers are cut off.

    cat -n <file> | sort -k 2 | uniq -f 1 | sort -n | cut -f 2-
    fpunktk · 2010-01-21 18:55:58 3
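
The last pipeline in the list above can be tried on a small sample. This is only a sketch: the wrapper name dedup_keep_order is hypothetical, and it passes "$@" to cat in place of the <file> placeholder so that it can read either files or stdin. It assumes a cat that supports -n.

```shell
# Hypothetical wrapper around the pipeline: number the lines, group
# identical lines by sorting on the text, keep one line per group,
# restore the original order, then strip the numbers again.
dedup_keep_order() {
    cat -n "$@" | sort -k 2 | uniq -f 1 | sort -n | cut -f 2-
}

# Sample history with repeated commands; the output is ls, cd, pwd,
# one per line, in order of first appearance.
printf 'ls\ncd\nls\npwd\ncd\n' | dedup_keep_order
```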

What Others Think

As in a lot of other commands on commandlinefu this doesn't work with filenames with spaces.
depesz · 350 weeks and 3 days ago
*DoH* Edited to use "$@", which should expand correctly for file names containing spaces.
bartonski · 350 weeks and 3 days ago
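
To illustrate the fix discussed in the comments above, here is a small sketch of how an unquoted $* and a quoted "$@" differ in sh or bash with the default IFS. The helper names count_unquoted and count_quoted are hypothetical and exist only for this demonstration.

```shell
# Hypothetical helpers: each re-expands its arguments and reports how
# many words result.
count_unquoted() { set -- $*; echo "$#"; }    # unquoted: subject to word splitting
count_quoted()   { set -- "$@"; echo "$#"; }  # quoted: one word per original argument

count_unquoted "my file.txt" other.txt   # prints 3: the name is split at the space
count_quoted   "my file.txt" other.txt   # prints 2: each name stays intact
```

With "$@", sort receives each file name as a single argument, which is why the edited function copes with spaces.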
