Commands tagged awk (348)

  • This one-liner uses ImageMagick's identify utility to show the EXIF GPS information of an image and also converts the degree/minute/second representation to a decimal degree number.


    5
    identify -verbose my_image.jpg | awk 'function cf(i){split(i,a,"/");if(length(a)==2){return a[1]/a[2]}else{return a[1]}}/GPS/{if($1~/GPSLatitude:|GPSLongitude:/){s=$0;gsub(/,/,"",$0);printf("%s (%f)\n", s, $2+cf($3)/60+cf($4)/3600)}else{print}}'
    ichbins · 2022-02-20 10:17:49 322
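  As an illustration of the conversion above, here is the same cf() logic applied to a made-up coordinate of 48 degrees 51 minutes 29.12 seconds written as EXIF-style rationals (the values are invented for the example; length() on an array is a gawk extension, just as in the original):

    echo "48/1 51/1 2912/100" | awk 'function cf(i){split(i,a,"/");return (length(a)==2)?a[1]/a[2]:a[1]} {printf("%f\n", cf($1)+cf($2)/60+cf($3)/3600)}'

  This prints 48.858089, i.e. 48 + 51/60 + 29.12/3600.
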
  • Outputs UTF-8 smileys.


    5
    printf "$(awk 'BEGIN{c=127;while(c++<191){printf("\xf0\x9f\x98\\%s",sprintf("%o",c));}}')"
    ichbins · 2022-04-06 12:10:37 441
  • "seq 100" outputs 1,2,..,100, separated by newlines. awk adds them up and displays the sum. "seq 1 2 11" outputs 1,3,..,11. Variations: 1+3+...+(2n-1) = n^2 seq 1 2 19 | awk '{sum+=$1} END {print sum}' # displays 100 1/2 + 1/4 + ... = 1 seq 10 | awk '{sum+=1/(2**$1)} END {print sum}' # displays 0.999023 Show Sample Output


    4
    seq 100 | awk '{sum+=$1} END {print sum}'
    kaan · 2009-03-24 20:30:40 5
  • This prepends a random number as the first field of every line in SOMEFILE, sorts by that first column, and finally cuts off the random numbers, leaving a shuffled copy of the file.


    4
    awk 'BEGIN{srand()}{print rand(),$0}' SOMEFILE | sort -n | cut -d ' ' -f2-
    axelabs · 2009-05-29 01:20:50 11
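  On systems with GNU coreutils, shuf gives the same shuffled result in one step (mentioned here as an alternative, not part of the original submission):

    shuf SOMEFILE
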
  • Just change the date following the -r flag and/or the user name in the user== conditional, and substitute yms_web with the name of your module.


    4
    svn log -v -r{2009-05-21}:HEAD | awk '/^r[0-9]+ / {user=$3} /yms_web/ {if (user=="george") {print $2}}' | sort | uniq
    jemptymethod · 2009-06-05 14:07:28 11
  • This calculates a running standard deviation in one pass and avoids the overflow that can occur with other implementations. There is still a potential for underflow in the corner case where the deltas, or the values themselves, are very small.


    4
    awk '{delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg); } END { print sqrt(mean2 / NR); }'
    ashawley · 2009-09-11 04:46:01 5
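  A quick sanity check of the one-pass formula, assuming GNU coreutils seq is available: the population standard deviation of the integers 1..9 is sqrt(60/9), about 2.582.

    seq 1 9 | awk '{delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg); } END { print sqrt(mean2 / NR); }'   # prints 2.58199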

  • 4
    svn ci `svn stat |awk '/^A/{printf $2" "}'`
    realist · 2009-11-04 03:30:07 3

  • 4
    renice +5 -p $(pidof <process name>)
    0x2142 · 2010-01-19 22:16:24 3
  • (Apparently it is too long, so I put it in the sample output; I hope that is OK.) Run the long command (or put it in your .bashrc) shown in the sample output, then run: fbemailscraper YourFBEmail Password. Voila! Your contacts' emails will appear. Facebook seems to have gotten rid of the picture encoding of emails and replaced it with a text-based version, making it easy to scrape! Needs curl to run, and it was made pretty quickly, so there might be bugs.


    4
    fbemailscraper YourFBEmail Password
    dabom · 2010-01-31 00:44:35 45
  • Ever gone to a site that has an MP3 embedded in a pesky Flash player, but no download link? Well, this one-liner will yank the names of those tunes straight out of Firefox's cache in a nice, easy-to-read list. What you do with them after that is *ahem* no concern of mine. ;)


    4
    for i in `ls ~/.mozilla/firefox/*/Cache`; do file $i | grep -i mpeg | awk '{print $1}' | sed s/.$//; done
    BoxingOctopus · 2010-04-11 23:14:18 7
  • Case insensitive, and works even if the "<title>...</title>" pair spans multiple lines. Simple! :-)


    4
    awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' file.html
    sata · 2010-04-20 10:54:03 5
  • Searches for web radio stations by the submitted keyword and returns the station name and the link for listening. It may be enhanced to read the user's selection and submit it to mplayer.


    4
    echo "Keyword?";read keyword;query="http://www.shoutcast.com/sbin/newxml.phtml?search="$keyword"";curl -s $query |awk -F '"' 'NR <= 4 {next}NR>15{exit}{sub(/SHOUTcast.com/,"http://yp.shoutcast.com/sbin/tunein-station.pls?id="$6)}{print i++" )"$2}'
    benyounes · 2010-05-03 00:44:10 8

  • 4
    tail -n2000 /var/www/domains/*/*/logs/access_log | awk '{print $1}' | sort | uniq -c | sort -n | awk '{ if ($1 > 20)print $1,$2}'
    allrightname · 2010-05-10 19:08:37 3
  • This one is a bit more robust -- the remote machine may not have a .ssh directory, and it may not have an authorized_keys file. If it already does, and you want to replace your ssh public key for some reason, this will work in that case as well, without duplicating the entry.


    4
    cat ~/.ssh/id_rsa.pub | ssh <REMOTE> "(cat > tmp.pubkey ; mkdir -p .ssh ; touch .ssh/authorized_keys ; sed -i.bak -e '/$(awk '{print $NF}' ~/.ssh/id_rsa.pub)/d' .ssh/authorized_keys; cat tmp.pubkey >> .ssh/authorized_keys; rm tmp.pubkey)"
    tamouse · 2011-09-30 07:39:24 18
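  On machines that ship it, ssh-copy-id covers the common case: it creates ~/.ssh and authorized_keys if needed and skips keys that are already present, though it will not remove an old entry the way the sed step above does.

    ssh-copy-id -i ~/.ssh/id_rsa.pub <REMOTE>
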
  • Reads every line as a number (with a point as the decimal separator), adds them all up, and outputs the result.


    4
    awk '{a+=$0}END{print a}' file
    bugmenot · 2022-08-15 23:04:31 476
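  A minimal example with made-up input values:

    printf '1.5\n2.25\n0.25\n' | awk '{a+=$0}END{print a}'   # prints 4
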
  • Displays six rows and five columns of random numbers between 0 and 1. If you need only one column, you can dispense with the "for" loop (a single-column variant is sketched after the command).


    3
    seq 6 | awk '{for(x=1; x<=5; x++) {printf ("%f ", rand())}; printf ("\n")}'
    kaan · 2009-03-24 21:33:38 5
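  The single-column variant mentioned in the description, with srand() added so that repeated runs differ (without a seed, most awks emit the same sequence every time):

    seq 6 | awk 'BEGIN{srand()}{printf("%f\n", rand())}'
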
  • Sometimes jittery data hides trends; computing a rolling average can give a clearer view.


    3
    awk 'BEGIN{size=5} {mod=NR%size; if(NR<=size){count++}else{sum-=array[mod]};sum+=$1;array[mod]=$1;print sum/count}' file.dat
    mungewell · 2009-05-29 00:07:24 4
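  For a feel of the output, assuming GNU coreutils seq: a 5-wide rolling average over 1..10 ramps up while the window fills and then trails the input by two.

    seq 10 | awk 'BEGIN{size=5} {mod=NR%size; if(NR<=size){count++}else{sum-=array[mod]};sum+=$1;array[mod]=$1;print sum/count}'   # prints 1 1.5 2 2.5 3 4 5 6 7 8, one value per line
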
  • The grep switches eliminate the need for awk and sed. Invoking vim with -p will open all files in separate tabs, -o in separate vim windows. Just wish it didn't hose my terminal once I exit vim!!


    3
    grep -Hrli 'foo' * | xargs vim
    dere22 · 2009-09-03 15:44:05 12
  • This pipeline will find, sort and display all files based on mtime. This could be done with find | xargs, but that pipeline will not produce correct results if the output of find is larger than the xargs command-line buffer: when the buffer fills, xargs processes the find results in more than one batch, which is not compatible with sorting. Note the "-print0" on find and the "-0" switch for perl; this is the equivalent of using xargs. Don't you love perl? This pipeline can easily be modified to sort on any field produced by perl's stat operator: you could sort on size, hard links, creation time, etc. Look at stat and just change the '9' to what you want. Changing the '9' to a '7', for example, sorts by file size (see the variant after the command); a '3' sorts by number of links. Use head and tail at the end of the pipeline to get the oldest or the most recent files. Use awk or perl -wnla for further processing; since there is a tab between the two fields, it is very easy to process.


    3
    find $HOME -type f -print0 | perl -0 -wn -e '@f=<>; foreach $file (@f){ (@el)=(stat($file)); push @el, $file; push @files,[ @el ];} @o=sort{$a->[9]<=>$b->[9]} @files; for $i (0..$#o){print scalar localtime($o[$i][9]), "\t$o[$i][-1]\n";}'|tail
    drewk · 2009-09-21 22:11:16 11
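  The size-sorting variant hinted at in the description is a mechanical change: swap stat index 9 (mtime) for 7 (size) and print the byte count instead of a date. A sketch under that assumption, not a tested command:

    find $HOME -type f -print0 | perl -0 -wn -e '@f=<>; foreach $file (@f){ (@el)=(stat($file)); push @el, $file; push @files,[ @el ];} @o=sort{$a->[7]<=>$b->[7]} @files; for $i (0..$#o){print $o[$i][7], "\t$o[$i][-1]\n";}'|tail

  This leaves the ten largest files at the bottom of the output.
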
  • You set the file/dirname to transfer in the variable, and at the end of the command you set the destination path. The command uses pv (pipe viewer) to show progress, compresses the output on the fly, and changes the ssh cipher. It supports dirnames with spaces. Merged ideas and comments from http://www.commandlinefu.com/commands/view/4379/copy-working-directory-and-compress-it-on-the-fly-while-showing-progress and http://www.commandlinefu.com/commands/view/3177/move-a-lot-of-files-over-ssh


    3
    file='path to file'; tar -cf - "$file" | pv -s $(du -sb "$file" | awk '{print $1}') | gzip -c | ssh -c blowfish user@host tar -zxf - -C /opt/games
    starchox · 2010-01-19 16:02:45 37
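  Newer OpenSSH releases have removed the blowfish cipher, so on such systems either pick a cipher that "ssh -Q cipher" actually lists or simply drop the -c option; a variant under that assumption, otherwise identical to the command above:

    file='path to file'; tar -cf - "$file" | pv -s $(du -sb "$file" | awk '{print $1}') | gzip -c | ssh user@host tar -zxf - -C /opt/games
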
  • Alternatively: export MyVAR=84; awk '{ print ENVIRON["MyVAR"] }'


    3
    MyVAR=85 awk '{ print ENVIRON["MyVAR"] }'
    depesz · 2011-04-14 16:46:23 3
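  Another common way to get a shell value into awk, shown here for comparison, is the -v option, which sets an ordinary awk variable instead of going through the environment:

    awk -v MyVAR=85 '{ print MyVAR }'
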
  • Renames files to a fixed-length, zero-padded numeric pattern: 7.jpg becomes 0007.jpg, for example (the base names need to be numeric and free of spaces).


    3
    ls *.jpg | awk -F'.' '{ printf "%s %04d.%s\n", $0, $1, $2; }' | xargs -n2 mv
    hute37 · 2011-05-01 13:32:58 6
  • Just an alternative with more advanced formatting for readability purposes. It now uses colors (too much for me, but it's a kind of proof of concept) and adjusts columns.


    3
    curl -u username --silent "https://mail.google.com/mail/feed/atom" | awk 'BEGIN{FS="\n";RS="(</entry>\n)?<entry>"}NR!=1{print "\033[1;31m"$9"\033[0;32m ("$10")\033[0m:\t\033[1;33m"$2"\033[0m"}' | sed -e 's,<[^>]*>,,g' | column -t -s $'\t'
    frntn · 2011-10-15 23:15:52 3
  • This is mostly for my own notes, but this command computes an MD5 message digest from the command line. You can also replace md5sum with other checksum commands (e.g., sha1sum).


    3
    echo -n "password"|md5sum|awk '{print $1}'
    windfold · 2011-11-08 21:34:50 5
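  Following the description's suggestion, the same pattern with a different digest; printf %s sidesteps any doubt about whether the shell's echo supports -n:

    printf %s "password" | sha256sum | awk '{print $1}'
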
  • Like the original version, except it does not include the parent Apache process or the grep process, and it adds "sudo" so it can be run by a regular user.


    3
    ps h --ppid $(cat /var/run/apache2.pid) | awk '{print"-p " $1}' | xargs sudo strace
    colinmollenhour · 2012-03-21 01:59:41 3