Commands by zhangweiwu (9)

  • In a TSV file exported from Google AdWords, the first five columns are text. They would normally collapse into a single multi-line text cell, but the .tsv format has no guaranteed way to represent a line break within a cell, so Google splits the text into five columns. The problem is that with five columns of text there is hardly any space left for additional fields while keeping the output printable. This script collapses the first five columns of each row into a single multi-line text cell, for console output or for sending directly to a printer.


    -1
    awk -F $'\t' '{printf "%s", $1 LS $2 LS $3 LS $4 LS $5; for (i = 7; i < NF; i++) printf "%s\t", $i; printf "\n--\n";}' LS=$'\n' 'Ad report.tsv' | column -t -s $'\t'
    zhangweiwu · 2011-02-28 10:52:16 0
  • In a TSV file exported from Google AdWords, the first five columns are text. They would normally collapse into a single multi-line text cell, but the .tsv format has no guaranteed way to represent a line break within a cell, so Google splits the text into five columns. The problem is that with five columns of text there is hardly any space left for additional fields while keeping the output printable. This script collapses the first five columns of each row into a single multi-line text cell. For the line break it uses the Line Separator character (Unicode U+2028), which gnumeric respects. It outputs a new .tsv file that opens in gnumeric.


    0
    awk -F $'\t' '{printf "%s", $1 LS $2 LS $3 LS $4 LS $5; for (i = 7; i < NF; i++) printf "%s\t", $i; printf "\n";}' LS=`env printf '\u2028'` 'Ad report.tsv'
    zhangweiwu · 2011-02-28 10:48:46 0
  • To rip a DVD movie to Ogg format using ffmpeg, follow these steps. 1) Find the VOB files in VIDEO_TS on the mounted video DVD that store the movie itself. There will be a few other VOB files that store the splash screen or special features; the VOB files for the movie itself can be identified by their larger size. You can verify them by playing them directly with a player (e.g. mplayer). 2) Concatenate all of these VOB files and pipe them to ffmpeg. 3) Calculate the video size and crop size. The Ogg video size must be a multiple of 16 in both width and height; this is an inherent limitation of the Theora codec. In my case I chose 512x384. The -vcodec parameter is necessary because ffmpeg does not encode Theora by itself and needs the libtheora encoder. -acodec is necessary because otherwise ffmpeg uses flac by default.


    5
    cat VIDEO_TS/VTS_01_[1234].VOB | nice ffmpeg -i - -s 512x384 -vcodec libtheora -acodec libvorbis ~/Videos/dvd_rip.ogg
    zhangweiwu · 2010-09-14 14:45:27 1
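    The size calculation in step 3 can be sketched with shell arithmetic. This is a minimal illustration, assuming a 4:3 picture and a target width of 512; the helper name round_down_16 is hypothetical, and rounding down (rather than to the nearest multiple) is just one possible choice.

```shell
#!/bin/sh
# Hypothetical helper: round a pixel dimension down to a multiple of 16.
round_down_16() {
    echo $(( $1 / 16 * 16 ))
}

# Assumed target: width 512 with a 4:3 aspect ratio.
width=$(round_down_16 512)
height=$(round_down_16 $(( width * 3 / 4 )))
echo "${width}x${height}"
```

    With these assumed inputs the result is the 512x384 used in the command, which can then be passed to ffmpeg's -s option.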
  • This command is suitable as the application launch command for a desktop shortcut. It checks whether the application is already running by looking up its process ID with pgrep, and offers to kill the old process before starting a new one. It is useful for the few X11 applications where running a second instance is more likely a mistake. In my example, x2vnc is an X11 app that does not quit when its connection is broken, and does not work well when a second process establishes a second connection after the first one broke. The LC_ALL=C for xmessage is necessary on openSUSE systems to avoid a bug. If you find you don't need it, remove the "env LC_ALL=C" part.


    0
    sh -c 'if pgrep x2vnc && env LC_ALL=C xmessage -button "Kill it:0,Ignore it:1" "Another connection is already running. Should I kill it instead of ignoring it?"; then killall x2vnc; fi; x2vnc -passwd /home/Ariel/.vnc/passwd -east emerson:0'
    zhangweiwu · 2010-07-06 09:11:12 0
  • You might want to check which file and directory names would be renamed or truncated if you create an ISO 9660 level 2 image from them. Use this command to check first.


    1
    find . -regextype posix-extended -not -regex '.*/[A-Za-z_]*([.][A-Za-z_]*)?'
    zhangweiwu · 2010-06-25 00:27:09 0
  • This is useful when you have a private (reserved) IP address like 192.168.0.100 and want to find out which IP address is used to access the Internet. You need to know a server with 'efingerd -n' configured, such as www.linuxbanks.cn in this command. Other ways to find this information include, for example, fetching www.tell-my-ip.com and grepping the output. The finger method has the advantage that a service like www.tell-my-ip.com is easy to deploy yourself, as you only need to install efingerd.


    -4
    finger @www.linuxbanks.cn | grep -oE '([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}' | head -n1
    zhangweiwu · 2010-05-05 14:58:55 2
  • In "a.html", find all images referenced by a relative URI in the "src" attribute of an "img" element and replace them with "data:" URIs. This is useful for creating a single HTML file that holds all its images, as a replacement for the IE-created .mht file format. The generated HTML works in every browser other than IE, as well as in many HTML editors such as KompoZer, whereas the .mht format works only in IE. Compared with KDE's own single-file web page format, "war", which only opens correctly on KDE, an HTML file with "data:" URIs is more universally supported. This command has many bugs, and my command-line fu is too limited to fix them: 1. It assumes all URLs are relative URIs, so it works for <img src="images/logo.png"/> but not for <img src="http://www.my_web_site.com/images/logo.png" />. This may not be a bug, as full URIs should perhaps be ignored in many use cases. 2. It only works for images whose file name suffix is .jpg, .gif, or .png, although images with a .jpeg suffix, or with no extension at all, are legal in HTML. 3. Image file names may not contain "(", even though it is frequently used, as in "(copy of) my car.jpg". Neither single nor double quotes are allowed either. 4. There is in fact a big flaw: file names are used as regular expressions to be replaced with the base64-encoded content. This causes the script to fail in many other cases, for example 'D:\images\logo.png', where the backslash has a different meaning in a regular expression. I don't know how to fix this; I don't know of any command that does plain-text (non-regex) replacement the way basic editors like gedit do. 5. The original a.html is not preserved, so make a copy first in case things go wrong.


    4
    grep -ioE "(url\(|src=)['\"]?[^)'\"]*" a.html | grep -ioE "[^\"'(]*.(jpg|png|gif)" | while read l ; do sed -i "s>$l>data:image/${l/[^.]*./};base64,`openssl enc -base64 -in $l| tr -d '\n'`>" a.html ; done;
    zhangweiwu · 2010-05-05 14:07:51 2
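    For bug 4, one possible way to do a plain-text (non-regex) replacement is awk's index() and substr(), which treat the search text literally. The sketch below is not the original author's method; the function name literal_replace is hypothetical, and it assumes the replacement text (for example a "data:" URI) has already been written to a file so that large base64 payloads do not have to pass through the command line.

```shell
#!/bin/sh
# Hypothetical helper: replace every literal occurrence of a string on stdin.
# $1 = literal text to find, $2 = file holding the replacement text.
literal_replace() {
    OLD="$1" REPFILE="$2" awk '
        BEGIN {
            old = ENVIRON["OLD"]; n = length(old)
            repfile = ENVIRON["REPFILE"]
            # Read the replacement from the file, joining its lines.
            while ((getline chunk < repfile) > 0) rep = rep chunk
        }
        {
            out = ""; line = $0
            # index() does a literal search, so "(" and "\" need no escaping.
            while (n > 0 && (p = index(line, old)) > 0) {
                out = out substr(line, 1, p - 1) rep
                line = substr(line, p + n)
            }
            print out line
        }'
}

# Example with a file name that would break a regex-based sed replacement.
tmp=$(mktemp)
printf 'data:image/png;base64,AAAA' > "$tmp"
printf '%s\n' '<img src="D:\images\logo.png"/>' | literal_replace 'D:\images\logo.png' "$tmp"
rm -f "$tmp"
```

    The values are passed through the environment rather than with -v because awk would otherwise interpret backslash escapes in them.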
  • Sometimes you want to work on data sheets using heirloom Unix commands like cut, paste, sed, sort, wc and good old awk, but your user works in Microsoft Excel. The idea: 1) ask your user to save the spreadsheet as "Unicode Text" from Microsoft Excel and send it to you; 2) use the given command to convert it to UTF-8 text. We carefully convert "\r\n" to the local end-of-line character, and convert "\n" (which in Excel means a line break within a table cell) to "\r", which is a carriage return but not an end-of-line in Unix. If "\n" is not replaced with "\r", then wc -l, for example, will report an incorrect number of rows.


    0
    iconv -f UTF16LE -t UTF-8 < SOURCE | awk 'BEGIN { RS="\r\n";} { gsub("\n", "\r"); print;}' > TARGET
    zhangweiwu · 2010-04-04 07:16:57 0
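    The effect on the row count can be checked with a small synthetic input. This is a sketch under assumptions: the sample data is made up, the function name excel_text_to_utf8 is hypothetical, and a multi-character RS is an extension supported by gawk and mawk rather than something POSIX awk guarantees.

```shell
#!/bin/sh
# Hypothetical wrapper around the conversion pipeline, reading stdin.
excel_text_to_utf8() {
    iconv -f UTF-16LE -t UTF-8 |
        awk 'BEGIN { RS = "\r\n" } { gsub("\n", "\r"); print }'
}

# Two rows; the second row has a line break inside its first cell.
printf 'a\tb\r\nc\nd\te\r\n' |
    iconv -f UTF-8 -t UTF-16LE |
    excel_text_to_utf8 |
    wc -l
```

    Without the gsub step the in-cell "\n" would survive and wc -l would count three lines for these two rows.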
  • The command gives the total size of all files smaller than 1024k. This information, together with disk usage, can help determine file system parameters (e.g. block size) or the choice of storage device (e.g. SSD vs. HDD). Note that if you use awk instead of "cut | dc", you can easily exceed the maximum number of records awk allows.


    1
    find dir -size -1024k -type f | xargs -d $'\n' -n1 ls -l | cut -d ' ' -f 5 | sed -e '2,$s/$/+/' -e '$ap' | dc
    zhangweiwu · 2009-12-28 04:23:01 1
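    A variant that avoids parsing ls output could look like the sketch below. It assumes GNU find, whose -printf '%s\n' prints each file's size in bytes, and sums with shell arithmetic; the function name sum_small_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: total size in bytes of regular files under $1
# that find considers smaller than 1024k.
sum_small_files() {
    total=0
    # -printf is a GNU find extension; %s is the file size in bytes.
    for size in $(find "$1" -type f -size -1024k -printf '%s\n'); do
        total=$(( total + size ))
    done
    echo "$total"
}
```

    It would be called as sum_small_files dir, matching the directory argument of the command here.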
