Commands by zhangweiwu (9)

  • The exported TSV file of Google Adwords' first five columns are text, they usually should collapse into one cell, a multi-line text cell, but there is no guaranteed way to represent line-break within cells for .tsv file format, thus Google split it to 5 columns. The problem is, with 5 columns of text, there are hardly space to put additional fields while maintain printable output. This script collapses the first five columns of each row into one single multi-line text cell, for console output or direct send to printer.


    -1
    awk -F $'\t' '{printf $1 LS $2 LS $3 LS $4 LS $5; for (i = 7; i < NF; i++) printf $i "\t"; printf "\n--\n";}' LS=$'\n' 'Ad report.tsv' | column -t -s $'\t'
    zhangweiwu · 2011-02-28 10:52:16 5
  • The exported TSV file of Google Adwords' first five columns are text, they usually should collapse into one cell, a multi-line text cell, but there is no guaranteed way to represent line-break within cells for .tsv file format, thus Google split it to 5 columns. The problem is, with 5 columns of text, there are hardly space to put additional fields while maintain printable output. This script collapses the first five columns of each row into one single multi-line text cell. new line character we use Line-Separator character (unicode U+2028), which is respected by gnumeric. It outputs a new .tsv file that opens in gnumeric.


    0
    awk -F $'\t' '{printf $1 LS $2 LS $3 LS $4 LS $5; for (i = 7; i < NF; i++) printf $i "\t"; printf "\n";}' LS=`env printf '\u2028'` 'Ad report.tsv'
    zhangweiwu · 2011-02-28 10:48:46 4
  • To rip DVD movie to ogg format using ffmpeg, follow these steps. 1) find the vob files on the mounted video DVD in VIDEO_TS that stores the movie itself. There would be a few other VOB files that stores splash screen or special features, the vob files for the movie itself can be identified by its superior size. You can verify these vob files by playing them directly with a player (e.g. mplayer) 2) concatenate all such vob files, pipe to ffmpeg 3) calculate the video size and crop size. The ogg video size must be multiple of 16 on both width and height, this is inherit limitation of theora codec. In my case I took 512x384. The -vcodec parameter is necessary because ffmpeg doesn't support theora by itself. -acodec is necessary otherwise ffmpeg uses flac by default.


    5
    cat VIDEO_TS/VTS_01_[1234].VOB | nice ffmpeg -i - -s 512x384 -vcodec libtheora -acodec libvorbis ~/Videos/dvd_rip.ogg
    zhangweiwu · 2010-09-14 14:45:27 26
  • This command is suitable to use as application launching command for a desktop shortcut. It checks if the application is already running by pgrepping its process ID, and offer user to kill the old process before starting a new one. It is useful for a few x11 application that, if re-run, is more likely a mistake. In my example, x2vnc is an x11 app that does not quit when its connection is broken, and would not work well when a second process establish a second connection after the first broken one. The LC_ALL=C for xmesseng is necessary for OpenSUSE systems to avoid a bug. If you don't find needing it, remove the "env LC_ALL=C" part


    0
    sh -c 'if pgrep x2vnc && env LC_ALL=C xmessage -button "Kill it:0,Ignore it:1" "Another connection is already running. Should I kill it instead of ignoring it?"; then killall x2vnc; fi; x2vnc -passwd /home/Ariel/.vnc/passwd -east emerson:0'
    zhangweiwu · 2010-07-06 09:11:12 5
  • You might want to check what file and directory names would be renamed or chopped if you create iso 9660 level 2 image out of them. Use this command to check first. Show Sample Output


    1
    find . -regextype posix-extended -not -regex '.*/[A-Za-z_]*([.][A-Za-z_]*)?'
    zhangweiwu · 2010-06-25 00:27:09 4
  • This is useful when you got a reserved IP address like 192.168.0.100 and want to find out what IP address is used to access the Internet. You have to know a server with 'efingerd -n' configured, like www.linuxbanks.cn as above. Other method to find out this information are for example access www.tell-my-ip.com and grep the output. The finger method have the advantage that it is easy to deploy a service like www.tell-my-ip.com, as you only need to get efingerd installed.


    -4
    finger @www.linuxbanks.cn | grep -oE '([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}' | head -n1
    zhangweiwu · 2010-05-05 14:58:55 7
  • in "a.html", find all images referred as relative URI in an HTML file by "src" attribute of "img" element, replace them with "data:" URI. This useful to create single HTML file holding all images in it, as a replacement of the IE-created .mht file format. The generated HTML works fine on every other browser except IE, as well as many HTML editors like kompozer, while the .mht format only works for IE, but not for every other browser. Compare to the KDE's own single-file-web-page format "war" format, which only opens correctly on KDE, the HTML file with "data:" URI is more universally supported. The above command have many bugs. My commandline-fu is too limited to fix them: 1. it assume all URLs are relative URIs, thus works in this case: <img src="images/logo.png"/> but does not work in this case: <img src="http://www.my_web_site.com/images/logo.png" /> This may not be a bug, as full URIs perhaps should be ignored in many use cases. 2. it only work for images whoes file name suffix is one of .jpg, .gif, .png, albeit images with .jpeg suffix and those without extension names at all are legal to HTML. 3. image file name is not allowed to contain "(" even though frequently used, as in "(copy of) my car.jpg". Besides, neither single nor double quotes are allowed. 4. There is infact a big flaw in this, file names are actually used as regular expression to be replaced with base64 encoded content. This cause the script to fail in many other cases. Example: 'D:\images\logo.png', where backward slash have different meaning in regular expression. I don't know how to fix this. I don't know any command that can do full text (no regular expression) replacement the way basic editors like gedit does. 5. The original a.html are not preserved, so a user should make a copy first in case things go wrong.


    4
    grep -ioE "(url\(|src=)['\"]?[^)'\"]*" a.html | grep -ioE "[^\"'(]*.(jpg|png|gif)" | while read l ; do sed -i "s>$l>data:image/${l/[^.]*./};base64,`openssl enc -base64 -in $l| tr -d '\n'`>" a.html ; done;
    zhangweiwu · 2010-05-05 14:07:51 15
  • Sometimes you want to work on data sheets by using heirloom unic commands like cut, paste, sed, sort, wc and good old awk. But your user works on Microsoft Excel spreadsheet. The idea: 1) ask your user to save it as "Unicode Text" from Microsoft Excel and send the document to you; 2) use the given command to convert it to UTF-8 text. We carefully convert "\r\n" to local end-of-line character; and to convert "\n" (in Excel, means linebreak within the table cell") to "\r", which is carrier return but not end-of-line in Unix. If the "\n" is not replaced with "\r", for example, wc -l will report incorrect column number.


    0
    iconv -f UTF16LE -t UTF-8 < SOURCE | awk 'BEGIN { RS="\r\n";} { gsub("\n", "\r"); print;}' > TARGET
    zhangweiwu · 2010-04-04 07:16:57 4
  • The command gives size of all files smaller than 1024k, this information, together with disk usage, can help determin file system parameter (e.g. block size) or storage device (e.g. SSD v.s. HDD). Note if you use awk instead of "cut| dc", you easily breach maximum allowed number of records in awk. Show Sample Output


    1
    find dir -size -1024k -type f | xargs -d $'\n' -n1 ls -l | cut -d ' ' -f 5 | sed -e '2,$s/$/+/' -e '$ap' | dc
    zhangweiwu · 2009-12-28 04:23:01 6

What's this?

commandlinefu.com is the place to record those command-line gems that you return to again and again. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.

Share Your Commands


Check These Out

Bitcoin prices from the command line

Create sqlite db and store image
Creates a simple sqlite db (img.db) inserts /tmp/Q.jpg on base64 an recorvers it as /tmp/W.jpg

Simple server which listens on a port and prints out received data
Sometimes you need a simple server which listens on a port and prints out received data. Example: Consider you want to know, which data is posted by a homepage to a remote script without analysing the html code! A simple way to do this is to save the page to your computer, substitude all action="address" with action="localhost:portnumber", run 'ncat -l portnumber' and open the edited page with your browser. If you then submit the form, ncat will print out the http-protocol with all the posted data.

Benchmark SQL Query
Benchmark a SQL query against MySQL Server. The example runs the query 10 times, and you get the average runtime in the output. To ensure that the query does not get cached, use `RESET QUERY CACHE;` on top in the query file.

clear MyDNS-ng cache
Occasionally, to force zone updating, cache flush is necessary. This command is better than restart the mydns daemon.

Find files that are older than x days
Find files that are older than x days in the working directory and list them. This will recurse all the sub-directories inside the working directory. By changing the value for -mtime, you can adjust the time and by replacing the ls command with, say, rm, you can remove those files if you wish to.

put an unpacked .deb package back together
If any changes have been made to the package while it was unpacked (ie, conffiles files in /etc modi‐fied), the new package will inherit the changes. This way you can make it easy to copy packages from one computer to another, or to recreate packages that are installed on your system, but no longer available elsewhere. Note: dpkg-repack will place the created package in the current directory.

Cut flv video from minute 19 to minute 20 using flvtool2
We should calculate the video duration to milliseconds, 1 minute = 60000 milliseconds

Restart openssh-server on your Synology NAS from commandline.
The correct way to restart openssh-server on your synology nas.

Advanced python tracing
Trace python statement execution and syscalls invoked during that simultaneously


Stay in the loop…

Follow the Tweets.

Every new command is wrapped in a tweet and posted to Twitter. Following the stream is a great way of staying abreast of the latest commands. For the more discerning, there are Twitter accounts for commands that get a minimum of 3 and 10 votes - that way only the great commands get tweeted.

» http://twitter.com/commandlinefu
» http://twitter.com/commandlinefu3
» http://twitter.com/commandlinefu10

Subscribe to the feeds.

Use your favourite RSS aggregator to stay in touch with the latest commands. There are feeds mirroring the 3 Twitter streams as well as for virtually every other subset (users, tags, functions,…):

Subscribe to the feed for: