xz -dc bigdata.xz | complicated-processing-program > summary
.
This command uses lsof to see how much data xz has read from the file.
lsof -o0 -o -Fo FILENAME
Display offsets (-o), in decimal (-o0), in parseable form (-Fo)
This will output something like:
.
p12607
f3
o0t45187072
.
Process id (p), File Descriptor (f), Offset (o)
.
We stat the file to get its size
stat -c %s FILENAME
.
Then we plug the values into awk.
Split the line at the letter t: -Ft
Define a variable for the file's size: -s=$(stat...)
Only work on the offset line: /^o/
.
Note this command was tested using the Linux version of lsof.
Because it uses lsof's batch option (-F) it may be portable.
.
Thanks to @unhammer for the brilliant idea.
24%
Say you're started "xzcat bigdata.xz | complicated-processing-program >summary" an hour ago, and you of course forgot to enable progress output (you could've just put "awk 'NR%1000==0{print NR>"/dev/stderr"}{print}'" in the pipeline but it's too late for that now). But you really want some idea of how far along your program is. Then you can run the above command to see how many % along xzcat is in reading the file. Note that this is for the GNU/Linux version of lsof; the one found on e.g. Darwin has slightly different output so the awk part may need some tweaks. Show Sample Output
Any thoughts on this command? Does it work on your machine? Can you do the same thing with only 14 characters?
You must be signed in to comment.
commandlinefu.com is the place to record those command-line gems that you return to again and again. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.
Every new command is wrapped in a tweet and posted to Twitter. Following the stream is a great way of staying abreast of the latest commands. For the more discerning, there are Twitter accounts for commands that get a minimum of 3 and 10 votes - that way only the great commands get tweeted.
» http://twitter.com/commandlinefu
» http://twitter.com/commandlinefu3
» http://twitter.com/commandlinefu10
Use your favourite RSS aggregator to stay in touch with the latest commands. There are feeds mirroring the 3 Twitter streams as well as for virtually every other subset (users, tags, functions,…):
Subscribe to the feed for:
pv -L 10000 bigfile.xz > /dev/null
.F=bigdata.xz; lsof -o0 -o -Fo $F | awk -Ft -v s=$(stat -c %s $F) '/^o/{printf("%d%%\n", 100*$2/s)}'
3% ... If lsof gives warning messages, you can quieten them:...; lsof ... 2> /dev/null | ...