Commands tagged html sorted by votes

Commands tagged html (27)

sorted by

Save an HTML page, and covert it to a .pdf file

Uses htmldoc to perform the conversion
This is sample output - yours may be different.
18

wget $URL | htmldoc --webpage -f "$URL".pdf - ; xpdf "$URL".pdf &

darth10 · 2009-06-07 23:49:22 18
Quick HTML image gallery from folder contents

Setting: You have a lot of jpg files in a directory. Maybe your public_html folder which is readable on the net because of Apache's mod_userdir. All those files from the current folder will be dropped into a file called gallery.html as image tags that can be viewed within a web browser locally or or over the Internet. Original: find . -iname "*.jpg" -exec echo "<img src=\"{}\">" >> gallery.html \;
This is sample output - yours may be different.
13

find . -iname '*.jpg' -exec echo '<img src="{}">' \; > gallery.html

Schneckentreiber · 2010-07-03 16:36:15 7
Extract title from HTML files

Case Insensitive! and Works even if the "<title>...</title>" spans over multiple line. Simple! :-) Show Sample Output
This is sample output - yours may be different.
```
Extract title from HTML files | commandlinefu.com
```
4

awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' file.html

sata · 2010-04-20 10:54:03 5
embed referred images in HTML files

in "a.html", find all images referred as relative URI in an HTML file by "src" attribute of "img" element, replace them with "data:" URI. This useful to create single HTML file holding all images in it, as a replacement of the IE-created .mht file format. The generated HTML works fine on every other browser except IE, as well as many HTML editors like kompozer, while the .mht format only works for IE, but not for every other browser. Compare to the KDE's own single-file-web-page format "war" format, which only opens correctly on KDE, the HTML file with "data:" URI is more universally supported. The above command have many bugs. My commandline-fu is too limited to fix them: 1. it assume all URLs are relative URIs, thus works in this case: <img src="images/logo.png"/> but does not work in this case: <img src="http://www.my_web_site.com/images/logo.png" /> This may not be a bug, as full URIs perhaps should be ignored in many use cases. 2. it only work for images whoes file name suffix is one of .jpg, .gif, .png, albeit images with .jpeg suffix and those without extension names at all are legal to HTML. 3. image file name is not allowed to contain "(" even though frequently used, as in "(copy of) my car.jpg". Besides, neither single nor double quotes are allowed. 4. There is infact a big flaw in this, file names are actually used as regular expression to be replaced with base64 encoded content. This cause the script to fail in many other cases. Example: 'D:\images\logo.png', where backward slash have different meaning in regular expression. I don't know how to fix this. I don't know any command that can do full text (no regular expression) replacement the way basic editors like gedit does. 5. The original a.html are not preserved, so a user should make a copy first in case things go wrong.
This is sample output - yours may be different.
4

grep -ioE "(url\(|src=)['\"]?[^)'\"]*" a.html | grep -ioE "[^\"'(]*.(jpg|png|gif)" | while read l ; do sed -i "s>$l>data:image/${l/[^.]*./};base64,`openssl enc -base64 -in $l| tr -d '\n'`>" a.html ; done;

zhangweiwu · 2010-05-05 14:07:51 13
Check if a web page has changed last time checked.

Checks if a web page has changed. Put it into cron to check periodically. Change http://www.page.de/test.html and mail@mail.de for your needs.
This is sample output - yours may be different.
4

HTMLTEXT=$( curl -s http://www.page.de/test.html > /tmp/new.html ; diff /tmp/new.html /tmp/old.html ); if [ "x$HTMLTEXT" != x ] ; then echo $HTMLTEXT | mail -s "Page has changed." mail@mail.de ; fi ; mv /tmp/new.html /tmp/old.html

Emzy · 2010-07-04 21:45:37 5
Extract title from HTML files

This command can be used to extract the title defined in HTML pages
This is sample output - yours may be different.
3

sed -n 's/.*<title>$.*$<\/title>.*/\1/ip;T;q' file.html

octopus · 2010-04-19 07:41:10 4
Email HTML content

Note, this works because smtp is running
This is sample output - yours may be different.
2

mailx bar@foo.com -s "HTML Hello" -a "Content-Type: text/html" < body.htm

ethanmiller · 2009-05-19 04:49:26 5
List out classes in of all htmls in directory

Lists out all classes used in all *.html files in the currect directory. usefull for checking if you have left out any style definitions, or accidentally given a different name than you intended. ( I have an ugly habit of accidentally substituting camelCase instead of using under_scores: i would name soemthing counterBox instead of counter_box) WARNING: assumes you give classnames in between double quotes, and that you apply only one class per element.
This is sample output - yours may be different.
2

find . -name '*.html' -exec 'sed' 's/.*class="$[^"]*\?$".*/\1/ip;d' '{}' ';' |sort -su

kamathln · 2009-09-06 18:43:18 8
Grab just the title of a youtube video

There's another version on here that uses GET but some people don't have lwp-request, so here's an alternative. It's also a little shorter and should work with most youtube URLs since it truncates at the first &
This is sample output - yours may be different.
2

url="[Youtube URL]"; echo $(curl ${url%&*} 2>&1 | grep -iA2 '<title>' | grep '-') | sed 's/^- //'

rkulla · 2010-04-29 02:03:36 4
Quick HTML image gallery

My take on the original: even though I like the other's use of -exec echo, sed just feels more natural. This should also be slightly easier to improve. I expanded this into a script as an exercise, which took about 35 minutes (had to look up some docs): http://bitbucket.org/kniht/nonsense/src/7c1b46488dfc/commandlinefu/quick_image_gallery.py
This is sample output - yours may be different.
2

find . -iname '*.jpg' | sed 's/.*/<img src="&">/' > gallery.html

kniht · 2010-07-04 00:50:32 8
Get all links from commandlinefu front page

You need to install WWW::Mechanize Perl module with # cpan -i WWW::Mezchanize or by searching mechanize | grep perl in your package manager With this command, you can get forms, images, headers too Show Sample Output
This is sample output - yours may be different.
```
http://www.commandlinefu.com/commands/using/stat
http://www.commandlinefu.com/commands/tagged/101/size
http://www.commandlinefu.com/commands/tagged/182/file
(...)
```
2

mech-dump --links --absolute http://www.commandlinefu.com

sputnick · 2011-11-19 03:40:52 18
Quick HTML image gallery

More compact and direct.
This is sample output - yours may be different.
2

find . -iname "*.jpg" -printf '<img src="%f" title="%f">\n' > gallery.html

unixmonkey28233 · 2012-01-04 14:20:21 3
parse html/stdin with lynx

strips html from stdin Show Sample Output
This is sample output - yours may be different.
```
some_html_generating_program | html2ascii
```
2

alias html2ascii='lynx -force_html -stdin -dump -nolist'

oernii2 · 2012-04-12 14:02:44 291
Extract title from HTML files

previous version leaves lots of blank lines
This is sample output - yours may be different.
1

awk 'BEGIN{IGNORECASE=1;FS="<title>|</title>";RS=EOF} {print $2}' | sed '/^$/d' > file.html

tamouse · 2010-04-20 13:27:47 4
Extract title from HTML files

not the best, uses 4 pipes!
This is sample output - yours may be different.
1

tr -d "\n\r" | grep -ioEm1 "<title[^>]*>[^<]*</title" | cut -f2 -d\> | cut -f1 -d\<

bandie91 · 2010-04-20 18:55:24 3
Quick HTML image gallery from folder contents with Perl

This includes a title attribute so you can see the file name by hovering over an image. Also will hoover up any image format - jpg, gif and png.
This is sample output - yours may be different.
1

find . | perl -wne 'chomp; print qq|<img src="$_" title="$_" /><br />| if /\.(jpg|gif|png)$/;'> gallery.html

spotrick · 2010-07-04 01:43:50 4
Get all files of particular type (say, PDF) listed on some wegpage (say, example.com)

This example command fetches 'example.com' webpage and then fetches+saves all PDF files listed (linked to) on that webpage. [*Note: of course there are no PDFs on example.com. This is just an example]
This is sample output - yours may be different.
1

curl -s http://example.com | grep -o -P "<a.*href.*>" | grep -o "http.*.pdf" | xargs -d"\n" -n1 wget -c

b_t · 2011-06-09 14:42:46 5
Create thumbnails and a HTML page for listing them (with links to sources)

The input images are assume to have the "JPG" extension. Mogrify will overwrite any gif images with the same name! Will not work with names with spaces. Show Sample Output
This is sample output - yours may be different.
```
<ul><li><a href='001_9440846553_ohm20131.JPG'><img src='001_9440846553_ohm20131.gif'></a>
```
1

mogrify -format gif -auto-orient -thumbnail 250x90 '*.JPG'&&(echo "<ul>";for i in *.gif;do basename=$(echo $i|rev|cut -d. -f2-|rev);echo "<li style='display:inline-block'><a href='$basename.JPG'><img src='$basename.gif'></a>";done;echo "</ul>")>list.html

ysangkok · 2013-08-25 20:45:49 7
Display sqlite results one column per line

Similar output to using MySQL with the \G at the end of a Query. Displays one column per line. Other modes include: -column Query results will be displayed in a table like form, using whitespace characters to separate the columns and align the output. -html Query results will be output as simple HTML tables. -line Query results will be displayed with one value per line, rows separated by a blank line. Designed to be easily parsed by scripts or other programs -list Query results will be displayed with the separator (|, by default) character between each field value. The default. From inside the command line this can be also changed using the mode command: .mode MODE ?TABLE? Set output mode where MODE is one of: csv Comma-separated values column Left-aligned columns. (See .width) html HTML code insert SQL insert statements for TABLE line One value per line list Values delimited by .separator string tabs Tab-separated values tcl TCL list elements Show Sample Output
This is sample output - yours may be different.
```
$ sqlite3 -line blackhatseo.sqlite 
SQLite version 3.6.22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from auth_user;
          id = 1
    username = blabla
  first_name = 
   last_name = 
       email = blablamailinator.com
    password = blabla
    is_staff = 1
   is_active = 1
is_superuser = 1
  last_login = 2010-10-09 12:48:07.493163
 date_joined = 2010-10-09 12:48:07.493163
```
0

sqlite3 -line database.db

pykler · 2010-10-09 16:10:19 8
convert html links into plain text with link anchor

reverse of my previous command 10006 Show Sample Output
This is sample output - yours may be different.
```
links

<a href="http://www.domain.com">anchor</a> becomes http://www.domain.com,anchor
```
0

sed 's!<[Aa] *href*=*"$[^"]*$"*>$[^<>]*$</[Aa]>!\1,\2!g' links.html

chrismccoy · 2012-01-30 15:11:22 3
extract XML RSS etc by tags such as <title> or <code> or <description>

set BLOCK to "title" or any other HTML / RSS / XML tag and curl URL to get everything in-between e.g. some text
This is sample output - yours may be different.
0

curl ${URL} 2>/dev/null|grep "<${BLOCK}>"|sed -e "s/.*\<${BLOCK}\>$.*$\<\/${BLOCK}\>.*/\1/g"

c3w · 2013-08-31 14:53:54 0

Highlight the plain text in XML (or HTML, SGML, etc)

Don't want to open up an editor just to view a bunch of XML files in an easy to read format? Now you can do it from the comfort of your own command line! :-) This creates a new function, xmlpager, which shows an XML file in its entirety, but with the actual content (non-tag text) highlighted. It does this by setting the foreground to color #4 (red) after every tag and resets it before the next tag. (Hint: try `tput bold` as an alternative). I use 'xmlindent' to neatly reflow and indent the text, but, of course, that's optional. If you don't have xmlindent, just replace it with 'cat'. Additionally, this example shows piping into the optional 'less' pager; note the -r option which allows raw escape codes to be passed to the terminal. Show Sample Output

xmlpager() { xmlindent "$@" | awk '{gsub(">",">'`tput setf 4`'"); gsub("<","'`tput sgr0`'<"); print;} END {print "'`tput sgr0`'"}' | less -r; }

hackerb9 · 2015-07-12 09:22:10 11

Powershell Curl Logs Signal Strength of Cable Modem

IMPORTANT: You need Windows PowerShell to run this command - in your Windows Command Prompt, type powershell Create a log file of your Motorola Surfboard SB6141 downstream signal strengths. Uses the built-in curl to request signal strength data from your SB6141 cable modem. HTML page 192.168.100.1/cmSignalData.htm has the signal strength numbers for the 8 downstreams. Some HTML/DOM processing parses out the 8 values from the above page. The eight extracted signal strengths are then logged to a file. A small while-loop watches the clock & repeats the process every 10 seconds. Show Sample Output
This is sample output - yours may be different.
```
PS C:\Users\user> type modemlog.txt
7:39:20 PM 1 dBmV  1 dBmV  1 dBmV  1 dBmV  0 dBmV  0 dBmV  -1 dBmV  -1 dBmV
7:39:30 PM 1 dBmV  1 dBmV  1 dBmV  1 dBmV  0 dBmV  0 dBmV  -1 dBmV  -1 dBmV
7:39:40 PM 1 dBmV  1 dBmV  1 dBmV  1 dBmV  0 dBmV  0 dBmV  -1 dBmV  -1 dBmV
7:39:50 PM 1 dBmV  1 dBmV  1 dBmV  1 dBmV  0 dBmV  0 dBmV  -1 dBmV  -1 dBmV
7:40:00 PM 1 dBmV  1 dBmV  1 dBmV  1 dBmV  0 dBmV  0 dBmV  -1 dBmV  -1 dBmV
7:40:10 PM 1 dBmV  1 dBmV  1 dBmV  1 dBmV  0 dBmV  0 dBmV  -1 dBmV  -1 dBmV
PS C:\Users\user>
```
0

while(1){while((date -f ss)%10-gt0){sleep -m 300} echo "$(date -u %s) $((curl 192.168.100.1/cmSignalData.htm).parsedhtml.body.childnodes.item(1).firstchild.firstchild.childnodes.item(5).outertext|%{$_ -replace '\D+\n',''})">>modemlog.txt;sleep 1;echo .}

omap7777 · 2015-12-24 02:12:10 11
Remove scripts tags from *.html and *.htm files under the current directory
This is sample output - yours may be different.
0

find ./ -type f $ -iname '*.html' -or -iname '*.htm' $ -exec sed -i '/<script/,/<\/script>/d' '{}' \;

mikhail · 2019-05-10 23:27:12 152
Check out Gate number for your flight from CLI with Chrome

Check out Gate number for your flight from CLI with Chrome, html2texgt and grep. Works on Arch Linux (Garuda) and probably will work on others. Requirements: * google chrome (might work with chromium as well) * installed html2text (on archlinux: sudo pacman -S python-html2text) * installed grep (comes by default with your OS) * the gate number should be visible at the given website (it's not existent too early before the flight and also disappears after the flight departed) Please don't forget to replace the link to appropriate one, matching your flight. You can also wrap this into something like `whlie true; do ...; sleep 60; done' and this will check and tell you the gate number maximum in 1 minute after it appears on Avinor website. Show Sample Output
This is sample output - yours may be different.
```
Gate

A14
```
0

google-chrome-stable --headless --dump-dom --disable-gpu "https://avinor.no/flight/?flightLegId=dy754-osl-trd-20220726&airport=OSL" 2>/dev/null | html2text | grep -A2 Gate

sxiii · 2022-07-26 11:50:59 395
1 2 >

What's this?

commandlinefu.com is the place to record those command-line gems that you return to again and again. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.

Share Your Commands

Check These Out

Show all current listening programs by port and pid with SS instead of netstat

Hardlink all identical files in the current directory (regain some disk space)

Meaning of switches (see man page too): v verbose p ignore mode (permissions) o ignore owner, group t ignore time of modification Disadvantage: If you modify any linked file, this will propagate to all other files which occupy the same space.

Retrieve the size of a file on a server

- Where $URL is the URL of the file. - Replace the $2 by $3 at the end to get a human-readable size. Credits to svanberg @ ArchLinux forums for original idea. Edit: Replaced command with better version by FRUiT. (removed unnecessary grep)

Rip DVD to YouTube ready MPEG-4 AVI file using mencoder

Rip DVD to YouTube ready AVI file, using MPEG-4 video codec and MP3 audio codec. Resizes to 320x240 and deinterlaces as needed.

get only time of execution of a command without his output

grep lines containing two consecutive hyphens

for all flv files in a dir, grab the first frame and make a jpg.

This is handy for making screenshots of all your videos for referring to in your flv player.

Which processes are listening on a specific port (e.g. port 80)

swap out "80" for your port of interest. Can use port number or named ports e.g. "http"

Advanced python tracing

Trace python statement execution and syscalls invoked during that simultaneously

Send remote command output to your local clipboard

This command will copy command's output into your local clipboard

Stay in the loop…

Follow the Tweets.

Every new command is wrapped in a tweet and posted to Twitter. Following the stream is a great way of staying abreast of the latest commands. For the more discerning, there are Twitter accounts for commands that get a minimum of 3 and 10 votes - that way only the great commands get tweeted.

» http://twitter.com/commandlinefu
» http://twitter.com/commandlinefu3
» http://twitter.com/commandlinefu10

Subscribe to the feeds.

Use your favourite RSS aggregator to stay in touch with the latest commands. There are feeds mirroring the 3 Twitter streams as well as for virtually every other subset (users, tags, functions,…):

Subscribe to the feed for:

» all commands
» commands with 3 up-votes (commandlinefu3)
» commands with 10 up-votes (commandlinefu10)
» commands tagged html

Commands tagged html (27) the last day the last week the last month all time sorted by date votes

What's this?

Check These Out

Stay in the loop…

Commands tagged html (27)

sorted by