Commands tagged wget (92)

  • See man wget if you want linked files and not only those hosted on the website.


    8
    wget -r -A .pdf -l 5 -nH --no-parent http://example.com
    houghi · 2011-06-09 17:17:03 1
  • This example command fetches the 'example.com' webpage and then fetches and saves all PDF files linked from that webpage; a commented, multi-line layout of the pipeline follows this entry. [*Note: of course there are no PDFs on example.com. This is just an example.]


    1
    curl -s http://example.com | grep -o -P "<a.*href.*>" | grep -o "http.*.pdf" | xargs -d"\n" -n1 wget -c
    b_t · 2011-06-09 14:42:46 1
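
    A commented, multi-line layout of the same pipeline (the URL is the same placeholder; the pattern assumes absolute links ending in .pdf, just as the one-liner does):

        page="http://example.com"
        # keep only the anchor tags, pull out the absolute .pdf links, then fetch each one (resuming partial downloads)
        curl -s "$page" |
          grep -o -P "<a.*href.*>" |
          grep -o "http.*.pdf" |
          xargs -d "\n" -n 1 wget -c
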
  • This will recursively visit all linked URLs, starting from the specified URL. It won't save anything locally, and it will produce a detailed log. Useful for finding broken links on your site. It ignores robots.txt, so only use it on a site you own!


    0
    wget --spider -o wget.log -e robots=off --wait 1 -r -p http://www.example.com
    lele · 2011-04-05 13:42:14 1
  • On a machine behind a firewall, it's possible to pass the proxy server address in as a prefix to wget to avoid having to set it as an environment variable first.


    0
    http_proxy=<proxy.server:port> wget <url>
    rdc · 2011-03-30 13:06:19 0
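
    For example, with a made-up proxy host and port, a one-off download through the proxy looks like this (the variable only applies to that single command):

        http_proxy=proxy.example.com:3128 wget http://example.com/file.tar.gz
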

  • -2
    wget --mirror -A.jpg http://www.xs4all.nl/~dassel/wall/
    madrasi · 2011-02-08 03:15:35 0
  • yt2mp3(){ for j in `seq 1 301`;do i=`curl -s gdata.youtube.com/feeds/api/users/$1/uploads\?start-index=$j\&max-results=1|grep -o "watch[^&]*"`;ffmpeg -i `wget youtube.com/$i -qO-|grep -o 'url_map"[^,]*'|sed -n '1{s_.*|__;s_\\\__g;p}'` -vn -ab 128k "`youtube-dl -e ${i#*=}`.mp3";done;} This squeezes the monster (and nifty ☺) command from #7776 from 531 characters down to 284, but I don't see a way to get it under 255. This is definitely a kludge!


    0
    Command in description (Your command is too long - please keep it to less than 255 characters)
    __ · 2011-02-03 08:25:42 1
  • Substitute that 724349691704 with the UPC of a CD you have at hand, and (hopefully) this one-liner should return the $Artist - $Title, querying discogs.com. Yes, I know, all that head/tail/grep crap can be improved with a single sed command; feel free to send "patches" :D (one such sketch follows this entry). Enjoy!


    -1
    wget http://www.discogs.com/search?q=724349691704 -O foobar &> /dev/null ; grep \/release\/ foobar | head -2 | tail -1 | sed -e 's/^<div>.*>\(.*\)<\/a><\/div>/\1/' ; rm foobar
    TetsuyO · 2011-01-30 23:34:54 0
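
    In the spirit of those "patches", a rough sketch that drops the temporary file and folds the head/tail/sed steps into a single sed invocation; it assumes the result page still uses the same <div>...</a></div> markup the original expects:

        wget -qO- 'http://www.discogs.com/search?q=724349691704' 2>/dev/null | grep '/release/' | sed -n '2s/^<div>.*>\(.*\)<\/a><\/div>/\1/p'
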
  • Yep, now you can finally google from the command line! Here's a readable version "for your pleasure"(c):

        google() {
          # search the web using google from the commandline
          # syntax: google query
          query=$(echo "$*" | sed "s:%:%25:g;s:&:%26:g;s:+:%2b:g;s:;:%3b:g;s: :+:g")
          data=$(wget -qO - "https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=$query")
          title=$(echo "$data" | tr '}' '\n' | sed "s/.*,\"titleNoFormatting//;s/\":\"//;s/\",.*//;s/\\u0026/'/g;s/\\\//g;s/#39\;//g;s/'amp;/\&/g" | head -1)
          url="$(echo "$data" | tr '}' '\n' | sed 's/.*"url":"//;s/".*//' | head -1)"
          echo "${title}: ${url} | http://www.google.com/search?q=${query}"
        }

    Enjoy :)


    -7
    The command is too big to fit here. :( Look at the description for the command, in readable form! :)
    hunterm · 2011-01-05 02:45:28 1
  • To learn more about Google Ngram Viewer: http://ngrams.googlelabs.com/info


    1
    wget -qO - http://ngrams.googlelabs.com/datasets | grep -E href='(.+\.zip)' | sed -r "s/.*href='(.+\.zip)'.*/\1/" | uniq | while read line; do `wget $line`; done
    sexyprout · 2010-12-20 17:46:04 0
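
    A slightly tidier take on the same idea, with the pattern quoted and wget reading the URL list from stdin; it assumes the page still lists the archives as href='....zip' links:

        wget -qO - http://ngrams.googlelabs.com/datasets | grep -Eo "href='[^']+\.zip'" | cut -d"'" -f2 | sort -u | wget -i -
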
  • Get a gzip-compressed web page using wget. Caution: the command will fail if the website doesn't return gzip-encoded content, though most websites support gzip nowadays; a sketch that falls back gracefully follows this entry.


    3
    wget -q -O- --header="Accept-Encoding: gzip" <url> | gunzip > out.html
    ashish_0x90 · 2010-11-27 22:14:42 0
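
    If the server might ignore the Accept-Encoding header, a small sketch like this (URL and file names are placeholders) avoids feeding plain HTML to gunzip:

        url="http://example.com/"
        wget -q -O- --header="Accept-Encoding: gzip" "$url" > page.raw
        # only gunzip when the response really is gzip data; otherwise keep it as-is
        if file page.raw | grep -qi 'gzip compressed'; then gunzip -c page.raw > out.html; else mv page.raw out.html; fi
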
  • Just added viewing with the eog image viewer.


    -1
    wget -O xkcd_$(date +%y-%m-%d).png `lynx --dump http://xkcd.com/|grep png`; eog xkcd_$(date +%y-%m-%d).png
    theanalyst · 2010-10-27 13:42:55 0
  • The command was too long for the command box, so here it is:

        echo $(( `wget -qO - http://i18n.counter.li.org/ | grep 'users registered' | sed 's/.*\<font size=7\>//g' | tr '\>' ' ' | sed 's/<br.*//g' | tr ' ' '\0'` + `curl --silent http://www.dudalibre.com/gnulinuxcounter?lang=en | grep users | head -2 | tail -1 | sed 's/.*<strong>//g' | sed 's/<\/strong>.*//g'` ))

    This took me about an hour to do. It uses both wget and curl because dudalibre.com blocks wget, while wget worked nicely for me on the other counter site.


    -1
    Check the description above.
    hunterm · 2010-10-07 04:22:32 0

  • -1
    wget -qO - http://i18n.counter.li.org/ | grep 'users registered' | sed 's/.*\<font size=7\>//g' | tr '\>' ' ' | sed 's/<br.*//g' | tr ' ' '\0'
    hunterm · 2010-10-07 03:19:17 0
  • I use this command in my Conky script to display the number of messages in my Gmail inbox and to list the from: and subject: fields.


    -3
    wget -O - 'https://USERNAMEHERE:PASSWORDHERE@mail.google.com/mail/feed/atom' --no-check-certificate
    PLA · 2010-09-26 14:47:13 3
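
    To show just the unread-message count in Conky, one possibility (assuming the Atom feed still carries a <fullcount> element, with the same placeholder credentials) is:

        wget -qO- --no-check-certificate 'https://USERNAMEHERE:PASSWORDHERE@mail.google.com/mail/feed/atom' | grep -o '<fullcount>[0-9]*</fullcount>' | grep -o '[0-9]*'
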
  • The "-k" flag tells wget to convert links for local browsing; it works both with mirroring (i.e. with "-r") and with single-file downloads. A combined example follows this entry.


    -2
    wget -k $URL
    minnmass · 2010-08-21 17:39:53 0
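
    For example, combining -k with a shallow recursive fetch and page requisites yields a copy that can be browsed offline (the URL is a placeholder):

        wget -r -l 1 -p -k http://example.com/
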
  • Extension to tali713's random fact generator. It takes the output and sends it to notify-osd. Display time is proportional to the length of the fact.


    2
    wget randomfunfacts.com -O - 2>/dev/null | grep \<strong\> | sed "s;^.*<i>\(.*\)</i>.*$;\1;" | while read FUNFACT; do notify-send -t $((1000+300*`echo -n $FUNFACT | wc -w`)) -i gtk-dialog-info "RandomFunFact" "$FUNFACT"; done
    mtron · 2010-04-02 09:43:32 1
  • Though without infinite time and knowledge of how the site will be designed in the future this may stop working, it still serves as a simple, straightforward starting point. This uses the observation that the only item marked as strong on the page is the single logical line that includes the italicized fact. If future revisions of the page show failure, or intermittent failure, one may simply alter the above to read:

        wget randomfunfacts.com -O - 2>/dev/null | tee lastfact | grep \<strong\> | sed "s;^.*<i>\(.*\)</i>.*$;\1;"

    The file lastfact can then be examined whenever the command fails.


    13
    wget randomfunfacts.com -O - 2>/dev/null | grep \<strong\> | sed "s;^.*<i>\(.*\)</i>.*$;\1;"
    tali713 · 2010-03-30 23:49:30 1
  • Bash script to test if a server is up; you can use this before wget'ing a file to make sure a blank one isn't downloaded. A --spider-based variant follows this entry.


    -1
    if [ "$(ping -q -c1 google.com)" ];then wget -mnd -q http://www.google.com/intl/en_ALL/images/logo.gif ;fi
    alf · 2010-03-23 04:15:03 7
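
    A variant of the same guard that probes the URL itself with wget's --spider mode instead of pinging the host (useful when ICMP is blocked but HTTP is not):

        if wget -q --spider http://www.google.com/intl/en_ALL/images/logo.gif; then wget -mnd -q http://www.google.com/intl/en_ALL/images/logo.gif; fi
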
  • This one uses dictionary.com


    13
    pronounce(){ wget -qO- $(wget -qO- "http://dictionary.reference.com/browse/$@" | grep 'soundUrl' | head -n 1 | sed 's|.*soundUrl=\([^&]*\)&.*|\1|' | sed 's/%3A/:/g;s/%2F/\//g') | mpg123 -; }
    matthewbauer · 2010-03-13 04:23:56 4
  • The original was a little bit too complicated for me. This one does not use any variables.


    5
    pronounce(){ wget -qO- $(wget -qO- "http://www.m-w.com/dictionary/$@" | grep 'return au' | sed -r "s|.*return au\('([^']*)', '([^'])[^']*'\).*|http://cougar.eb.com/soundc11/\2/\1|") | aplay -q; }
    matthewbauer · 2010-03-12 17:44:16 3
  • Looks up a word on merriam-webster.com, does a screen scrape for the FIRST audio pronunciation and plays it.

    USAGE: Put this one-liner into a shell script (e.g., ~/bin/pronounce) and run it from the command line, giving it the word to say:

        pronounce lek

    If the word isn't found in merriam-webster, no audio is played and the script returns an error value. However, M-W is a fairly complete dictionary (better than howjsay.com, which won't let you hear how to pronounce naughty words).

    ASSUMPTIONS: GNU's sed (which supports -r for extended regular expressions) and Linux's aplay. Aplay can be replaced by any program that can play .WAV files from stdin.

    KNOWN BUGS: Only the FIRST pronunciation is played, which is problematic if you wanted a particular form (plural, adjectival, etc.) of the word. For example, if you run this:

        pronounce onomatopoetic

    you'll hear a voice saying "onomatopoeia". Playing the correct form of the word is possible, but doing so might make the screen scraper even more fragile than it already is. (The slightest change to the format of m-w.com could break it.)


    3
    cmd=$(wget -qO- "http://www.m-w.com/dictionary/$(echo "$@"|tr '[A-Z]' '[a-z]')" | sed -rn "s#return au\('([^']+?)', '([^'])[^']*'\);.*#\nwget -qO- http://cougar.eb.com/soundc11/\2/\1 | aplay -q#; s/[^\n]*\n//p"); [ "$cmd" ] && eval "$cmd" || exit 1
    hackerb9 · 2010-03-12 13:56:41 0
  • This will send the web page at $u to recipient@example.com. To send the web page to oneself, recipient@example.com can be replaced by $(whoami). The "charset" is UTF-8 here, but any alternative charset of your choice would work. `wget -O - -o /dev/null $u` may be considered instead of `curl $u`; that variant is spelled out after this entry. On some systems the complete path to sendmail may be necessary, for instance /sys/pkg/libexec/sendmail/sendmail on some NetBSD systems.


    -1
    { u="http://twitter.com/commandlinefu"; echo "Subject: $u"; echo "Mime-Version: 1.0"; echo -e "Content-Type: text/html; charset=utf-8\n\n"; curl $u ; } | sendmail recipient@example.com
    pascalv · 2010-02-24 04:18:30 1
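
    The wget alternative mentioned above, spelled out in full:

        { u="http://twitter.com/commandlinefu"; echo "Subject: $u"; echo "Mime-Version: 1.0"; echo -e "Content-Type: text/html; charset=utf-8\n\n"; wget -O - -o /dev/null "$u"; } | sendmail recipient@example.com
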
  • Prompts the user for a username and password, which are then exported to http_proxy for use by wget, yum, etc. Default user, proxy host, and port values are set in the function. Using this keeps the cleartext username and password out of your bash_history and off the screen. A sample session is sketched after this entry.


    1
    set-proxy () { P=webproxy:1234; DU="fred"; read -p "username[$DU]:" USER; printf "%b"; UN=${USER:-$DU}; read -s -p "password:" PASS; printf "%b" "\n"; export http_proxy="http://${UN}:${PASS}@$P/"; export ftp_proxy="http://${UN}:${PASS}@$P/"; }
    shadycraig · 2010-02-04 13:12:59 2
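
    A sample session might look like this (the username typed at the prompt is made up); any wget or yum run afterwards in the same shell picks up the exported proxy settings:

        set-proxy
        # username[fred]: alice     <- prompted; "fred" is the built-in default
        # password:                 <- read silently
        wget http://example.com/some/file.iso
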
  • This is a minimalistic version of the ubiquitous Google definition screen scraper. This version was designed not only to run fast, but to work using BusyBox. BusyBox is a collection of basic Unix tools that have been compiled into a single binary to save space on tiny installations of Unix. For example, although my phone doesn't have perl or the GNU utilities, it does have BusyBox's stripped-down versions of wget, tr, and sed. It turns out that those tools suffice for many tasks.

    Known Bugs: This script does not handle HTML entities at all. I don't think there's an easy way to do that within BusyBox, but I'd love to see it if someone could do it (a partial sketch follows this entry). Also, this script can only define a single word, not phrases. (Well, you could if you typed in %20, but that'd be gross.) Lastly, this script does not show the URL where definitions were found. Given the randomness of the Net, that last bit of information is often key.


    0
    wget -q -U busybox -O- "http://www.google.com/search?ie=UTF8&q=define%3A$1" | tr '<' '\n' | sed -n 's/^li>\(.*\)/\1\n/p'
    hackerb9 · 2010-02-01 13:01:47 1
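
    As a partial answer to the entity bug: one more sed stage, still using nothing beyond BusyBox's sed, covers the most common entities. Only a sketch, and far from complete:

        wget -q -U busybox -O- "http://www.google.com/search?ie=UTF8&q=define%3A$1" | tr '<' '\n' | sed -n 's/^li>\(.*\)/\1\n/p' | sed "s/&amp;/\&/g;s/&lt;/</g;s/&gt;/>/g;s/&quot;/\"/g;s/&#39;/'/g"
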
  • Only the ImageMagick package needs to be installed. Displays an xkcd comic with its title and saves it in the /tmp directory. If you prefer to view the newest xkcd, use this command:

        wget -q http://xkcd.com/ -O-| sed -n '/<img src="http:\/\/imgs.xkcd.com\/comics/{s/.*\(http:.*\)" t.*/\1/;p}' | awk '{system ("wget -q " $1 " -O- | display -title $(basename " $1") -write /tmp/$(basename " $1")");}'


    0
    wget -q http://dynamic.xkcd.com/comic/random/ -O-| sed -n '/<img src="http:\/\/imgs.xkcd.com\/comics/{s/.*\(http:.*\)" t.*/\1/;p}' | awk '{system ("wget -q " $1 " -O- | display -title $(basename " $1") -write /tmp/$(basename " $1")");}'
    laugg · 2009-12-09 13:41:25 1