Url Encode

uri_escape(){ echo -E "$@" | sed 's/\\/\\\\/g;s/./&\n/g' | while read -r i; do echo $i | grep -q '[a-zA-Z0-9/.:?&=]' && echo -n "$i" || printf %%%x \'"$i" done }
2010-02-13 01:39:51
User: infinull
Functions: echo grep printf read sed

This one converts each character to its hex code and uses only shell and sed (you should probably still use the Python or Perl version).
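For comparison, the same hex-conversion idea can be sketched without sed or grep. This is only a minimal sketch, assuming bash: the function name urlencode_sketch is made up for illustration, and the set of characters left unescaped is copied from the grep pattern in the command above. It emits two-digit uppercase hex (%0A rather than %a), which is the form percent-encoding expects.

```bash
# Hypothetical helper, not part of the original post.
urlencode_sketch() {
  local LC_ALL=C                  # treat the string as bytes, not characters
  local s=$1 out='' c n i
  for (( i = 0; i < ${#s}; i++ )); do
    c=${s:i:1}
    case $c in
      [a-zA-Z0-9/.:?\&=])         # same "safe" set as the grep pattern above
        out+=$c ;;
      *)
        printf -v n '%d' "'$c"    # numeric value of the byte
        # Mask to one byte in case the C library reports a high byte
        # as a larger value (an assumption, made to stay on the safe side).
        printf -v c '%%%02X' "$(( n & 255 ))"
        out+=$c ;;
    esac
  done
  printf '%s\n' "$out"
}
```

For example, urlencode_sketch 'http://example.com/?q=a b' prints http://example.com/?q=a%20b.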


Terminal - Alternatives
echo "$url" | perl -MURI::Escape -ne 'chomp;print uri_escape($_),"\n"'
2010-02-13 00:44:48
User: eightmillion
Functions: echo perl
Tags: perl

Converts reserved characters in a URI to their percent encoded counterparts.

Alternate python version:

echo "$url" | python3 -c 'import sys,urllib.parse;print(urllib.parse.quote(sys.stdin.read().strip()))'
echo "$@" | sed 's/ /%20/g;s/!/%21/g;s/"/%22/g;s/#/%23/g;s/\$/%24/g;s/\&/%26/g;s/'\''/%27/g;s/(/%28/g;s/)/%29/g;s/:/%3A/g'
od -An -w999 -t xC <<< "$1" | sed 's/[ ]\?\(c[23]\) \(..\)/%\1%\2/g;s/ /\\\\\x/g' | xargs echo -ne
2010-05-31 16:35:52
Functions: echo od sed xargs

It encodes only non-ASCII characters, since they are the only ones that UTF-8 and ISO-8859-1 (latin-1) do not read the same way.

It converts all UTF-8 two-byte symbols that start with

* C3 X (some Latin symbols, such as the extended-ASCII ones)

* C2 X (some punctuation symbols, such as the inverted exclamation mark)

to the escaped form that every parser understands when building URLs. I didn't encode spaces or the rest of the basic punctuation because, supposedly, they are coded identically (space as \x20, for example) in UTF-8, latin-1 and Windows cp1252, so they are read correctly.

Please feel free to correct me; the application I designed this function for works as expected under my assumption.

Note: I specify -w999 because I didn't find a flag that sets an unlimited width.

I assume a URL is very unlikely to exceed the de facto length of 255 characters (* 3 bytes max) = 765 bytes.
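The idea of escaping only the bytes outside basic ASCII can also be sketched by looping over od's output one byte at a time. This is a minimal sketch, assuming bash; the name encode_non_ascii_sketch is hypothetical. Because the loop splits od's output on whitespace, the output line width does not matter, so no -w value is needed; -v stops od from collapsing repeated lines into "*". Unlike the command above, it escapes every byte from 0x80 upward, not just the C2 and C3 pairs.

```bash
# Hypothetical helper, not part of the original post.
encode_non_ascii_sketch() {
  local hex ch out=''
  # od -An -v -tx1 lists every input byte as two hex digits, no offsets.
  for hex in $(printf '%s' "$1" | od -An -v -tx1); do
    if (( 0x$hex >= 0x80 )); then
      printf -v ch '%%%02X' "0x$hex"   # non-ASCII byte: percent-escape it
    else
      printf -v ch '%b' "\\x$hex"      # ASCII byte: turn the hex back into the character
    fi
    out+=$ch
  done
  printf '%s\n' "$out"
}
```

For example, encode_non_ascii_sketch 'café ok' prints caf%C3%A9 ok when the argument is UTF-8 encoded.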

$ php -r 'echo urlencode($argv[1]);' -- "$1"
2012-01-07 19:35:33
User: Kataklysmos

Returns URL Encoded string from input ($1).

I like what you've done here, but I wanted to see if I could do it with only bash. This is what I came up with:

uri_escape(){ local y;y="$@";echo -n ${y/\\/\\\\} | while read -n1;do [[ $REPLY =~ [A-Za-z] ]] && printf "$REPLY" || printf "%%%x" \'"$REPLY";done;echo;}

It's also quite a bit quicker. Your version also seems to be missing some semicolons. And when you use read with a named variable and it hits a space, it sets that variable to null, causing printf to output "%0" instead of "%20". You can avoid that, as I've done, by not supplying a variable to the read builtin.

Comment by eightmillion 270 weeks and 1 day ago
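
The difference described in the comment above comes from word splitting: with a named variable, bash's read strips IFS whitespace from what it read, so a lone space becomes an empty string. A minimal sketch, assuming bash (the three function names are hypothetical), that shows what each form stores for every character of its argument:

```bash
# Hypothetical demo helpers, not part of the original thread.

# Named variable, default IFS: a space is stripped and c ends up empty.
read_named() {
  printf '%s' "$1" | while read -r -n1 c; do printf '[%s]' "$c"; done
  echo
}

# Named variable, IFS emptied for the read call only: the space survives.
read_named_raw() {
  printf '%s' "$1" | while IFS= read -r -n1 c; do printf '[%s]' "$c"; done
  echo
}

# No variable name: read stores the character unmodified in REPLY.
read_reply() {
  printf '%s' "$1" | while read -r -n1; do printf '[%s]' "$REPLY"; done
  echo
}
```

read_named 'a b' prints [a][][b], while the other two print [a][ ][b].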

Yeah, it's actually 3 lines in my .zshrc, and copy/paste doesn't preserve the newlines. (I'm also using zsh, which might account for the differences in the "read" builtin that you noticed.)

Similar concept, but in zsh, using the [[ ]] test with =~ instead of piping each character through grep.

And I remembered the semicolons this time :-)

uri_escape(){ local y=$@:s/\\/\\\\/; for i in `seq 1 ${#y}`; do [[ "${y[i]}" =~ '[a-zA-Z0-9/.:?&=]' ]] && echo -n ${y[i]} || printf %%%x \'${y[i]}; done }
Comment by infinull 270 weeks and 1 day ago

I thought that you might be using a different shell. That new version does do the job.

Comment by eightmillion 270 weeks and 1 day ago

