Convert camelCase to underscores (camel_case)

sed -r 's/([a-z]+)([A-Z][a-z]+)/\1_\l\2/g' file.txt
Useful for converting identifiers from someone else's camelCase coding style to your own all-lowercase style with underscores.
Sample Output
$ cat file.txt
$ sed -r 's/([a-z]+)([A-Z][a-z]+)/\1_\l\2/g' file.txt
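As a rough illustration of what the command does, the sketch below wraps it in a shell function and feeds it two identifiers. The function name camel_to_snake_simple is hypothetical, the LC_ALL=C prefix is an addition that keeps [a-z] and [A-Z] ASCII-only, and GNU sed is assumed, since -r and \l (lowercase the next character) are GNU extensions.

```shell
# Hypothetical wrapper around the one-liner; reads stdin, writes stdout.
# Assumes GNU sed: -r and \l are GNU extensions.
camel_to_snake_simple() {
    # LC_ALL=C (an addition) pins the bracket ranges to plain ASCII.
    LC_ALL=C sed -r 's/([a-z]+)([A-Z][a-z]+)/\1_\l\2/g'
}

printf '%s\n' camelCase camelCaseLong | camel_to_snake_simple
# prints:
#   camel_case
#   camel_caseLong
```

The second identifier is only partly converted: once the first match has consumed "camelCase", no lowercase run is left in front of "Long", so the pattern cannot match again.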

By: atoponce
2009-04-28 22:44:45

What Others Think

A great one. Thanks. You missed the underscore though, this should be:
sed -r 's/([a-z]+)([A-Z][a-z]+)/\1_\l\2/g' file.txt
AmirWatad · 569 weeks and 5 days ago
Fixed. I had it in the sample output, but must have missed it in the command itself. Thanks.
atoponce · 569 weeks and 5 days ago
Good job
kaedenn · 569 weeks and 5 days ago
Where's the reverse? :-)
furicle · 569 weeks and 5 days ago
btw, it's good for names like camelCase, but not for camelCaseLong :)
AmirWatad · 569 weeks and 5 days ago
I think this is a little more robust. It converts CamelCaseWord or camelCaseWord to camel_case_word (the last pipe is needed to handle the CamelCaseWord case):
sed 's/\([A-Z]\)/_\l\1/g' file.txt | sed 's/^_\([a-z]\)/\1/g'
AmirWatad · 569 weeks and 5 days ago
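A minimal sketch of the two-stage pipeline from the comment above, wrapped in a function so it can be tried on a few inputs. The name camel_to_snake is hypothetical, LC_ALL=C is an addition for predictable [A-Z] and [a-z] ranges, and GNU sed is assumed because \l is a GNU extension.

```shell
# Hypothetical wrapper; reads stdin, writes stdout. Assumes GNU sed (\l).
camel_to_snake() {
    # Stage 1: put "_" before every capital and lowercase that capital.
    # Stage 2: drop the "_" that stage 1 leaves at the start of a line.
    LC_ALL=C sed 's/\([A-Z]\)/_\l\1/g' | LC_ALL=C sed 's/^_\([a-z]\)/\1/g'
}

printf '%s\n' CamelCaseWord camelCaseWord | camel_to_snake
# prints:
#   camel_case_word
#   camel_case_word
```

Because the second stage is anchored with ^, only a leading underscore at the start of a line is removed. A capitalized word later on the line keeps its underscore, so "foo Bar" becomes "foo _bar".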
And this is the reverse.
camel_case_word to camelCaseWord:
sed 's/_\([a-z]\)/\u\1/g' file.txt
camel_case_word to CamelCaseWord:
sed 's/_\([a-z]\)/\u\1/g' file.txt | sed 's/^\([a-z]\)/\u\1/g'
AmirWatad · 569 weeks and 5 days ago
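The reverse direction can be sketched the same way. The function names snake_to_camel and snake_to_pascal are hypothetical, LC_ALL=C is an addition, and \u (uppercase the next character) is assumed to be available, which again means GNU sed.

```shell
# Hypothetical wrappers; both read stdin and write stdout. Assume GNU sed (\u).
snake_to_camel() {
    # Replace each "_x" with "X".
    LC_ALL=C sed 's/_\([a-z]\)/\u\1/g'
}

snake_to_pascal() {
    # Same as above, then capitalize the first letter of each line.
    snake_to_camel | LC_ALL=C sed 's/^\([a-z]\)/\u\1/g'
}

printf '%s\n' camel_case_word | snake_to_camel    # prints: camelCaseWord
printf '%s\n' camel_case_word | snake_to_pascal   # prints: CamelCaseWord
```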
LC_COLLATE=C
LC_CTYPE=C
export LC_CTYPE LC_COLLATE
The exports above fix a problem where [a-z] is case-insensitive. Here's an explanation from the sed man page (GNU sed 4):
[a-z] is case insensitive
You are encountering problems with locales. POSIX mandates that [a-z] uses the current locale's collation order - in C parlance, that means using strcoll(3) instead of strcmp(3). Some locales have a case-insensitive collation order, others don't.
Another problem is that [a-z] tries to use collation symbols. This only happens if you are on the GNU system, using GNU libc's regular expression matcher instead of compiling the one supplied with GNU sed. In a Danish locale, for example, the regular expression ^[a-z]$ matches the string `aa', because this is a single collating symbol that comes after `a' and before `b'; `ll' behaves similarly in Spanish locales, or `ij' in Dutch locales.
To work around these problems, which may cause bugs in shell scripts, set the LC_COLLATE and LC_CTYPE environment variables to `C'.
Here's an example of a command line that was having problems due to the case-insensitive matching:
sed -r 's/([a-z]+)([A-Z][a-z]+)/\1_\l\2/g' invalid_name2.txt
Exporting LC_COLLATE and LC_CTYPE fixed the problem. It has to be done each time you log in, though. I'm hesitant to put this export into my .profile, as I'm not sure what it will do to the rest of the system programs.
Christian_Long · 509 weeks and 4 days ago
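One way to sidestep the worry about .profile, offered as a sketch rather than as the commenter's method, is to set the locale for a single sed invocation only. LC_ALL overrides both LC_COLLATE and LC_CTYPE, and a VAR=value prefix on a command applies to that command alone. The function name snake_in_c_locale and the sample input are hypothetical, and GNU sed is assumed.

```shell
# Hypothetical helper: run the conversion with the C locale for sed only.
# Assumes GNU sed (-r and \l are GNU extensions).
snake_in_c_locale() {
    # The assignment applies to this one sed process and nothing else.
    LC_ALL=C sed -r 's/([a-z]+)([A-Z][a-z]+)/\1_\l\2/g'
}

before="${LC_ALL-unset}"
printf '%s\n' invalidName | snake_in_c_locale    # prints: invalid_name
after="${LC_ALL-unset}"

# The calling shell's own setting is left as it was.
[ "$before" = "$after" ] && echo "LC_ALL unchanged in the calling shell"
```

Nothing is exported, so other programs started from the same shell keep the user's normal locale.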
sed -r 's/([^A-Z-])([A-Z])/\1_\2/g' file.txt
Replaces CamelCaseWord with Camel_Case_Word.
franek · 492 weeks and 3 days ago
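A short sketch of this variant, which keeps the capitals and only inserts underscores. The function name camel_to_underscored is hypothetical and LC_ALL=C is an addition; GNU sed is assumed for -r.

```shell
# Hypothetical wrapper; reads stdin, writes stdout. Assumes GNU sed (-r).
camel_to_underscored() {
    # Insert "_" between any character that is neither a capital nor "-"
    # and the capital that follows it. Case is left unchanged.
    LC_ALL=C sed -r 's/([^A-Z-])([A-Z])/\1_\2/g'
}

printf '%s\n' CamelCaseWord Foo-Bar | camel_to_underscored
# prints:
#   Camel_Case_Word
#   Foo-Bar
```

The hyphen inside the bracket expression means a capital directly after "-" is left alone, which is why Foo-Bar passes through unchanged.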
