Put uppercase letters in curly brackets in a BibTeX database

sed '/^\s*[^@%]/s=\([A-Z][A-Z]*\)\([^}A-Z]\|},$\)={\1}\2=g' literature.bib > output.bib
It is often recommended to enclose capital letters in a BibTeX file in braces, so that the letters are not converted to lower case when the file is used from LaTeX. This is an attempt to apply this rule to a whole BibTeX database file.

DO NOT USE sed '...' input.bib > input.bib, as it will empty the file!

How it works:

/^\s*[^@%]/
Apply the search-and-replace rule only to lines that start (^) with zero or more whitespace characters (\s*), followed by any character ([...]) that is *NOT* a "@" or a "%" (^@%). Note that \s* may also match fewer characters than are present, so an indented "@" or "%" line still passes this filter; only lines with "@" or "%" in the very first column are skipped.

s=<some stuff>=<other stuff>=g
Search (s) for some stuff and replace it by other stuff. Do that globally (g) for all matches in each processed line.

\([A-Z][A-Z]*\)\([^}A-Z]\|},$\)
Matches at least one uppercase letter ([A-Z][A-Z]*) followed by a character that is EITHER not "}" and not a capital letter ([^}A-Z]) OR (\|) actually IS a "}", which is followed by "," at the end of the line ($). Putting regular expressions in escaped parentheses (\( and \), respectively) makes it possible to refer back to the matched string later.

{\1}\2
Replace the matched string by "{", followed by part 1 of the matched string (\1), followed by "}", followed by the second part of the matched string (\2).

I have only tried this with GNU sed, version 4.2.1.
Sample Output
  author = {{D}onald {D}uck and {D}aisy {D}uck and {S}crooge {M}c{D}uck},
  title = {{T}he danger of magnetic storms for {D}uckburg},
  journal = {{D}uckburg {T}ribune},
  year = {2006},
  volume = {56},
  pages = {386--394},
  number = {2},
  month = {{A}ug},
  owner = {{D}ucky {D}uke}
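
To make the warning about the output file concrete, here is a minimal sketch that runs the command on a throwaway copy and writes the result to a separate file. The scratch directory and the file names sample.bib and sample.braced.bib are made up for illustration, and GNU sed is assumed, as in the original command.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)

cat > "$workdir/sample.bib" <<'EOF'
@ARTICLE{Duck2006,
  author = {Donald Duck and Daisy Duck},
  title = {The danger of magnetic storms for Duckburg},
  year = {2006},
}
EOF

# Write to a *different* file: redirecting onto the input would truncate it
# before sed gets a chance to read it.
sed '/^\s*[^@%]/s=\([A-Z][A-Z]*\)\([^}A-Z]\|},$\)={\1}\2=g' \
    "$workdir/sample.bib" > "$workdir/sample.braced.bib"

# Inspect the result before replacing the original by hand.
cat "$workdir/sample.braced.bib"
```

If the result looks right, the braced file can be moved over the original afterwards. GNU sed also has an -i option for in-place editing, which accepts a backup suffix (for example -i.bak) if you prefer to keep the old version next to the new one.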

2013-01-15 22:24:17

What Others Think

Note to self: the regex after the "\|" (OR) in the rule [^}A-Z]\|},$ was designed to match capitals as in
  title = {an upper case letter A},
but it has the disadvantage that it also matches A and B here:
  abstract = { lorem ipsum A, dolor sit amet wtf C consectetur adipiscing elit B, ... }
How to fix...?
michelsberg · 464 weeks ago
ARGH... I mean, the regex (accidentally) matches even this:
  abstract = { lorem ipsum {A}, dolor sit amet wtf {C} consectetur adipiscing elit {B}, ... }
So after applying sed we get:
  abstract = { lorem ipsum {{A}}, dolor sit amet wtf {C} consectetur adipiscing elit {{B}}, ... }
michelsberg · 464 weeks ago
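
One possible way around the double bracing, offered as an untested-in-the-wild sketch and not as the author's fix: split the single rule into three, so that the "}," case at the end of a line only fires when the capitals are not already preceded by "{". The function name brace_caps is made up here, "/" is used as the delimiter because one pattern contains "=", and GNU sed is assumed as before.

```shell
#!/bin/sh
# Sketch only, assuming GNU sed (\s is a GNU extension).
# Rule 1: capitals followed by anything except "}" or another capital.
# Rule 2: capitals directly before the closing "}," at the end of the line,
#         but only if they are NOT already preceded by "{".
# Rule 3: a field value that consists of capitals only, e.g. title = {NASA},
brace_caps() {
    sed -e '/^\s*[^@%]/{' \
        -e 's/\([A-Z][A-Z]*\)\([^}A-Z]\)/{\1}\2/g' \
        -e 's/\([^{A-Z]\)\([A-Z][A-Z]*\)},$/\1{\2}},/' \
        -e 's/\(= *{\)\([A-Z][A-Z]*\)},$/\1{\2}},/' \
        -e '}'
}

# Made-up sample lines covering the cases from the comments above.
printf '%s\n' \
    '  title = {an upper case letter A},' \
    '  lorem ipsum {A},' \
    '  wtf {C} consectetur' \
    '  title = {NASA},' | brace_caps
```

With these rules an already braced {A}, at the end of a line is left alone, while a bare A right before the closing "}," is still wrapped. The trade-off is that a line such as title = {A}, cannot be told apart from an already braced capital, so rule 3 always treats it as a field value and adds braces.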
