Apr 11, 2010

TextMate Command for Umlautification of LaTeX Sources

Before LaTeX gained solid support for UTF-8 encoded text, it was common practice in LaTeX sources to use a \" prefix to turn regular letters into umlauts. For example, the letter ‘ä’ is written as \"a in this notation.

The main advantage of this method is that it works with plain ASCII-encoded LaTeX source files; it also lets users type umlauts even if their keyboard has no umlaut keys. The drawback is that texts with many umlauts become very difficult to read, since the ubiquitous \" prefixes add a lot of visual noise.

In recent years, LaTeX’s support for UTF-8 encoded text has become very solid; it can be activated with the inputenc package like this:

\usepackage[utf8]{inputenc}

The following Ruby script converts LaTeX source files from the old notation into UTF-8 encoded umlauts, which are much more readable. The script is intended to be used as a command in the TextMate editor. To add the script to the LaTeX bundle, bring up the “Bundle Editor”, select the LaTeX bundle and add a “New Command” using the plus symbol below the bundle list. Choose “Selected Text or Document” as input, choose “Replace Selected Text” as output, and copy the script below into the “Command(s)” field. After reloading the bundle, the command will appear as a command of the LaTeX bundle, available through the “Gears” menu in TextMate’s status bar.


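#!/usr/bin/env ruby
# encoding: utf-8
# Ruby 1.8 only: treat strings as UTF-8 (has no effect on Ruby 1.9 and later).
$KCODE = 'U'

# TextMate passes the selected text (or the whole document) on STDIN.
text = STDIN.read

# Lowercase umlauts
text.gsub!(/\\"a/,'ä')
text.gsub!(/\\"o/,'ö')
text.gsub!(/\\"u/,'ü')

# Uppercase umlauts
text.gsub!(/\\"A/,'Ä')
text.gsub!(/\\"O/,'Ö')
text.gsub!(/\\"U/,'Ü')

# Sharp s
text.gsub!(/\\"s/,'ß')

# TextMate replaces the selection with whatever is written to STDOUT.
print text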
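
As a variation on the script above, the seven substitutions can also be driven by a single lookup table and one pass over the text. This is only a sketch: the names UMLAUTS and umlautify are made up for illustration and are not part of the original command, and it assumes the same backslash-quote sequences that the script's regular expressions match.

```ruby
# Hypothetical lookup table (not part of the original command): maps each
# backslash-quote sequence to its UTF-8 character. In a single-quoted Ruby
# string, \" stays a literal backslash followed by a double quote.
UMLAUTS = {
  '\"a' => 'ä', '\"o' => 'ö', '\"u' => 'ü',
  '\"A' => 'Ä', '\"O' => 'Ö', '\"U' => 'Ü',
  '\"s' => 'ß'
}.freeze

# Hypothetical helper: returns a copy of text in which every sequence
# listed in UMLAUTS is replaced by its UTF-8 character.
def umlautify(text)
  # The pattern matches a backslash, a double quote and one of the
  # seven letters; the block looks the match up in the table.
  text.gsub(/\\"[aouAOUs]/) { |match| UMLAUTS[match] }
end

sample = 'Die Bl\"atter f\"arben sich sch\"on.'
puts umlautify(sample)  # prints "Die Blätter färben sich schön."
```

To use this as the TextMate command instead, the last two lines would be replaced by print umlautify(STDIN.read). Sequences that are not in the table, such as \"e, are left untouched.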