Most commercial translation memory applications now include an aligner as part of a suite of tools. Aligners are used to produce translation memory files (e.g. in the industry-standard TMX format) from legacy translations and their corresponding source files. The resulting file can then be used within a translation memory application, which provides easy access to the source/target segment pairs in the translation memory file.
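To make the output of an aligner concrete, here is a minimal sketch of how a list of already-aligned segment pairs might be written out as TMX using only the Python standard library. The function name `build_tmx`, the tool name in the header, and the language codes are assumptions chosen for illustration; none of the tools listed here necessarily works this way.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="de"):
    """Return a TMX 1.4 document (as a string) for (source, target) pairs.

    Hypothetical helper for illustration; header values are placeholders.
    """
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-aligner",  # placeholder tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plaintext",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        # One translation unit per aligned pair, one variant per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    print(build_tmx([("Hello.", "Hallo."), ("Thank you.", "Danke.")]))
```

Each `tu` element holds one source/target pair, which is what a translation memory application looks up when it offers matches.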
Although the advantage of aligning relevant legacy texts is obvious, a major drawback is that such tools generally align non-intelligently. Where the source and target texts have different numbers of segments (e.g. because the original translator merged two sentences into one in the translation), the resulting translation memory file is misaligned. Aligners generally provide an interface through which such misalignments can be corrected manually. This can be time-consuming, however, and is usually worthwhile only when the legacy translation is known to be useful, such as when the original source text has been modified and a new translation of it is required.
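The misalignment problem can be illustrated with a small sketch. It assumes the simplest possible aligner, one that pairs segments purely by position, and a hypothetical `merge_source_rows` helper standing in for the manual "merge cells" correction that aligner interfaces typically offer. Real aligners may use more sophisticated heuristics.

```python
from itertools import zip_longest


def align_by_position(source_segments, target_segments):
    """Pair segments strictly in order, padding the shorter side with ''."""
    return list(zip_longest(source_segments, target_segments, fillvalue=""))


def merge_source_rows(pairs, index):
    """Merge source cells index and index+1, shifting later sources up.

    Hypothetical stand-in for a manual correction in an aligner's interface.
    """
    sources = [s for s, _ in pairs]
    targets = [t for _, t in pairs]
    sources[index:index + 2] = [" ".join(sources[index:index + 2])]
    merged = zip_longest(sources, targets, fillvalue="")
    # Drop rows left completely empty by the shift.
    return [(s, t) for s, t in merged if s or t]


source = [
    "The printer is off.",
    "Please switch it on.",
    "Then wait for the signal.",
]
# The translator merged the first two sentences into one.
target = [
    "Der Drucker ist aus, bitte schalten Sie ihn ein.",
    "Warten Sie dann auf das Signal.",
]

pairs = align_by_position(source, target)
fixed = merge_source_rows(pairs, 0)

if __name__ == "__main__":
    for row in pairs:
        print("naive:", row)
    for row in fixed:
        print("fixed:", row)
```

In the naive result, every pair after the merged sentence is shifted by one, so the second source sentence is paired with the translation of the third. A single merge on the source side restores the correspondence for all later rows.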
Launched in January 2006, bitext2tmx is a Java application that aligns two plain-text files. It features a clean graphical user interface and self-explanatory functions for operations such as splitting and merging cells. bitext2tmx is distributed under the GNU General Public License and is the work of Susana Santos, with the support of other members of the team associated with Mikel L. Forcada.
LF Aligner was developed by Hungarian translator András Farkas. It is written mainly in Perl and also makes use of other open-source utilities such as hunalign and pdftotext.
LF Aligner is an interactive command-line tool. It promises "intelligent" sentence-level segmentation.
An alignment tool from Maxprograms, the company responsible for the Swordfish translation memory application, and the work of Rodolfo Raya, formerly a programmer for Heartsome. Stingray can align files in a wide range of formats. It currently costs €70, and a free, fully functional 30-day trial version is available.
A tool for "geometric mapping and alignment". Runs on Java and is licensed under the GPL. (With thanks to Patrick Hall for the tip.)
A simple Python script for creating a TMX file from two texts. Written by Dmitri Gabinski.
A simple Python script for creating a TMX file from two texts. Written by Didier Briel.
Links to more resources on alignment. (Thanks again to Patrick Hall.)