Installing apertium on Debian

Installing apertium and lttoolbox

Both the apertium and lttoolbox packages must be installed.
root@feynman:~# apt-get install apertium lttoolbox

Selecting and installing language pairs

Language pairs can be searched for and installed with the usual APT tools.
root@feynman:~# apt-cache search apertium language pair
apertium - Shallow-transfer machine translation engine
apertium-es-pt - Apertium language pair: Spanish<->Portuguese
apertium-fr-ca - Apertium language pair: French<->Catalan
apertium-es-ca - Apertium language pair: Spanish<->Catalan

root@feynman:~# apt-get install apertium-fr-ca
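
The package names in the search output above follow the pattern "apertium-<source>-<target>". As a minimal sketch, assuming that naming pattern holds for every pair package, the pair codes can be extracted from the search output. The function name pairs_from_search is hypothetical and not part of apertium or APT.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with apertium or APT).
# Reads `apt-cache search` output on stdin and prints one pair code
# per line, e.g. "fr-ca" for the package "apertium-fr-ca".
# Assumes pair packages are named apertium-<letters>-<letters>.
pairs_from_search() {
    sed -n 's/^apertium-\([a-z][a-z]*-[a-z][a-z]*\) - .*$/\1/p'
}
```

It would be used as "apt-cache search apertium language pair | pairs_from_search". The engine package itself ("apertium") is skipped because its name carries no pair code.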

Using apertium

Language pairs are installed into "/usr/share/apertium-1.0/pairs/". Create a file containing text in the source language:
spectre@feynman:~$ cat > /tmp/french
La traduction est le fait d'interpréter le sens d'un texte dans une langue (langue 
source, ou langue de départ), et de produire un texte ayant un sens et un effet équivalents 
sur un lecteur ayant une langue et une culture différentes (langue cible, ou langue d'arrivée).
Use "apertium-translator" to translate the text. The basic format of the command is as follows (for further instructions, see apertium-translator --help or man apertium-translator):
apertium-translator <path to language pair data> <translation direction> 
So for example, to translate our text from French to Catalan, we can use the following command:
spectre@feynman:~$ apertium-translator /usr/share/apertium-1.0/pairs/fr-ca fr-ca < /tmp/french
La traducció és el fet d'interpretar el sentit d'un text en una llengua (llengua font, 
o llengua de sortida), i de produir un text havent-hi un sentit i un efecte *équivalents 
sobre un lector havent-hi una llengua i una cultura diferents (llengua *cible, o llengua 
d'arribada).
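
The command format above can be wrapped in a small shell function so that the data path does not have to be typed each time. This is only a sketch: the names PAIRS_DIR, pair_path and translate_file are hypothetical, and the default directory is the Debian path given on this page.

```shell
#!/bin/sh
# Hypothetical wrapper around apertium-translator; not part of apertium.
# PAIRS_DIR is an assumed variable name, defaulting to the Debian path.
PAIRS_DIR="${PAIRS_DIR:-/usr/share/apertium-1.0/pairs}"

# Print the data directory of a language pair such as "fr-ca".
pair_path() {
    printf '%s/%s\n' "$PAIRS_DIR" "$1"
}

# translate_file <pair> <direction> <file>
# Runs: apertium-translator <path to pair data> <direction> < <file>
translate_file() {
    pair=$1
    direction=$2
    input=$3
    data=$(pair_path "$pair")
    # Fail early if the pair is not installed where we expect it.
    if [ ! -d "$data" ]; then
        echo "language pair data not found: $data" >&2
        return 1
    fi
    apertium-translator "$data" "$direction" < "$input"
}
```

With the fr-ca pair installed, "translate_file fr-ca fr-ca /tmp/french" runs the same command as in the example above.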

Common problems

Unsupported locale

The following error message indicates that an ISO-8859-1 compatible locale is not installed.
spectre@feynman:~$ apertium-translator /usr/share/apertium-1.0/pairs/fr-ca fr-ca < /tmp/french
Warning: unsupported locale, fallback to "C"
Warning: unsupported locale, fallback to "C"
Currently apertium uses the ISO-8859-1 encoding. If your Debian installation is not configured to enable this encoding, you can enable it by reconfiguring the "locales" package:
root@feynman:~# dpkg-reconfigure locales
Enable an ISO-8859-1 locale, such as "es_ES ISO-8859-1". The next menu asks you to set your default locale; you can leave this as it is. Press "ok", and your new locales will be generated. The output will look like this:
Generating locales (this might take a while)...
  en_GB.ISO-8859-1... done
  en_GB.UTF-8... done
  es_ES.ISO-8859-1... done
Generation complete.
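
To confirm that an ISO-8859-1 locale is now available, the list printed by "locale -a" can be filtered. The following is a rough sketch: latin1_locales is a hypothetical name, and the pattern assumes the charset appears as a suffix spelled either "iso88591" or "ISO-8859-1". Locales listed without a charset suffix are not matched.

```shell
#!/bin/sh
# Hypothetical helper: reads a locale list on stdin (one name per line)
# and prints only entries whose charset suffix is ISO-8859-1.
# The trailing "$" keeps ISO-8859-15 entries out of the result.
# Exit status is non-zero when no entry matches.
latin1_locales() {
    grep -i -E '\.iso-?8859-?1$'
}
```

It would be used as "locale -a | latin1_locales". If nothing is printed, no such locale was found and the reconfiguration step should be repeated.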

Nonsense characters

If nonsense characters appear in the translation, e.g. "La traducció és el fet de *interpré*ter ...", first check that the file you are trying to translate is in the right encoding. You can use "file" to check:
spectre@feynman:~$ file /tmp/french
/tmp/french: UTF-8 Unicode text
A UTF-8 file, as here, can produce nonsense characters because apertium expects ISO-8859-1 input. To convert the file to ISO-8859-1, use the GNU "iconv" program:
spectre@feynman:~$ iconv -f UTF-8 -t ISO_8859-1 /tmp/french > /tmp/french.txt
Run "file" again, to check:
spectre@feynman:~$ file /tmp/french.txt
/tmp/french.txt: ISO-8859 text
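
The conversion step can also be wrapped so that a failed conversion does not leave a half-written output file behind. This is a minimal sketch: to_latin1 is a hypothetical name, and it relies only on the "-f" and "-t" options of iconv used above.

```shell
#!/bin/sh
# Hypothetical helper: to_latin1 <utf8-file> <output-file>
# Converts UTF-8 input to ISO-8859-1. The result is written to a
# temporary file first and only moved into place if iconv succeeds.
to_latin1() {
    src=$1
    dst=$2
    tmp="$dst.tmp.$$"
    if iconv -f UTF-8 -t ISO-8859-1 "$src" > "$tmp"; then
        mv "$tmp" "$dst"
    else
        rm -f "$tmp"
        echo "cannot convert $src to ISO-8859-1" >&2
        return 1
    fi
}
```

For the example file this would be "to_latin1 /tmp/french /tmp/french.txt".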
If nonsense characters still appear, check that your terminal is set to the correct character encoding. In Gnome Terminal, this is done by going to "Terminal" -> "Set character encoding" and making sure that it is set to an ISO-8859-1 compatible setting. In PuTTY, go to "Change settings" -> "Translation" and make sure that it is set to "ISO-8859-1".