OmegaT 4.3.1 - User's Guide

How To...

Set up a Team Project

Setting up a team project requires some knowledge of servers and the SVN or Git versioning systems. It should thus be carried out by a project manager, a project leader or a localisation engineer.

As plenty of information about SVN and Git is easily available, we won't describe how they work here, but only how OmegaT works with them.

Step 1: Create an empty project on a server

Creating an empty project on a server

  1. Create an SVN or Git repository on a server that will be accessible by the translators.

  2. Create a local copy of the repository (check out with SVN, clone with Git).

  3. Create a new, empty OmegaT project in the local repository. This can be done in two ways:

    • Project > New...

    • on the command line: java -jar OmegaT.jar team init [lang1] [lang2]

  4. Add the new OmegaT project to the versioning system (add with SVN and Git).

    Note: If the project was created with the command line in step 3, this step has already been done by the program.

  5. Publish the new OmegaT project on the server (commit with SVN, commit followed by push with Git).

Specific parameters

If the project uses specific filters and segmentation parameters, both the filters.xml and segmentation.conf files must be added to the versioning system and published on the server.

Step 2: Add files to translate and other resources

Use an SVN or Git client to add the files to translate.

This can also be done within OmegaT:

  1. copy the files to the source folder

  2. use Project > Commit Source Files

To add other resources (dictionaries, TMXs or glossaries), use an SVN or Git client.

To delete files, use an SVN or Git client.

Note that only two files are modified by OmegaT during translation:

  • omegat/project_save.tmx

  • glossary/glossary.txt

All other files are read-only. If the translator attempts to modify them, they will go back to their original state whenever the project is opened, closed, saved or reloaded.

Step 3: Send an invitation to the translator

Once the project is set up on the server, the project manager can invite the translator to work on it in either of two different ways:

  • sending the URL of the project and asking the translator to create a local copy with Project > Download Team Project.

  • sending an omegat.project file containing a reference to the URL and asking the translator to copy it into a dedicated folder and open it with OmegaT.

    The reference to the URL is specified as below (here to a Git repository):

    <repositories>
     <repository type="git" url="https://repo_for_OmegaT_team_project.git">
      <mapping local="" repository=""/>
     </repository>
    </repositories>

In both cases, the project manager must send the translator their ID and password to access the repository.
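The repository reference shown above can also be generated rather than typed by hand. The following is a minimal sketch, assuming Python 3.9 or later; the helper name build_repositories is hypothetical and not part of OmegaT, and the resulting fragment would still need to be placed in the omegat.project file manually.

```python
import xml.etree.ElementTree as ET


def build_repositories(repo_type: str, url: str, local: str = "", remote: str = "") -> str:
    """Return a <repositories> fragment holding one repository with one mapping.

    Hypothetical helper for illustration only; it mirrors the fragment
    shown in the guide and does not touch any OmegaT file.
    """
    repositories = ET.Element("repositories")
    repository = ET.SubElement(repositories, "repository", {"type": repo_type, "url": url})
    ET.SubElement(repository, "mapping", {"local": local, "repository": remote})
    ET.indent(repositories, space=" ")  # pretty-print; requires Python 3.9+
    return ET.tostring(repositories, encoding="unicode")


# Default mapping for a Git repository, using the placeholder URL from the guide.
snippet = build_repositories("git", "https://repo_for_OmegaT_team_project.git")
print(snippet)
```

Empty local and repository attributes correspond to the default mapping, where the whole repository is mapped to the root of the local project.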

Checking statistics

The Project Manager should check with the translator that the statistics are identical on both sides (server side and translator side).

If there are any differences, check that the filters.xml and segmentation.conf files are under version control.

Special case: selective sharing

The process above describes the usual case, where the project manager wants to have full control of the project and where the files (and the statistics) are identical on both sides (server side and translator side).

OmegaT team projects can also be set up in a different way, where several translators share the project_save.tmx file but not (all of) the (source) files.

In this case, the procedure is the same, but the project manager does not add (all) files to the version-controlled project. Instead, the translators copy the files themselves, or add mappings to synchronize files from other locations.

The mappings can be added via the UI (Project > Properties > Repository Mapping) or by editing omegat.project.

Mapping parameters

repository type

This can be either http (which includes https), svn, git or file.

repository url

Remote location or directory of the files to translate.

mapping local

Name of the local folder or file, relative to the root of the OmegaT project.

mapping repository

Name of the remote folder or file, relative to the repository url.

excludes

Add patterns using wildcards (Apache Ant style): *, ?, **. Separate different patterns with a semicolon.

Example: **/excludedfolder/**;*.txt excludes files that have /excludedfolder/ in the path, and files with .txt extension.

includes

As above.

Example: **/*.docx adds all .docx files, wherever they are located in the project, even in excluded folders.

By default, all files that are not excluded are included. You only need to specify this to override some exclusions.
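The interaction between excludes and includes can be sketched as follows. This is an illustration of the rule described above (includes override excludes, and everything else is included by default), not OmegaT's actual matcher. The glob translation is a simplified approximation of Ant-style patterns, and the function names are hypothetical. In particular, this sketch treats a bare pattern such as *.txt as matching only at the top level, which may differ from OmegaT's behaviour.

```python
import re


def ant_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified Ant-style pattern into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")     # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")      # exactly one character within a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_synchronised(path: str, excludes: str = "", includes: str = "") -> bool:
    """Apply semicolon-separated excludes, then let includes override them."""
    def matches(patterns: str) -> bool:
        return any(ant_to_regex(p).fullmatch(path) for p in patterns.split(";") if p)

    if matches(includes):
        return True               # an include overrides any exclusion
    return not matches(excludes)  # otherwise, keep whatever is not excluded


# Patterns taken from the examples in this section.
for name in ["docs/guide.odt", "a/excludedfolder/b.odt", "a/excludedfolder/c.docx"]:
    print(name, is_synchronised(name, "**/excludedfolder/**", "**/*.docx"))
```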

Example mappings

Default project mapping:

<repositories>
 <repository type="svn" url="https://repo_for_OmegaT_team_project">
  <mapping local="" repository=""/>
 </repository>
</repositories>

All the contents of https://repo_for_OmegaT_team_project are mapped to the local OmegaT project.

Mapping for projects in a subdirectory of the repository:

<repositories>
 <repository type="svn" url="https://repo_for_All_OmegaT_team_projects">
  <mapping local="" repository="En-US_DE_project"/>
 </repository>
</repositories>

All the contents of https://repo_for_All_OmegaT_team_projects/En-US_DE_project are mapped to the local OmegaT project.

Mapping for additional sources from remote repository, with filters:

<repositories>
 <repository type="svn" url="https://repo_for_All_OmegaT_team_project_sources">
  <mapping local="source/subdir" repository="">
   <excludes>**/*.bak</excludes>
   <includes>readme.bak</includes>
  </mapping>
 </repository>
</repositories>

All the contents of https://repo_for_All_OmegaT_team_project_sources are mapped to the source/subdir folder of the local OmegaT project, except for *.bak files, which are all excluded apart from readme.bak.

Mapping for extra source files from the web:

<repository type="http" url="https://github.com/omegat-org/omegat/raw/master/">
 <mapping local="source/Bundle.properties" repository="src/org/omegat/Bundle.properties"/>
</repository>

The remote file https://github.com/omegat-org/omegat/raw/master/src/org/omegat/Bundle.properties is mapped to the local file source/Bundle.properties.

Mapping with renaming:

<repository type="http" url="https://github.com/omegat-org/omegat/raw/master/">
 <mapping local="source/readme_tr.txt" repository="release/readme.txt"/>
</repository>

The remote file https://github.com/omegat-org/omegat/raw/master/release/readme.txt is mapped to the local file source/readme_tr.txt.

This makes it possible to rename the file to be translated.
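The way a mapping combines the repository url with the repository attribute can be sketched as below. This is only an illustration of the rule stated in the mapping parameters (the remote name is relative to the repository url). The function name resolve_mappings is hypothetical, and the simple string join is an assumption about how the two parts are combined.

```python
import xml.etree.ElementTree as ET

# The "mapping with renaming" example from this section.
FRAGMENT = """
<repository type="http" url="https://github.com/omegat-org/omegat/raw/master/">
 <mapping local="source/readme_tr.txt" repository="release/readme.txt"/>
</repository>
"""


def resolve_mappings(fragment: str):
    """Return (remote location, local path) pairs for each mapping of one repository."""
    repository = ET.fromstring(fragment.strip())
    base = repository.get("url").rstrip("/")
    pairs = []
    for mapping in repository.findall("mapping"):
        remote = mapping.get("repository")
        # An empty repository attribute refers to the repository url itself.
        location = f"{base}/{remote}" if remote else base
        pairs.append((location, mapping.get("local")))
    return pairs


for remote, local in resolve_mappings(FRAGMENT):
    print(f"{remote} -> {local}")
```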

Local file mapping:

<repository type="file" url="/home/me/myfiles">
 <mapping local="source/file.txt" repository="my/file.txt"/>
 <mapping local="source/file2.txt" repository="some/file.txt"/>
</repository>

The local file /home/me/myfiles/my/file.txt is mapped to the local file source/file.txt and /home/me/myfiles/some/file.txt is mapped to the local file source/file2.txt.

Warning: if a file does not exist, the project won't load.

You can add as many mappings as you want, but only in the context of a team project, i.e. when one of the mappings includes omegat.project. This feature is intended for gathering source files, but you're not restricted to source files.

Note about omegat.project and mappings

When you create a new project and commit that to a repository, the omegat.project file doesn't contain any mapping. When you download the project, the project is converted to a team project locally and a default mapping is added.

Note that the omegat.project file in the repository is not changed automatically and still doesn't contain the mappings. When you load a project, all changes in the repository are copied to the local project, including the omegat.project file with project settings. If the file doesn't contain mappings, the existing local mappings are re-applied, and only those. All other local changes to the project are reverted. If the file does contain mappings, then local changes to mappings are lost.

Use a Team Project

OmegaT team projects must first be set up on a server.

To use a team project for the first time, follow the procedure provided by the project manager.

Once the project has been opened, translation proceeds in the same way as for a non-team project, except for the following points.

Automatic saving

Every 3 minutes (by default), the local project is synchronised with the remote repository so that the project manager or other translators can see and use translations added during that period.

The interval of 3 minutes can be changed in Options > Preferences > Saving and Output.

Synchronised files

Whenever the project is automatically saved, but also when it is opened, closed and reloaded, only two files are actually synchronised:

  • omegat/project_save.tmx

  • glossary/glossary.txt

All other files will be replaced by the files in the remote repository.

Adding new source files

To add a new source file:

  1. copy the files to the source folder

  2. use Project > Commit Source Files

Existing source files can be modified, but the commit operation must be carried out before an automatic save and before the project is reloaded or closed.

Deleting source files

Files must be deleted by the project manager.

Changing segmentation rules or file filters

Project parameters must be changed by the project manager.

Working offline

A team project can be opened and translated offline. All the changes will be synchronised the next time a connection is available.

To work offline:

  • Disconnect from the network before opening the project,

  • or open the project with the command line using the --no-team option.

Reuse Translation Memories

Initially, that is, when the project is created, the main TM of the project, project_save.tmx, is empty. This TM is gradually filled during translation. To speed up this process, existing translations can be reused. If a given sentence has already been translated once, and translated correctly, there is no need for it to be retranslated. Translation memories may also contain reference translations: multinational legislation, such as that of the European Community, is a typical example.

When you create the target documents in an OmegaT project, the translation memory of the project is output in the form of three files in the root folder of your OmegaT project (see the above description). You can regard these three tmx files (-omegat.tmx, -level1.tmx and -level2.tmx) as an "export translation memory", i.e. as an export of your current project's content in bilingual form.

Should you wish to reuse a translation memory from a previous project (for example because the new project is similar to the previous project, or uses terminology which might have been used before), you can use these translation memories as "input translation memories", i.e. for import into your new project. In this case, place the translation memories you wish to use in the tm or tm/auto folder of your new project: in the former case you will get hits from these translation memories in the fuzzy matches viewer, and in the latter case these TMs will be used to pre-translate your source text.

By default, the tm folder is below the project's root folder (e.g. ...MyProject/tm), but you can choose a different folder in the project properties dialog if you wish. This is useful if you frequently use translation memories produced in the past, for example because they are on the same subject or for the same customer. In this case, a useful procedure would be:

  • Create a folder (a "repository folder") in a convenient location on your hard drive for the translation memories for a particular customer or subject.

  • Whenever you finish a project, copy one of the three "export" translation memory files from the root folder of the project to the repository folder.

  • When you begin a new project on the same subject or for the same customer, navigate to the repository folder in the Project > Properties > Edit Project dialog and select it as the translation memory folder.
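The second step of the procedure above (copying an export translation memory to the repository folder) could be scripted along these lines. This is a sketch under assumptions: the function name archive_export_tmx and the folder names used in the usage comment are hypothetical, and the export files are assumed to follow the ProjectName-omegat.tmx, ProjectName-level1.tmx and ProjectName-level2.tmx naming in the project root.

```python
import shutil
from pathlib import Path


def archive_export_tmx(project_dir, repository_dir, level="omegat"):
    """Copy the project's export TMX of the given kind into the repository folder.

    level is one of "omegat", "level1" or "level2" (the three export files).
    Returns the path of the copy.
    """
    project = Path(project_dir)
    candidates = sorted(project.glob(f"*-{level}.tmx"))
    if not candidates:
        raise FileNotFoundError(f"no *-{level}.tmx export found in {project}")
    repository = Path(repository_dir)
    repository.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the modification time, which helps when comparing TMs later.
    return Path(shutil.copy2(candidates[0], repository))


# Example call (hypothetical paths):
# archive_export_tmx("/home/me/projects/MyProject", "/home/me/tm_repository/customer_a")
```

Copying rather than moving leaves the project folder untouched, so the project can still be reopened afterwards.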

Note that all the tmx files in the translation memory folder are parsed when the project is opened, so putting every TM you have on hand into this folder may slow OmegaT down unnecessarily. You may even consider removing those that are no longer required, once you have used their contents to fill up the project_save.tmx file.

Importing and exporting translation memories

OmegaT supports imported tmx versions 1.1-1.4b (both level 1 and level 2). This enables the translation memories produced by other tools to be read by OmegaT. However, OmegaT does not fully support imported level 2 tmx files (these store not only the translation, but also the formatting). Level 2 tmx files will still be imported and their textual content can be seen in OmegaT, but the quality of fuzzy matches will be somewhat lower.

OmegaT follows very strict procedures when loading translation memory (tmx) files. If an error is found in such a file, OmegaT will indicate the position within the defective file at which the error is located.

Some tools are known to produce invalid tmx files under certain conditions. If you wish to use such files as reference translations in OmegaT, they must be repaired, or OmegaT will report an error and fail to load them. Fixes are trivial operations and OmegaT assists troubleshooting with the related error message. You can ask the user group for advice if you have problems.
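Before placing a third-party TMX file in a project, it can be pre-checked with any XML parser. The sketch below uses Python's standard library to report the line and column of the first well-formedness error. It is not OmegaT's own loader and does not validate against the TMX DTD, so a file that passes may still be rejected by OmegaT. The function name and the sample content are made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_well_formed(tmx_text: str):
    """Return None if the text parses as XML, else (line, column, message)."""
    try:
        ET.fromstring(tmx_text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based, column is 0-based
        return line, column, str(err)
    return None


# A deliberately broken sample: the <seg> element on line 4 is never closed.
BROKEN = """<tmx version="1.4">
 <body>
  <tu>
   <tuv xml:lang="en"><seg>Hello</tuv>
  </tu>
 </body>
</tmx>"""

problem = check_well_formed(BROKEN)
if problem:
    print("line %d, column %d: %s" % problem)
```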

OmegaT exports version 1.4 TMX files (both level 1 and level 2). The level 2 export is not fully compliant with the level 2 standard, but is sufficiently close and will generate correct matches in other translation memory tools supporting TMX Level 2. If you only need textual information (and not formatting information), use the level 1 file that OmegaT has created.

Creating a translation memory for selected documents

In case translators need to share their TMX bases while excluding some of their parts or including just translations of certain files, sharing the complete ProjectName-omegat.tmx is out of the question. The following recipe is just one of the possibilities, but it is simple to follow and poses no danger to the assets.

  • Create a project, separate from other projects, in the desired language pair, with an appropriate name. Note that the TMXs created will include this name.

  • Copy the documents you need the translation memory for into the source folder of the project.

  • Copy the translation memories containing the translations of the documents above into the tm/auto subfolder of the new project.

  • Open the project. Check for possible tag errors with Ctrl + T and untranslated segments with Ctrl + U. To check that everything is as expected, you may press Ctrl + D to create the target documents and check their contents.

  • When you exit the project, the TMX files in the main project folder (see above) contain the translations in the selected language pair for the files you have copied into the source folder. Copy them to a safe place for future reference.

  • To avoid reusing the project and thus possibly polluting future cases, delete the project folder or archive it away from your workplace.

Sharing translation memories

In cases where a team of translators is involved, translators will prefer to share common translation memories rather than distribute their local versions.

OmegaT interfaces to SVN and Git, two common team software versioning and revision control systems (RCS), available under an open source license. In the case of OmegaT, complete project folders (in other words, the translation memories involved as well as source folders, project settings, etc.) are managed by the selected RCS. See Set up a Team Project and Use a Team Project above for details.

Using TMX with alternative language

There may be cases where you have done a project with, for example, Dutch sources and an English translation. You then need a translation into, say, Chinese, but your translator does not understand Dutch; she does, however, understand English perfectly. In this case, the NL-EN translation memory can serve as a go-between to help generate the NL to ZH translation.

The solution in our example is to copy the existing translation memory into the tm/tmx2source/ subfolder and rename it to ZH_CN.tmx to indicate the target language of the tmx. The translator will be shown English translations for source segments in Dutch and use them to create the Chinese translation.

Important: the supporting TMX must be renamed XX_YY.tmx, where XX_YY is the target language of the tmx, for instance to ZH_CN.tmx in the example above. The project and TMX source languages should of course be identical - NL in our example. Note that only one TMX for a given language pair is possible, so if several translation memories should be involved, you will need to merge them all into the XX_YY.tmx.

Prevent data loss

OmegaT is a robust application. However, you should take precautions against data loss when using OmegaT, just as with any other application. When you translate your files, OmegaT stores all your progress in the translation memory project_save.tmx that resides in the project's omegat subfolder.

OmegaT also backs up the translation memory to project_save.tmx.YEARMMDDHHNN.bak in the same subfolder each time a project is opened or reloaded. YEAR is the 4-digit year, MM is the month, DD the day of the month, and HH and NN are the hours and minutes when the previous translation memory was saved.
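The backup naming scheme described above can be expressed as a small formatting and parsing pair, which is handy when sorting or inspecting backups in a script. This is a sketch; the function names are hypothetical.

```python
from datetime import datetime

PREFIX = "project_save.tmx."
SUFFIX = ".bak"


def backup_name(saved_at: datetime) -> str:
    """Build project_save.tmx.YEARMMDDHHNN.bak for the given save time."""
    return f"{PREFIX}{saved_at:%Y%m%d%H%M}{SUFFIX}"


def parse_backup_name(name: str) -> datetime:
    """Recover the save time encoded in a backup file name."""
    if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
        raise ValueError(f"not a project_save.tmx backup name: {name}")
    stamp = name[len(PREFIX):-len(SUFFIX)]
    return datetime.strptime(stamp, "%Y%m%d%H%M")


print(backup_name(datetime(2018, 3, 7, 9, 5)))
```

Because every field is zero-padded to a fixed width, sorting the file names alphabetically also sorts them chronologically.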

If you believe that you have lost translation data, you can use the following procedure to restore the project to its most recently saved state, usually no more than about 10 minutes old:

  1. close the project

  2. rename the current project_save.tmx file (e.g. to project_save.tmx.temporary)

  3. select the backup translation memory that is the most likely to contain the data you are looking for

  4. rename it project_save.tmx

  5. open the project
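Steps 2 to 4 of the procedure above could be automated roughly as follows. This is a sketch, not an OmegaT feature: the function name is hypothetical, it assumes the project is already closed, and it simply picks the most recent backup, whereas step 3 leaves that choice to you. It also copies the backup instead of renaming it, so the backup itself is preserved.

```python
import shutil
from pathlib import Path


def restore_latest_backup(project_dir):
    """Set aside the current project_save.tmx and restore the newest backup.

    Run this only while the project is closed in OmegaT.
    Returns the path of the backup that was restored.
    """
    omegat_dir = Path(project_dir) / "omegat"
    current = omegat_dir / "project_save.tmx"
    # Timestamps are fixed-width, so alphabetical order is chronological order.
    backups = sorted(omegat_dir.glob("project_save.tmx.*.bak"))
    if not backups:
        raise FileNotFoundError(f"no project_save.tmx.*.bak backup in {omegat_dir}")
    latest = backups[-1]
    if current.exists():
        # Step 2: keep the current file under another name instead of deleting it.
        current.rename(omegat_dir / "project_save.tmx.temporary")
    # Steps 3 and 4: the chosen backup becomes the new project_save.tmx.
    shutil.copy2(latest, current)
    return latest


# Example call (hypothetical path):
# restore_latest_backup("/home/me/projects/MyProject")
```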

To avoid losing important data:

  • Make regular copies of the file /omegat/project_save.tmx to backup media, such as CD or DVD.

  • Until you are familiar with OmegaT, create translated files at regular intervals and check that the translated file contains the latest version of your translation.

  • Take particular care when making changes to the files in source while in the middle of a project. If the source file is modified after you have begun translating, OmegaT may be unable to find a segment that you have already translated.

  • Use these Help texts to get started. Should you run into problems, post a message in the OmegaT user group. Do not hesitate to post in the language you feel the most familiar with.

Translate a PDF file

PDF files are a special case. They contain text formatting information, but such information cannot be reused by OmegaT in order to create target files. Thus, PDF files are handled as plain text files, and output files are plain text files.

If you need to reproduce text formatting (as well as other things such as drawings) in your translation, there are three ways to try:

  1. Use OmegaT’s default filter (PDF input), translate, create a target file (it will be a plain text file), add relevant formatting and items manually.

  2. Use the Iceni Infix filter. See Howto - Translating PDF files with Iceni Infix and OmegaT.

  3. Import the source file to LibreOffice Draw, save it as an ODG file, translate, export to PDF as needed.

Note: the above information applies only to PDF files with a text layer. If you have a PDF file made of scanned pages (sometimes such files are referred to as ‘dead’ PDFs), you need to use an OCR (optical character recognition) program to recognize the text and convert it to a format that can be handled by OmegaT.

Other file formats

Other plain text or formatted text file formats suitable for processing in OmegaT may also exist.

External tools can be used to convert files to supported formats. The translated files will then need to be converted back to the original format. For example, if you have an outdated Microsoft Word version that does not handle the ODT format, here is a round trip for Word files with the DOC extension:

  • import the file into ODF writer

  • save the file in ODT format

  • translate it into the target ODT file

  • load the target file in ODF writer

  • save the file as a DOC file

The quality of formatting of the translated file will depend on the quality of the round-trip conversion. Before proceeding with such conversions, be sure to test all options. Check the OmegaT home page for an up-to-date listing of auxiliary translation tools.

Manage Right-To-Left languages

Justification of source and target segments depends upon the project languages. By default, left justification is used for Left-To-Right (LTR) languages and right justification for Right-To-Left (RTL) languages. You can toggle between different display modes by pressing Shift + Ctrl + O (this is the letter O and not the numeral 0). The Shift + Ctrl + O toggle has three states:

  • default justification, that is as defined by the language

  • left justification

  • right justification

Using the RTL mode in OmegaT has no influence whatsoever on the display mode of the translated documents created in OmegaT. The display mode of the translated documents must be modified within the application (such as Microsoft Word) commonly used to display or modify them (check the relevant manuals for details). Using Shift + Ctrl + O causes both text input and display in OmegaT to change. It can be used separately for all three panes (Editor, Fuzzy Matches and Glossary) by clicking on the pane and toggling the display mode. It can also be used in all the input fields found in OmegaT - in the search window, for segmentation rules etc.

Note for Mac OS X users: use the Shift + Ctrl + O shortcut and not cmd + Ctrl + O.

Mixing RTL and LTR strings in segments

When writing purely RTL text, the default (LTR) view may be used. In many cases, however, it is necessary to embed LTR text in RTL text. Examples include OmegaT tags, product names that must be left in the LTR source language, placeholders in localization files, and numbers in text. In cases like these it becomes necessary to switch to RTL mode, so that the RTL (in fact bidirectional) text is displayed correctly. It should be noted that when OmegaT is in RTL mode, both source and target are displayed in RTL mode. This means that if the source language is LTR and the target language is RTL, or vice versa, it may be necessary to toggle back and forth between RTL and LTR modes to view the source and enter the target easily in their respective modes.

OmegaT tags in RTL segments

As stated above, OmegaT tags are LTR strings. When translating between RTL and LTR languages, correctly reading the tags from the source and entering them properly in the target may require the translator to toggle between LTR and RTL modes numerous times.

If the document allows, the translator is strongly encouraged to remove style information from the original document so that as few tags as possible appear in the OmegaT interface. Follow the indications given in Hints for tags management. Frequently validate tags (see Tag validation) and produce translated documents (see below and Menu) at regular intervals to make it easier to catch any problems that arise. A hint: translating a plain text version of the text and adding the necessary style in the relevant application at a later stage may turn out to be less hassle.

Creating translated RTL documents

When the translated document is created, its display direction will be the same as that of the original document. If the original document was LTR, the display direction of the target document must be changed manually to RTL in its viewing application. Each output format has specific ways of dealing with RTL display; check the relevant application manuals for details.

For .docx files, however, a number of changes are made automatically:

  • Paragraphs, sections and tables are set to bidi
  • Runs (text elements) are set to RTL

To avoid changing the target files' display parameters each time the files are opened, it may be possible to change the source file's display parameters so that they are inherited by the target files. Such modifications are possible in ODF files, for example.