OmegaT 6.0.1 - User Manual

Appendices

File Filters

Warning

File filters are either local and specific to a given project, or global and available to all the projects that share a configuration folder.

For details, see:

Filters in bold are used in the current project.

Disable a filter by unchecking its box if you prefer not to translate the files that are associated with it. Their contents will not be displayed for translation.

Note

You can sort the filters by name or by whether they are enabled. Click on the relevant header to sort them in ascending or descending order.

To modify the file extensions, target file name, and encodings associated with a filter, select it in the list and click the Edit... button.

Some filters provide an Options... button to further customize their settings.

Click the Restore Defaults button to reset the file filters to their default settings.

Modified global file filter preferences are saved in filters.xml, in the configuration folder. See Configuration Folder for details. Deleting that file also resets the filter preferences.

Modified local file filters are saved in the filters.xml file, located in the project folder. See the Project Folder chapter for details. Deleting that file also resets the filter preferences and reverts the project to global file filters.

Common preferences

Hide leading and trailing tags

Leading and trailing tags are generally required by OmegaT to properly recreate the translated segment. Hiding them from the translatable contents ensures that you will not erase or modify them by mistake.

If you keep the leading and trailing tags, make sure you also include them in the translated text.

Remove leading and trailing whitespace in non-segmented projects

By default, OmegaT removes any leading and trailing whitespace from the translatable contents. In non-segmented projects, disable this option to make leading and trailing whitespace modifiable in the translation.

Preserve spaces for all tags

If the source documents contain whitespace used to control the layout, enable this option so that this whitespace is retained in the translated document.

Do not use the file name to identify alternate translations

The source file name is one of the elements that characterize an alternative translation. If this option is checked, only the previous/next segments or a segment identifier will be used to characterize an alternative translation.

Segments with the same characteristics located in other files will be translated the same way.

Edit

Double-click the editable fields to make simple modifications or click on the Edit... button to access the modification dialog.

To add a filter pattern, click on Add... to open a similar dialog.

Both dialogs allow you to customize the filename patterns for the source and target files associated with the filter, and to select their respective encodings.

Use the Filename Variables drop-down menu to customize the target file name.

Source filename pattern

To associate a filter with a file, OmegaT checks the file extension and attempts to match it to one of the source filename patterns in a filter.

For example, the pattern *.xhtml registered in the XHTML filter matches any file with the xhtml extension. If such a file is found in the source folder, the file will be handled by the XHTML filter.

You can change or add filename patterns to associate different files to a filter.

Warning

Associating a file extension with a filter is not sufficient to have the filter properly handle the file. The file structure must also be compatible with the filter: even if you associate .odt with the XHTML filter, the filter will not be able to understand the contents of a LibreOffice Writer file.

Source filename patterns use wildcard characters: the * character matches zero or more characters, while the ? character matches exactly one character.

For example, use the pattern read* if you want to have the text filter handle readme files (readme, read.me, or readme.txt).
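The wildcard behaviour described above can be tried out with Python's standard fnmatch module. This is only an illustration under assumptions: OmegaT performs its own matching, fnmatch also understands [...] sets that the manual does not mention, and the names notes.txt, file1.txt, and file12.txt are hypothetical.

```python
from fnmatch import fnmatchcase

# "*" matches zero or more characters.
names = ["readme", "read.me", "readme.txt", "notes.txt"]
matched = [n for n in names if fnmatchcase(n, "read*")]

# "?" matches exactly one character in that position.
candidates = ["file1.txt", "file12.txt"]
one_char = [n for n in candidates if fnmatchcase(n, "file?.txt")]

print(matched)
print(one_char)
```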

Source and translated file encoding

Most file formats allow various possible encodings. By default, the encoding of the translated file is the same as that of the source file.

The source and target encoding fields use drop-down menus listing all supported encodings. Selecting the <auto> option leaves the choice of encoding to OmegaT, based on the following criteria:

  • OmegaT uses the encoding declaration in the source file, if present, to identify the encoding (HTML or XML based files).

  • OmegaT is instructed to use a mandatory encoding for certain file formats (Java properties, for example).

  • OmegaT uses the default encoding of the operating system for text files.

Translated filename

Files in the target folder are overwritten every time you create them if they are created with the same name.

OmegaT can automatically create new file names for the files you create, by adding a language code or a time stamp, for example.

The target filename pattern uses a special syntax. The easiest way to modify it is to use the Edit Pattern dialog. The dialog offers various options:

${filename}

The default pattern. It represents the complete filename of the source file, including the extension. Using this pattern assigns the translated file the exact same name as the source file.

${nameOnly}

name of the source file, without the extension

${extension}

original file extension

${targetLocale}

target language+region code (xx_YY)

${targetLanguage}

target language+region (xx-YY)

${targetLanguageCode}

target language code (xx)

${targetCountryCode}

target region code (YY)

${timestamp-????}

system time when the file was created

See the Oracle documentation for examples.

${system-os-name}

name of the operating system

${system-user-name}

user’s login name

${system-host-name}

host name on the system

${file-source-encoding}

encoding of the source file

${file-target-encoding}

encoding of the target file

${targetLocaleLCID}

Microsoft target locale

Additional variants are available for ${nameOnly} and ${extension}.

If the use of multiple periods makes identifying the file name and extension ambiguous, you can use variables of the form ${nameOnly- number } or ${extension- number} to specify which portions are part of the name or extension, as shown in the example below.

Example 26. Target file names

For a source file named Document.xx.docx, using the variable variants below will produce the following results:

  • ${nameOnly-0}: Document

  • ${nameOnly-1}: Document.xx

  • ${nameOnly-2}: Document.xx.docx

  • ${extension-0}: docx

  • ${extension-1}: xx.docx

  • ${extension-2}: Document.xx.docx
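The numbering in Example 26 can be read as "keep the first (or last) number + 1 dot-separated parts". The Python sketch below reproduces the six results under that assumption. The helpers name_only and extension are hypothetical and are not OmegaT code, and the manual does not say how OmegaT treats other inputs, such as names without a period.

```python
def name_only(filename, number):
    """Hypothetical helper: keep the first number + 1 dot-separated parts."""
    parts = filename.split(".")
    return ".".join(parts[: number + 1])


def extension(filename, number):
    """Hypothetical helper: keep the last number + 1 dot-separated parts."""
    parts = filename.split(".")
    return ".".join(parts[-(number + 1):])


source = "Document.xx.docx"
for n in range(3):
    print(f"nameOnly-{n}: {name_only(source, n)}")
    print(f"extension-{n}: {extension(source, n)}")
```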


Options

Several filters offer options. Select the filter in the list and click Options... to modify them.

The available options are:

Text files

Create paragraphs on:

Text files do not have generic paragraph markers. Choose here the way OmegaT creates paragraphs in your text files.

Line length in target files (0 = no limit)

Line length

Specifies the maximum number of characters before breaking a long line. A value of 0 sets no limit.

Maximum line length

Specifies the maximum number of characters before cutting a line and ignoring the rest. A value of 0 sets no limit.

Microsoft Office Open XML files

Warning

The Microsoft Office Open XML (legacy filter) is the original OmegaT filter. You should only use it to avoid compatibility issues with previous projects containing files you handled with that filter.

You can choose additional document elements to translate. They will appear as separate segments in the editor.

Word

Non-visible instruction text, comments, footnotes, endnotes, footers, duplicate fallback text, and document properties.

Excel

Comments and sheet names.

PowerPoint

Slide comments, slide masters, and slide layouts.

Global

External links, charts, diagrams, drawings, and WordArt.

Other Options:
Aggregate tags

Tags that do not enclose translatable text will be aggregated into a single tag.

Preserve spaces for all tags

Whitespace (i.e., spaces and newlines) will be preserved, even if this option is not defined in the document.

Start a new paragraph on Word soft-returns

Enable this option if soft-returns are intended to be paragraph starters.

XHTML Files
Translate the following attributes

The selected attributes will appear as translatable segments in the Editor pane.

Start a new paragraph on

The <br> HTML tag will constitute a paragraph break for segmentation purposes.

Ignored paragraphs (regular expression)

Any paragraph matching the regular expression is ignored while loading and is not displayed for translation.

This option is convenient when dealing with HTML parts that only contain non-translatable text.

Ignored <meta> tags "content" attribute

Define the <meta> tag attribute values for which the associated "content" attribute will not be translated.

Do not add quotation marks and separate the values with a comma.

Example 27. Ignore the content part of <meta name="robots" content="index, follow">

To ignore this content:

<meta name="robots" content="index, follow">

use:

name=robots


Ignored tags (attribute=value)

Define the attribute values that make a tag non-translatable.

Do not add quotation marks and separate the values with a comma.

Example 28. Ignore tags that contain translate="no"

To ignore this content:

<span translate="no">This content is not translatable</span>

use: translate=no.

All the tags that are marked with translate="no" will be ignored.


HTML and XHTML files

Only the options not available under the XHTML files filter (see above) are described here.

Modify encoding declaration

The encoding of an HTML document is generally declared within a <meta> element situated in the <head> element.

Source and target files sometimes require a different encoding.

Here, you can decide whether to add or modify the declaration of the target file

  • always, based on the file filter settings,

  • only if the file already has a <head> tag,

  • only if the file already has a declaration,

  • or never and only save the target file in the encoding specified in the file filter settings.

Compress whitespace in translated file

Whitespace outside the tags is considered non-significant in HTML/XHTML.

This option converts runs of consecutive whitespace characters into a single space in the translated document.

Remove HTML comments

Comments in an HTML file are generally addressed to developers. Use this option to remove them. If unchecked, the comments are displayed as tags.

Text in HTML comments (between <!-- and -->) is not copied into the translated document.
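The effect of the Compress whitespace option above can be approximated in a few lines of Python. This is a rough sketch only: it treats its input as plain text, whereas OmegaT applies the compression to whitespace outside the tags, and the helper name compress_whitespace is hypothetical.

```python
import re


def compress_whitespace(text):
    # Replace every run of spaces, tabs, and line breaks with one space.
    return re.sub(r"\s+", " ", text)


compressed = compress_whitespace("Hello,\n    world!\t\tBye.")
print(compressed)
```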

Mozilla FTL
Remove untranslated strings in the target files

Having untranslated contents in the translated files sometimes creates compatibility issues.

Mozilla DTD
Remove untranslated strings in the target files

Having untranslated contents in the translated files sometimes creates compatibility issues.

PO files

The filter checks printf variables ('%s', etc.) by default. See the Check printf function variables preference for details.

Allow blank target segments

By default, OmegaT reproduces the source contents in the target file when a segment has no translation. Use this option to leave untranslated segments blank.

Translate blank source segments

Blank source segments sometimes act as placeholders for parts that do not exist in the source language but are necessary in the target language. Use this option to provide a translation based on the associated comments.

Ignore PO header

The PO header will not be displayed for translation.

Auto replace plural specification

Override the plural specification in the header and use the target language default.

Format:
Standard

PO files that use msgid as the source container and expect the translation to be put in msgstr

Monolingual

PO files that use msgid as an ID code, use msgstr as the source container and expect the translation to overwrite msgstr

Moodle PHP
Remove untranslated strings in the target files

Having untranslated contents in the translated files sometimes creates compatibility issues.

Java Resource bundle

The filter checks Java MessageFormat patterns (e.g. {0}) by default. See the Check printf function variables preference for details.

Force Unicode literals compatibility with Java 8

Java 8 requires ISO-8859-1 encoding and uses Unicode literals for characters outside that character set. Java 9 and above requires UTF-8 encoding. This option forces Java 8 compatibility.

Remove untranslated strings in the target files

Having untranslated contents in the translated files sometimes creates compatibility issues.

Keep Unicode literals (\uXXXX)

Some applications require some Unicode literals to be kept. This option allows for that.
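To make the term "Unicode literal" in the options above concrete, the sketch below rewrites every character outside ISO-8859-1 as a \uXXXX escape, using a surrogate pair for characters beyond the Basic Multilingual Plane. The function to_unicode_literals is a hypothetical illustration and is not OmegaT's implementation.

```python
def to_unicode_literals(text):
    """Hypothetical helper: escape characters that ISO-8859-1 cannot encode."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFF:
            # The character exists in ISO-8859-1, so keep it as is.
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            # Outside the BMP: emit a UTF-16 surrogate pair, as Java does.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


print(to_unicode_literals("café"))
print(to_unicode_literals("日本"))
```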

Open Document Format (ODF) files
Translate the following elements

Index entries, bookmarks, bookmark references, notes, comments, presentation notes, links (URL), and sheet names.

XLIFF (legacy filter)

Warning

This filter is the original OmegaT XLIFF filter. You should only use it to avoid compatibility issues with previous projects containing files you handled with that filter.

Compatibility with OmegaT 2.6

Enable this option if you need to work with XLIFF files created with OmegaT 2.6.

Identifier used for alternative translations

Select one of three options: Previous and next paragraphs, <trans-unit> ID, or <trans-unit> resname attribute when available. With the last option, the ID is used as a fallback when the resname attribute is unavailable.

Tag shortcuts

These options specify the way OmegaT creates tags from the XLIFF contents.

Target segment status

If checked, OmegaT changes the XLIFF target state to “needs-review-translation” instead of “translated”.

Segmentation

Paragraph or sentence?

Translation memory tools work with textual units called segments. When a translation is entered, the segment containing the source text is stored with its translation in the project memory, and subsequently used to match other source segments in the project.

To specify the type of segmentation, use the Sentence-level segmenting project property.

Segments are by default paragraphs defined by the file format itself.

Not using sentence segmentation on a document is equivalent to using paragraph segmentation. In that case, each paragraph (as defined in the original document format) is displayed as a single segment, and the translator is free to reorganize the sentences within the segment in the translation.

Paragraph segmentation works well with more literary or creative texts, as well as, more generally, with documents for which translation memory matches are not so important.

Sentence segmentation relies on a number of rules (called segmentation rules) that define what constitutes a sentence in the source language. This setting works well with documents where repetitions or similar sentences are common, such as technical or legal documents.

Paragraph-level segmentation

OmegaT first parses the text for paragraph-level segmentation. This process relies only on the structure of the source file to produce segments.

For example, text files may be segmented on line breaks, empty lines, or not at all. Files containing formatting (ODF, HTML, or other documents) are divided at block-level (paragraph) tags. Translatable object attributes in XHTML or HTML files can be extracted as separate "paragraphs".

Sentence-level segmentation

After dividing the source file into structural units, OmegaT further divides those units into segments.

You can visualize segmentation as the process of moving the cursor along the text, one character at a time, and looking for the position where a break will occur, or where a break will not be allowed.

Each time the cursor moves to the next character, OmegaT checks whether:

  • the text before the location corresponds to a Before rule,

  • and the text after the location corresponds to the associated After rule.

If the location matches both rules, it is considered either as a break, or as a non-break, depending on what the rule defined.

Global or local?

Note

The same mechanisms and dialogs are used to define global and local segmentation rules.

By default, segmentation settings are global and shared by all projects.

Use the Local Segmentation Rules... project property to limit the scope of the segmentation rules to the current project.

You can achieve a similar result by starting OmegaT from the command line. See the Command line launch how-to for details.

If you use local rules, you can still access the global rules, but modifying them will have no effect on your project.

Rules

OmegaT provides predefined segmentation rules, and the translator can use regular expressions to modify them. See the Regular expressions appendix for details.

As a reminder, rules work the following way: when a rule matches, OmegaT puts a marker at the match location so that rules that come after ignore that location. That is the reason why exception rules must come before segmentation rules.

Warning

If you change the segmentation while translating, you will have to reload the project for the new segmentation to take effect. This will split or merge some previously translated segments, which will therefore no longer be considered translated. Nonetheless, their original translation will still be in the project memory.

Table 5. A few simple examples
| Category | Intention | Before | After | Explanation |
| --- | --- | --- | --- | --- |
| Exception rule, box unchecked, higher in the list | Do not segment after Ms. | Ms\. | \s | Ms, followed by a period, followed by a whitespace. |
| Exception rule, box unchecked, higher in the list | Excel cells with line breaks that do not represent segments | \n | . | Line break, followed by anything. |
| Break rule, box checked, lower in the list | Start a new segment after a period followed by a space, tab, or other whitespace. | \. | \s | A period followed by a whitespace. |
| Break rule, box checked, lower in the list | Start a new segment after “。” (Japanese period). | 。 | (empty) | Note that the Pattern After field can be empty. |
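The interplay between exception rules and break rules can be sketched in Python as follows. This is a simplified model, not OmegaT's actual code: it assumes that, at each position, the first rule whose Before and After patterns both match decides the outcome. It uses Python's re module rather than the Java engine, and it simply trims the whitespace between segments. The RULES list and the segment function are hypothetical names.

```python
import re

# (is_break, before, after): exception rules come first in the list.
RULES = [
    (False, r"Ms\.", r"\s"),  # exception: do not break after "Ms."
    (True, r"\.", r"\s"),     # break after a period followed by whitespace
]


def segment(text):
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in RULES:
            # Before must match the text ending at pos,
            # After must match the text starting at pos.
            if re.search(f"(?:{before})\\Z", text[:pos]) and re.match(after, text[pos:]):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # the first matching rule decides for this position
    segments.append(text[start:].strip())
    return segments


result = segment("Ms. Smith arrived. She sat down.")
print(result)
```

Swapping the two entries in RULES makes the break rule win after "Ms.", which is why exception rules have to be listed first.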

Regular expressions

This appendix is intended for users interested in exploring a powerful way to boost their productivity. Regular expressions are often seen as daunting and complex, but even the simplest ones (often abbreviated regex or regexp) are extremely useful, not only in OmegaT but also, with some variations, in many other applications you might use on a day-to-day basis.

Only the fundamentals most useful to translators are covered. The References section at the end of this appendix provides a few starting points to explore advanced or complex uses beyond the scope of this manual. If you need help for a specific case, you can also ask questions in the various support channels.

Regular expressions use a combination of letters, digits, and symbols (collectively known as characters) to define an expression that represents a specific text pattern.

Here are a few examples.

[0-9]

Any single digit from 0 to 9.

\w+

Represents one or more “word characters”, namely the letters of the alphabet, digits, and underscore symbols.

\h?

Represents zero or one horizontal whitespace character (this includes regular and non-breaking spaces as well as tabs, but not line break characters, which belong to the “vertical whitespace” category: \v).
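The first two expressions above can be tried in Python's re module, used here as a stand-in for the Java engine behind OmegaT. The \h shorthand is not available in Python's re module, so it is left out, and the sample sentence is made up for the illustration.

```python
import re

sample = "Order 66 shipped_to R2D2"

digits = re.findall(r"[0-9]", sample)  # every single digit, one at a time
words = re.findall(r"\w+", sample)     # runs of letters, digits, and underscores

print(digits)
print(words)
```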

Many OmegaT functions rely on regular expressions or make them available as an option:

Searches

Searches include a Regular expressions option that allows you to make extremely powerful searches across your files.

The same option in the Text Replace dialog allows you to apply regular expressions to both the search and replaced text.

Custom tags

Custom tags are tags defined with regular expressions that are handled exactly like native OmegaT tags. See the Custom tags preference for details.

Use the | (OR) character to separate individual tag definitions.

Flagged text

The Flagged text preference allows you to define strings that OmegaT will mark in red by default, and treat as extraneous tags for validation purposes.

Use the | (OR) character to separate individual fragment definitions.

Text highlighting in alignments

Visual cues can help you verify that your alignment is correct. The Highlight setting allows you to define strings that OmegaT will highlight in the aligned documents.

Use the | (OR) character to separate individual expressions.

Segmentation

Segmentation rules and language patterns are defined with regular expressions. You can modify them freely to improve the segmentation of a document or add additional general rules. See the Segmentation appendix for details.

Segmentation or exception rules define the position in a segment where a split will, or will not, be made. Two regular expressions are required to define that position: a “before” expression to define the text pattern ahead of where the rule should apply, and an “after” expression to define the text pattern following that position.

A language pattern that matches the source language of the project will apply to that project.

The 4 rules

Regular expressions are used to find text, including characters that are not visible on the screen or when printed out, such as spaces, tabs, or line breaks. Any given expression either matches, or does not match, a word, phrase, or other sequence of text.

Each and every character in the expression is relevant when determining a match.

A number of characters or combinations of characters have a special meaning in a regular expression.

Warning

Regular expressions only match text. They cannot match decorations such as bold, italics, or other stylistic effects.

There are four rules to keep in mind.

Most characters simply match themselves

The majority of characters in a regular expression simply look for themselves in the text sequence.

For example, the seven letters spelling out the word “example” simply tell the search function to match exactly those letters, in that order. Simply put, the search just looks for the word “example”.

Digits and letters of the alphabet preceded by a backslash (\) take on a special meaning

Unlike a letter on its own, which simply represents itself as noted above, a letter preceded by a \ has a special function in a regular expression.

For example, r is just a normal character, but preceding it with \ to make it \r turns it into a special combination that matches a carriage return character. Similarly, \R matches any line break character.

Note

Only the letters i, j, l, m, o, and y, in both lower- and uppercase, have no special meaning when preceded by a backslash. This manual only describes a small subset of letters that take on a special meaning.

Consult the sites in the References section below for information on combinations not covered here.

Twelve characters have a special meaning by default

That special meaning has to be cancelled by another character to match the character itself.

The full list of characters is presented below. One example is the period (.): on its own, it has the special meaning of matching any single character.

To find a normal period, that meaning has to be cancelled using the \, to make the expression \., which just matches a period.

The \ character is a very special character

As stated above, the \ character has the default special meaning of either cancelling or activating the special meaning of other characters. It has no effect if placed before a character with no special meaning (either by default or by addition).

The \ can cancel its own special meaning by doubling up to form \\, which simply matches the backslash character itself.
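The rules above can be observed in a short Python session. Python's re module is used only as a stand-in for OmegaT's Java engine, which behaves the same way for these three constructs, and the sample text is hypothetical.

```python
import re

text = "An example:\tC:\\docs"  # contains one tab and one backslash

plain = re.search(r"example", text).group()  # ordinary letters match themselves
tab_position = re.search(r"\t", text).start()  # "\t" finds the tab, not a letter "t"
backslashes = re.findall(r"\\", text)  # "\\" matches one literal backslash

print(plain, tab_position, backslashes)
```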

The 12 characters

The twelve special characters are the backslash \, the caret ^, the dollar sign $, the period (or dot) ., the vertical bar (or pipe symbol) |, the question mark ?, the asterisk (or star) *, the plus sign +, the opening parenthesis (, the closing parenthesis ), the opening square bracket [, and the opening curly brace {.

Each character is briefly described below with examples of regular expressions that rely on the character as well as of text that they do, or do not, match.

The BACKSLASH: \

This character either cancels or activates the special meaning of the following character.

  0\.\d
Matches

A number between 0.0 and 0.9, or just the final 0.5 in numbers such as 10.5 or 560.5.

The \. cancels the “any character” meaning of the period to match the decimal point, while the \d turns the ordinarily lowercase “d” letter into an expression that matches any digit between 0 and 9.

Does not match

Sequences such as 0,1, 0-3, or the first three characters of 0x002E, which would be matched if the expression was just 0.\d, with no backslash before the period.
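The difference the backslash makes can be checked with Python's re module, used here as an approximation of OmegaT's Java engine. The pattern 0\.\d is equivalent to 0\.[0-9] for the ASCII digits in these samples.

```python
import re

escaped = re.compile(r"0\.\d")
with_backslash = [escaped.findall(s) for s in ["0.7", "560.5", "0,1", "0x002E"]]

# Without the backslash, the period accepts any character.
without_backslash = re.findall(r"0.\d", "0,1 0-3 0x002E")

print(with_backslash)
print(without_backslash)
```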

The CARET: ^

When it is the first character in the expression, the caret character matches the beginning of a line.

When it is the first character in a character class enclosed in brackets, it matches all the characters that are not part of that class.

 
  1. ^A

  2. [^abc]

Matches
  1. The uppercase “A” in the following sentence: “A long, but exciting journey was about to begin”.

  2. Any character that is not “a”, “b”, or “c”. In the word “back”, for instance, only the “k” is matched.

Does not match
  1. The uppercase “A” in the following sentence: “My friend is writing a book called A Long Journey”.

  2. The lowercase “a”, “b”, or “c” in the word “back”.
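Both uses of the caret can be reproduced with the sentences above. The sketch relies on Python's re module, which treats the caret the same way as Java in these two cases.

```python
import re

at_start = re.findall(r"^A", "A long, but exciting journey was about to begin")
not_at_start = re.findall(r"^A", "My friend is writing a book called A Long Journey")
outside_class = re.findall(r"[^abc]", "back")

print(at_start, not_at_start, outside_class)
```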

The DOLLAR sign: $

When it is the last character in an expression, the dollar sign matches the end of a line.

  ^\w+:$
Matches

A line that consists of a single word and ends with a colon:

Questions:

Does not match

A line that consists of a single word, but does not end in a colon:

Questions?
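As a quick check of this expression, the Python sketch below runs it over a made-up text of several lines. In Python, the re.MULTILINE flag is needed for ^ and $ to work line by line; how OmegaT sets the corresponding option internally is not covered here.

```python
import re

text = "Questions:\nWhat now?\nAnswers:\nQuestions?"

# With re.MULTILINE, ^ and $ anchor at the start and end of each line.
headings = re.findall(r"^\w+:$", text, flags=re.MULTILINE)

print(headings)
```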

The PERIOD: .

Matches any single character.

  c.t
Matches

Any combination of three letters starting with “c” and ending with “t”: “cat”, “cut”, “cot”, or even nonsensical combinations such as “czt” or “cqt”.

Does not match

Combinations that start with “c” and end with “t”, but are split across more than one line.

What is the missing letter?

c
t
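
A small Python check of c.t, with made-up sample strings. By default the period does not match a line break in Python's re module, which mirrors the behaviour described above. It does match non-letters such as a hyphen.

```python
import re

same_line = re.findall(r"c.t", "cat cut czt c-t")
across_lines = re.findall(r"c.t", "c\nt")

print(same_line)
print(across_lines)
```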
The VERTICAL BAR: |

This character functions as an “OR” and matches either of the expressions that precede or follow it.

  ^An|^The
Matches

The initial “An” or “The” in phrases such as:

“An apple a day…”
“The apple of my eye…”

Does not match

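An “An” or “The” that is not at the beginning of the line, in phrases such as:

“A story called An Unsung Hero.”
“They work for The Daily Post.”

Note, however, that ^The does match the first three letters of “They” in the second phrase.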
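Running the expression over the four phrases in Python (re module, as a stand-in for the Java engine) shows which ones produce a match. Note the last result: the match there is the first three letters of “They”, not the “The” of the newspaper title.

```python
import re

phrases = [
    "An apple a day…",
    "The apple of my eye…",
    "A story called An Unsung Hero.",
    "They work for The Daily Post.",
]
hits = [re.findall(r"^An|^The", phrase) for phrase in phrases]

print(hits)
```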

The QUESTION MARK: ?

This character specifies that either zero or one instance of the preceding character or expression should be matched.

  an?␣ (where “␣” represents a single space).
Matches

Either the “a ” or the “an ” in:

“I have a question.”
“I know an excellent doctor.”

It will also find the final “an ” of “Can ” in a sentence such as “Can I help you?”, or the final “a ” of “pasta ” in “We had pasta for lunch.”

Does not match

Either the “a” or the “an” in:

The indefinite article: “a” (or “an”).

Neither is followed by a space.
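The sentences above can be tested directly. In this Python sketch (re module, assumed to behave like OmegaT's engine for this construct), the trailing space in the pattern is an ordinary space character.

```python
import re

pattern = re.compile(r"an? ")

article_a = pattern.findall("I have a question.")
article_an = pattern.findall("I know an excellent doctor.")
inside_words = pattern.findall("Can I help you? We had pasta for lunch.")
no_space = pattern.findall("The indefinite article: “a” (or “an”).")

print(article_a, article_an, inside_words, no_space)
```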

The ASTERISK: *

This character specifies that zero or more instances of the preceding character or expression should be matched.

  run\w*
Matches

The word “run”, as well as “runs”, “runner”, “runway”, “runt” in “grunt” or “brunt”, and any other word or sequence of characters containing “run” followed by zero or more “word characters” (which include digits and the underscore, so the part before the “@” in an email address such as run_123@example.email.org is also a match).

Does not match

The complete phrase in “run-on” or “run'n'gun”, because the hyphen and apostrophe are not included in \w. Only the initial “run” in those phrases is matched.
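A short Python illustration of run\w*, built from the words mentioned above (re module, used as an approximation of the Java engine):

```python
import re

sample = "run runs runner grunt run-on run_123@example.email.org"
found = re.findall(r"run\w*", sample)

print(found)
```

The hyphen in “run-on” and the “@” in the address both stop the match, because neither is a word character.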

The PLUS sign: +

This character specifies that one or more instances of the preceding character or expression should be matched.

  \d+\.\d
Matches

Numbers such as “1.5”, “23.2” or “5235.8” with a single decimal place and any number of digits before the decimal point.

Does not match

The entire value of numbers such as “5,235.8” or “21,571.9”. Only the portion after the thousands separator will be matched.
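The expression \d+\.\d can be checked against the numbers quoted above with Python's re module, which handles these constructs like the Java engine does:

```python
import re

found = re.findall(r"\d+\.\d", "1.5 23.2 5235.8 5,235.8")

print(found)
```

The last result is only “235.8”: the comma is not a digit, so the match cannot start before it.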

The OPENING PARENTHESIS: (

This character starts a group, which is a set of characters treated as a single unit. Groups are numbered, and their contents are stored in memory. They can be reused later in the search expression using \n, where n is the number of the group.

Note

The content of the group can also be used in the replacement text. Use $n, where n is the number of the group defined in the search.

Parentheses are always used in opening and closing pairs. Trying to use only the opening or closing parenthesis on its own will cause an error.

  (\b\w+\b)\h\1\b
Matches

Doubled up words separated by a space, such as the consecutive “an” in the following sentence:

“I bought an an apple.”

Does not match

The “that, that” in the following sentence:

“But that, that is just unbelievable”, because the first “that” is followed by both a comma and a space rather than only a space.
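The doubled-word search, and a replacement that reuses the group, can be sketched in Python. Two adaptations are assumptions of this sketch: Python's re module has no \h, so [ \t] stands in for “horizontal whitespace”, and Python refers to the group as \1 in the replacement text, where OmegaT uses $1.

```python
import re

doubled = re.compile(r"(\b\w+\b)[ \t]\1\b")

match = doubled.search("I bought an an apple.")
no_match = doubled.search("But that, that is just unbelievable")

# Keep a single copy of the doubled word.
fixed = doubled.sub(r"\1", "I bought an an apple.")

print(match.group(), no_match, fixed)
```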

The CLOSING PARENTHESIS: )

This character closes a group. It is special because it can never be used on its own. It must be preceded by the \ if you need to match the closing parenthesis character itself.

  ^\d+\)
Matches

The sequence number (including the parenthesis) at the beginning of each line in a list such as:

1) Apples
2) Oranges
3) Pears
Does not match

Sequence numbers that are not at the beginning of a line.

Follow these steps:

Step 1) Preparation
Step 2) …
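
Both lists above can be checked in Python. As in the other sketches, Python's re module stands in for the Java engine, and re.MULTILINE is needed for ^ to apply to every line.

```python
import re

numbered = "1) Apples\n2) Oranges\n3) Pears"
steps = "Step 1) Preparation\nStep 2) …"

at_line_start = re.findall(r"^\d+\)", numbered, flags=re.MULTILINE)
mid_line = re.findall(r"^\d+\)", steps, flags=re.MULTILINE)

print(at_line_start, mid_line)
```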
The OPENING SQUARE BRACKET: [

This character must be paired with the closing square bracket to enclose a set of individual characters that each represent a valid potential match.

Only the opening bracket is special and needs to be preceded by a backslash to search for the bracket character itself. If you only want to match the closing bracket as itself, you do not need to precede it with a backslash. (You can still add it, but it will have no effect on the expression or the result.)

  li[cs]en[cs]e
Matches

The correct “licence” and “license” spellings, as well as the potential “lisence” and “lisense” misspellings

Does not match

More egregious misspellings such as “licensse” or “lissense”.
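The six spellings mentioned above make a convenient test set. This Python sketch uses re.fullmatch so that each word has to match from start to end:

```python
import re

words = ["licence", "license", "lisence", "lisense", "licensse", "lissense"]
accepted = [w for w in words if re.fullmatch(r"li[cs]en[cs]e", w)]

print(accepted)
```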

The OPENING CURLY BRACE: {

This character must be paired with the closing curly brace to enclose an exact number, minimum, maximum, or range specifying how many instances of the preceding character or group should be matched.

Only the opening brace is special and needs to be preceded by a backslash to search for the brace character itself. If you only want to match the closing brace as itself, you do not need to precede it with a backslash. (You can still add it, but it will have no effect on the expression or the result.)

  \d{4}/\d{1,3}
Matches

Codes such as “1234/5”, “1472/69”, or “9513/842” consisting of four digits, a forward slash, and one to three more digits.

Does not match

Codes such as “123/45”, “1472/6985”, or “95133/15746”.

Caution: Although the last two codes above are not matched completely, the expression will return the “1472/698” portion of “1472/6985”, as well as the “5133/157” portion of “95133/15746”.
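The difference between a complete match and a partial match can be made visible in Python (re module, as an approximation of OmegaT's engine): fullmatch requires the whole code to fit the expression, while search returns the first portion that fits.

```python
import re

pattern = re.compile(r"\d{4}/\d{1,3}")
codes = ["1234/5", "1472/69", "9513/842", "123/45", "1472/6985", "95133/15746"]

full = [c for c in codes if pattern.fullmatch(c)]
partial = {
    c: pattern.search(c).group()
    for c in codes
    if pattern.search(c) and not pattern.fullmatch(c)
}

print(full)
print(partial)
```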

Lots of expressions

This section presents various types of regular expression, ranging from the simple to the complex.

Note

Remember that most alphabetic characters preceded by a \ turn into an expression that represents not the character itself, but its associated special meaning.

Simple expressions

The simplest regular expressions consist of a single character, or of a combination of a \ and a character constituting a unit with a single meaning.

Table 6. Characters
| Expression | Match |
| --- | --- |
| x | The character “x” itself. Most characters match themselves. |
| \t | The tab character, not the letter “t”. |
| \n | The newline (line feed) character, not the letter “n”. |
| \r | The carriage-return character, not the letter “r”. Similarly, \R is any line break character. |

Case

Ordinary OmegaT searches are case insensitive by default: they match both uppercase and lowercase characters, unless you choose to enable the case sensitive search option. Doing so makes the entire search expression case sensitive.

In contrast, regular expressions are case sensitive by default. This means that a regular expression search for “OmegaT”, for example, will not match “omegat”. However, regular expressions also provide special modifiers to specify case sensitivity within the expression:

(?i)

Makes the part of the expression to the right of the modifier case insensitive.

(?-i)

Makes the part of the expression to the right of the modifier case sensitive.

You can take advantage of this to apply a fine degree of control to case sensitivity in searches. Suppose, for example, that you want to find instances of “OmegaT” and “omegat”, but not “OMEGAT”. You can do so with the following expression: (?i)o(?-i)mega(?i)t, which represents a case insensitive “o” followed by a case sensitive “mega”, followed by a case insensitive “t”.
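The effect of switching case sensitivity on and off inside one expression can be checked with a few lines of Java. This is only a sketch with hypothetical names, not part of OmegaT.

```java
import java.util.regex.Pattern;

class CaseModifierDemo {
    // Case insensitive "o", case sensitive "mega", case insensitive "t".
    static final Pattern MIXED = Pattern.compile("(?i)o(?-i)mega(?i)t");

    static boolean found(String text) {
        return MIXED.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found("OmegaT")); // true
        System.out.println(found("omegat")); // true
        System.out.println(found("OMEGAT")); // false: "MEGA" is not lowercase
    }
}
```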

Classes

Regular expressions allow you to create sets of characters, known as classes. Searches will match any of the characters in the set.

Classes are defined by enclosing the desired characters in square brackets, and can be specified either by listing each individual character to include, or by specifying a range of characters. For example, you could create the [£€$] class to find any of those three currency symbols in the text, or [1-3] to find the number 1, 2 or 3.

Note

Inside a class, only the backslash (\), caret (^), closing bracket (]) and hyphen (-) are special. The rest of the twelve special characters are treated as normal characters, and do not have to be preceded by a backslash if you want to search for those characters themselves.

You can search for any of the four class special characters as normal characters by preceding them with a backslash. You can also search for the caret, closing bracket, and hyphen as themselves by placing them at a position that does not trigger their special meaning: anywhere except right after the opening bracket for the caret, immediately after either the opening bracket or the caret following it for the closing bracket, and either just after the opening bracket or just before the closing bracket for the hyphen.

Many frequently used sets have a shorthand form consisting of a backslash followed by a letter of the alphabet. For example, \d is a shorthand for [0-9], which matches any digit between 0 and 9. In many cases, the corresponding uppercase letter is used to negate the class: \D matches any character that is not a digit.

The table below provides various additional examples. A shorthand class never stands for the literal letter used to write it.

Table 7. Examples of classes
Expression Match
[abc]

The letter “a”, “b”, or “c”.

A simple class consists of any number of characters enclosed by [ and ].

[C-X]

A character in the range of letters from “C” through “X”.

A range is defined by the first character in a series, followed by a hyphen, followed by the last character in the series. Any number of ranges can be defined: [a-zA-Z0-9] means any lowercase character from “a” to “z”, or any uppercase character from “A” to “Z”, or any digit from “0” to “9”. A hyphen placed outside the series is just a hyphen: [a-z-] means any lowercase character from “a” to “z”, or the hyphen character itself.

[^\n\r\t]

Any character except a newline, a carriage return, or tab.

The caret placed immediately after the opening square bracket excludes the rest of the characters in the class.

\w

A word character, generally defined as [A-Za-z0-9_].

\W is any character that is not a word character ([^\w]).

\s

A whitespace character, including the space and tab characters, as well as line breaks.

\S is any character that is not a whitespace character ([^\s]).

\h and \v

Horizontal and vertical whitespace (generally preferred to \s).

\H is any character that is not a horizontal whitespace, and \V any character that is not a vertical whitespace ([^\h] and [^\v], respectively).
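To get a feel for the classes in the table above, it can help to count how many characters of a sample text each class matches. The helper below is a hypothetical sketch written for this purpose.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Counts how many times the expression matches in the text.
    static int count(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("[1-3]", "0123456"));       // 3: the digits 1, 2 and 3
        System.out.println(count("[C-X]", "ABCXYZ"));        // 2: "C" and "X"
        System.out.println(count("[a-z-]", "well-known"));   // 10: nine letters and the hyphen
        System.out.println(count("[^\\n\\r\\t]", "a\tb\n")); // 2: "a" and "b"
        System.out.println(count("\\D", "a1b22"));           // 2: "a" and "b"
        System.out.println(count("\\w", "a_1 -"));           // 3: "a", "_" and "1"
        System.out.println(count("\\s", "a b\tc\nd"));       // 3: space, tab, newline
        System.out.println(count("\\h", "a b\tc\nd"));       // 2: space and tab only
    }
}
```

The last two lines show the difference between \s and \h: the newline counts as whitespace, but not as horizontal whitespace.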


Regular expressions are not limited to alphanumeric characters. They cover the entire Unicode character set. Use Unicode blocks, scripts and categories to specify character classes outside the alphanumeric character range. A few examples are presented in the table below.

See also Unicode Regular Expressions for a thorough review of Unicode regular expressions.

Table 8. Unicode blocks, scripts and categories
Expression Match
\p{InGreek}

A character in the Greek block (Unicode block)

\P{InGreek} is any character that is not in the Greek block.

\p{IsHan}

A logogram (Han/kanji/hanja character) found in CJK languages (Unicode script)

\p{Lu}

An uppercase letter (Unicode category)

\p{Sc}

A currency symbol, which is also a Unicode category.
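The Unicode expressions in the table above can be tried on single characters. In this sketch the sample characters are written as Unicode escapes so that the file compiles regardless of its encoding; the class and method names are hypothetical.

```java
class UnicodeClassDemo {
    // True when the whole string (here a single character) matches the expression.
    static boolean isMatch(String regex, String ch) {
        return ch.matches(regex);
    }

    public static void main(String[] args) {
        String omega = "\u03A9"; // Greek capital letter omega, U+03A9
        String han = "\u6F22";   // a Han character, U+6F22
        String euro = "\u20AC";  // euro sign, U+20AC

        System.out.println(isMatch("\\p{InGreek}", omega)); // true
        System.out.println(isMatch("\\P{InGreek}", "a"));   // true: not in the Greek block
        System.out.println(isMatch("\\p{IsHan}", han));     // true
        System.out.println(isMatch("\\p{Lu}", "A"));        // true
        System.out.println(isMatch("\\p{Lu}", "a"));        // false
        System.out.println(isMatch("\\p{Sc}", euro));       // true
        System.out.println(isMatch("\\p{Sc}", "$"));        // true
    }
}
```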


More advanced expressions

Some expressions specify a position rather than a character. They indicate where in the text to look for the match, but do not include any characters in that match. The table below lists a few of the more common examples. Consult the sites in the References section for more information.

Table 9. Expressions that mark a position
Expression Match
^

The beginning of a line

$

The end of a line

\b

A word boundary

\B

Not a word boundary

(?=u)

A character followed by a “u”.

For example, q(?=u) matches the letter “q” when it is followed by a “u”. It therefore matches the “q” in “equal” or “question”, but not the one in “qigong” or “Iraq”.

(?!u)

A character that is not followed by the letter “u”.

For example, q(?!u) matches the letter “q” when it is not followed by a “u”. It therefore matches the “q” in “qigong” or “Iraq”, but not the one in “equal” or “question”.

(?<=q)

A character preceded by the letter “q”.

For example, (?<=q)u matches the letter “u” if it comes after “q”. It therefore matches the “u” in “quick”, but not the one in “run”.

(?<!q)

A character that is not preceded by the letter “q”.

For example, (?<!q)u matches the letter “u” if it does not come after “q”. It therefore matches the “u” in “run”, but not the one in “quick”.
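The four lookaround expressions above can be compared side by side. The sketch below (with hypothetical helper names) reports where the first match starts and what text it contains, which shows that the character inside the parentheses is checked but never included in the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Index of the first match, or -1 when the expression does not match.
    static int firstIndex(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.start() : -1;
    }

    // Text of the first match, or null when the expression does not match.
    static String firstText(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstIndex("q(?=u)", "equal"));  // 1
        System.out.println(firstText("q(?=u)", "equal"));   // q (the "u" is not part of the match)
        System.out.println(firstIndex("q(?=u)", "Iraq"));   // -1
        System.out.println(firstIndex("q(?!u)", "Iraq"));   // 3
        System.out.println(firstIndex("(?<=q)u", "quick")); // 1
        System.out.println(firstIndex("(?<=q)u", "run"));   // -1
        System.out.println(firstIndex("(?<!q)u", "run"));   // 1
        System.out.println(firstIndex("(?<!q)u", "quick")); // -1
    }
}
```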


More examples

This section presents a few examples demonstrating how the various expressions described above can be combined to perform powerful searches in OmegaT.

Table 10. Examples of regular expressions that use the above expressions
Expression Purpose
(\b\w+\b)\h\1\b

Find double words.

,\h[\h(\w+\.\w+)\w,'ʼ"“”-]+[\.,]

Find clauses that start with a comma followed by a whitespace character, contain one or more words (including words in quotation marks, contractions, and filenames with a file extension), and end either with a comma or period.

\.\h+$

Find extra whitespace after the period at the end of a line.

\h+a\h+[aeiou]

Find words starting with a vowel that come after the article “a” rather than “an”.

\h+an\h+[^aeiou]

The flip side of the preceding example. Find words starting with a consonant that come after “an” rather than “a”.

\d{4}([/\.-]\d{1,2}){2}

Find numerical dates in year, month, and day order with the month and day separated by a slash, period, or hyphen, such as:

  • 2002/11/8

  • 1969.7.20

  • 2022-10-31

Note

This expression finds number and separator patterns matching possible dates, but does not validate them. It will also find patterns such as “5136/36/71”.

\.[A-Z]

Find a period followed by an uppercase letter. Useful to find possible missing spaces between the period and the start of a new sentence.

\bis\b

Find “is” as a whole word in a sentence, without matching “this”, “isn’t”, or even “Is”.

[\w\.-]+@[\w\.-]+

Find an email address. This simple expression may not cover every possible valid email address format.
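Several of the expressions in this table can be tried outside OmegaT before using them in a search. The following sketch runs a few of them against short sample sentences; the helper name and the sample texts are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MoreExamplesDemo {
    // Text of the first match, or null when the expression does not match.
    static String first(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String date = "\\d{4}([/\\.-]\\d{1,2}){2}";

        System.out.println(first("(\\b\\w+\\b)\\h\\1\\b", "this is is a test")); // is is
        System.out.println(first("\\h+a\\h+[aeiou]", "eat a apple"));            // " a a"
        System.out.println(first(date, "Released 2022-10-31."));                 // 2022-10-31
        System.out.println(first(date, "5136/36/71"));                           // 5136/36/71 (not validated)
        System.out.println(first("\\.[A-Z]", "First.Second"));                   // .S
        System.out.println(first("\\bis\\b", "this"));                           // null
        System.out.println(first("[\\w\\.-]+@[\\w\\.-]+", "mail info@example.org now")); // info@example.org
    }
}
```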


References

Although OmegaT does not offer fancy colouring for your regular expressions, you can get a lot of practice by using the Text Search window since OmegaT does colour the matching results.

A few additional resources are presented below.

The Java technical reference is useful as a canonical reference.

Java Regex documentation

The official reference for regular expressions used in Java.

If you want to learn more about using regular expressions, the two following sites have proven very useful.

https://regex101.com

An online regular expression matcher that lets you enter the text you want to search and the regular expressions you want to test.

https://www.regular-expressions.info

One of the most thorough regular expression tutorials and references on the web.

Note

OmegaT does not support either site in any way. If you find other interesting references—in any language—the OmegaT team would love to hear about them.

Glossaries

Glossaries are terminology files stored in the glossary folder.

All terms in a segment with a match in any of the glossaries will be displayed in the Glossaries pane.

Source terms can be multi-word expressions.

There are two kinds of glossary files:

The project glossary

Use Edit → Create Glossary Entry... (C + S + G) to enter new terms in this glossary. It is called the writable glossary for this reason.

Use Project → Access Project Contents to directly access it. You can then open it in a text editor and modify it.

You do not need to prepare the file in advance.

It will be created the first time you add an entry to the glossary.

Note

If you choose to use an existing file as the default glossary, all new entries will be recorded in tab-separated format and saved in UTF-8 by default.

If you want to specify a different encoding, you can do so by adding a “magic” comment that takes the following form:

# -*- coding: <charset> -*-,

where <charset> is typically one of the sets listed in the IANA Charset Registry.

Reference glossaries

Reference glossaries are terminology files in a format recognized by OmegaT. You cannot modify them from the OmegaT interface like the project glossary, but you can do so in a text editor.

Note

Modifications made to any glossary are immediately recognized by OmegaT and displayed in the Glossaries pane.

The glossaries folder

By default, each project contains a glossary folder to store the writable glossary and any reference glossaries you want to add to the project. See the Glossary files folder project property for details.

All glossaries must be located in the glossary folder. Glossaries located in nested folders are also recognized.

Within that reference glossaries folder, you can create multiple terminology subfolders organized by topic, client, or any other category that suits your workflow.

Use the Glossary files folder project property to set the location of the reference glossaries folder. This folder can be set outside the project, enabling you to use it, or one of the specific subfolders, in other projects.

Project glossary

The writable project glossary is located in the glossary folder by default and called glossary.txt.

You can change its name and location in the Writable Glossary File dialog, but you must give it a .txt or .utf8 extension, and store it within the glossary folder or in one of its subfolders.

File format

OmegaT glossary files are simple plain text files containing three-column lists, with the source term in the first column, an optional target term in the second column, and an optional comment in the third column.
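To make the three-column layout concrete, the sketch below splits one tab-separated glossary line into its source term, optional target term, and optional comment. It illustrates the file layout only and is not OmegaT's own parser; the class name, field names, and sample terms are hypothetical.

```java
class GlossaryEntry {
    final String source;
    final String target;  // empty when the line has no target term
    final String comment; // empty when the line has no comment

    GlossaryEntry(String source, String target, String comment) {
        this.source = source;
        this.target = target;
        this.comment = comment;
    }

    // Splits a tab-separated line into at most three columns.
    static GlossaryEntry parse(String line) {
        String[] cols = line.split("\t", 3);
        String target = cols.length > 1 ? cols[1] : "";
        String comment = cols.length > 2 ? cols[2] : "";
        return new GlossaryEntry(cols[0], target, comment);
    }

    public static void main(String[] args) {
        GlossaryEntry full = parse("fuzzy match\tcorrespondance partielle\tCAT term");
        System.out.println(full.source + " | " + full.target + " | " + full.comment);

        GlossaryEntry sourceOnly = parse("segment");
        System.out.println(sourceOnly.target.isEmpty()); // true: no target column
    }
}
```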

Glossaries can be “tab-separated values” (TSV) or “comma-separated values” (CSV) files or can also use the TermBase eXchange (TBX 2) format.

A writable glossary created for the project by OmegaT will be a TSV file saved in UTF-8. A user-created file that uses only Latin characters may be recognized and treated as ISO-8859-1, provided it does not contain non-ASCII characters or other characters interpreted as UTF-8.

The encoding used to read reference glossaries depends on their file extension:

Table 11. Format, extensions and expected encoding
Format Extension Encoding
TSV .txt UTF-8
TSV .utf8 UTF-8
TSV .tab OS default encoding
TSV .tsv OS default encoding
CSV .csv UTF-8
TBX .tbx UTF-8
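The table above amounts to a simple lookup from file extension to expected encoding. The sketch below restates it in code. Treating extensions case insensitively and returning “unknown” for other extensions are assumptions made for the example, not documented OmegaT behaviour.

```java
import java.util.Locale;

class GlossaryEncodingDemo {
    // Returns the encoding the table associates with the file's extension.
    static String expectedEncoding(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT); // assumption: case is ignored
        if (name.endsWith(".tab") || name.endsWith(".tsv")) {
            return "OS default";
        }
        if (name.endsWith(".txt") || name.endsWith(".utf8")
                || name.endsWith(".csv") || name.endsWith(".tbx")) {
            return "UTF-8";
        }
        return "unknown"; // assumption: extension not listed in the table
    }

    public static void main(String[] args) {
        System.out.println(expectedEncoding("glossary.txt")); // UTF-8
        System.out.println(expectedEncoding("client.tab"));   // OS default
        System.out.println(expectedEncoding("terms.tbx"));    // UTF-8
    }
}
```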

Directional Formatting Characters

Bidi control characters are available from Edit → Insert Bidi Control Character. They can be used to:

  • Insert an invisible character with a strong directionality to force a specific position for a character with weak or neutral directionality.

  • Create a section of text that flows in the direction opposite that of the segment.

These control characters change directionality but are invisible. Use View → Display Bidi Control Characters to show a visual indication of their position.

Marks

To change the position of a character with weak or neutral directionality (like punctuation symbols), insert an LRM or RLM character after the character, depending on the directionality of the segment:

  • Insert an LRM after a weak-directionality character that must run left-to-right in a right-to-left segment (e.g. an English excerpt inside Arabic text).

  • Insert an RLM after a weak-directionality character that must run right-to-left in a left-to-right segment (e.g. an Arabic excerpt inside English text).

Embeddings

Embeddings can be used to create a longer section of text (containing several words and spaces) that must flow in the direction opposite that of the segment. You can create two kinds of embeddings depending on the directionality of the segment:

  • To create a left-to-right embedding in a right-to-left segment, insert a left-to-right embedding (LRE) character, type or insert the left-to-right text, and then insert the pop directional formatting (PDF) character.

  • To create a right-to-left embedding in a left-to-right segment, insert a right-to-left embedding (RLE) character, type or insert the right-to-left text, and then insert the PDF character.
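For readers who prepare or check text programmatically, the marks and embeddings described above correspond to the following character sequences. The code points are the ones listed for the Insert Bidi Control Character menu items in the Edit Menu table; the class and method names are hypothetical.

```java
class BidiDemo {
    static final char LRM = '\u200E'; // Left-to-Right Mark
    static final char RLM = '\u200F'; // Right-to-Left Mark
    static final char LRE = '\u202A'; // Left-to-Right Embedding
    static final char RLE = '\u202B'; // Right-to-Left Embedding
    static final char PDF = '\u202C'; // Pop Directional Formatting

    // Appends a mark after a weak or neutral character such as punctuation.
    static String withMark(String weakCharacter, char mark) {
        return weakCharacter + mark;
    }

    // Wraps left-to-right text for use inside a right-to-left segment.
    static String embedLtr(String text) {
        return LRE + text + PDF;
    }

    // Wraps right-to-left text for use inside a left-to-right segment.
    static String embedRtl(String text) {
        return RLE + text + PDF;
    }

    public static void main(String[] args) {
        String embedded = embedLtr("OmegaT 6.0");
        // The two control characters are invisible but add to the length.
        System.out.println(embedded.length()); // 12
    }
}
```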

Post-Processing Commands

See the Local post-processing commands project property for project-specific commands.

See the Global post-processing commands preference for global commands.

Template Variables

The command is passed to Java runtime exec as a string with the template values expanded. All the arguments should be quoted, e.g. "${fileName}".

The following template variables are always available. The other items on the template list are environment variables for your system.

Table 12. Template Variables
Variable name Value
${projectName} The name of the project directory
${projectRoot} Full path to the project folder
${sourceRoot} Full path to the source folder
${targetRoot} Full path to the target folder
${glossaryRoot} Full path to the glossary folder
${tmRoot} Full path to the TM root folder
${tmAutoRoot} Full path to the TM auto folder
${dictRoot} Full path to the dictionary folder
${tmOtherLangRoot} TM Root + tmx2source (See the Bridge two languages how-to for details.)
${sourceLang} Source language
${targetLang} Target language
${filePath} Full path to source file
${fileShortPath} Source file name relative to given root
${fileName} Full name of source file
${fileNameOnly} Name of source file without extension
${fileExtension} Extension of source file without a dot
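The expansion of template variables can be pictured as a plain text substitution carried out before the command is run. The sketch below shows one way such a substitution could work; it is an illustration under that assumption, not OmegaT's actual implementation, and the class name and sample path are hypothetical. Leaving unknown variables untouched is also a choice made for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateDemo {
    // Matches ${name} and captures the variable name.
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces every known ${name} with its value; unknown names are left as they are.
    static String expand(String command, Map<String, String> values) {
        Matcher m = VARIABLE.matcher(command);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps "$" and "\" in paths from being misread.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values =
                Collections.singletonMap("targetRoot", "/home/user/project/target");
        System.out.println(expand("xdg-open \"${targetRoot}\"", values));
        // xdg-open "/home/user/project/target"
    }
}
```

Quoting the variable in the command, as recommended above, keeps a path that contains spaces together as a single argument.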

Local Scripts

In addition to a regular command, you can call a script. Never run post-processing scripts from untrusted sources. For security reasons, local post-processing commands are disabled by default.

Template variables can be used with both regular commands and custom scripts. You may need to use an absolute path for your script. The PATH OmegaT uses may not be the same as the current user’s PATH.

STDOUT and STDERR are written to the omegat.log file. The exit code and STDERR or the last STDOUT will appear on the status bar.

Linux and macOS

Start the script with a shebang, e.g. #! /bin/bash or #! /usr/bin/env python3, and make sure the script is executable. Chaining commands with && or ||, or piping them with |, will not work here.

Example 29. Simple examples of post-processing commands:
Open the target folder in Linux
xdg-open ${targetRoot}
Open the target folder in macOS
open ${targetRoot}
Open the target folder in Windows Powershell
Invoke-Item ${targetRoot}

OmegaT Shortcuts

Description

The OmegaT interface generally does not rely on buttons to give access to its functions. Instead, they are called from the menus or, for the majority of functions, from their assigned default shortcut.

Learning the most frequent shortcuts will not take long once you start working with OmegaT. The shortcuts are indicated next to each menu item, allowing you to learn new shortcuts gradually as you use the software.

You can customize the majority of the shortcuts in OmegaT. See the Customization section for details.

OmegaT runs on any platform that runs a Java Runtime Environment (Windows, macOS, and Linux being the most mainstream). The modifier keys that form the shortcuts vary slightly between platforms. To make reading easier, we have adopted the following convention for modifier keys:

Table 13. Modifier key identifiers
Linux/Windows Key identifier macOS
Shift S shift or ⇧
Ctrl or Control C command or ⌘
Alt A alt/option or ⌥
  Ctrl control or ⌃

The key identifiers above enable us to avoid listing multiple notations for every shortcut.

Example 30. How we avoid listing multiple notations:

On Windows and Linux: Ctrl + Shift + N

On macOS: Shift + Command + N

In this manual: C + S + N
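The relationship between the manual's notation and the platform-specific definitions used in the shortcut tables later in this appendix can be expressed as a small conversion. This is a hypothetical helper written for illustration: it assumes that the last part of the notation is always the key and that everything before it is a key identifier.

```java
import java.util.Locale;

class KeyNotationDemo {
    // Converts the manual's notation (e.g. "C + S + N") to a shortcut definition.
    static String toDefinition(String notation, boolean macOS) {
        String[] parts = notation.trim().split("\\s*\\+\\s*");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            String word;
            if (i == parts.length - 1) {
                word = part; // the key itself, kept as written
            } else if (part.equals("C")) {
                word = macOS ? "meta" : "ctrl"; // command on macOS, Ctrl elsewhere
            } else if (part.equals("S")) {
                word = "shift";
            } else if (part.equals("A")) {
                word = "alt";
            } else {
                word = part.toLowerCase(Locale.ROOT); // e.g. "Ctrl" becomes "ctrl"
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDefinition("C + S + N", false)); // ctrl shift N
        System.out.println(toDefinition("C + S + N", true));  // meta shift N
    }
}
```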


Customization

OmegaT assigns shortcuts to most of the functions available in the Project, Edit, and Go To menus and to a number of functions in the Editor pane. You can also add or modify the shortcuts for most of the functions.

To do so, you have to put the appropriate shortcut definition file in your OmegaT configuration folder. See the Configuration Folder appendix for details.

There are two shortcut definition files.

MainMenuShortcuts.properties

The shortcut definition file for the menus and a few other items.

EditorShortcuts.properties

The shortcut definition file for the editor.

OmegaT must be restarted after a shortcut definition file has been modified for the new shortcuts to take effect.

Note

You can copy the default OmegaT shortcut files from the OmegaT development site on Sourceforge to your configuration folder and modify them to suit your needs:

The macOS files must be renamed MainMenuShortcuts.properties and EditorShortcuts.properties for OmegaT to recognize them.

The next section describes the syntax used in the shortcut definition files, and provides an example modification.

Syntax

The basic syntax of the shortcut definition files is simply:

function code=shortcut

Use the tables in the Lists of functions and codes section below to find the values for function code.

The shortcut represents the key combination pressed by the user. It takes the following form:

  • 0 or more modifiers

  • followed by 0 or 1 event

  • followed by 1 key

  • where modifier can be: shift, ctrl, meta, alt, or altGraph

    Note

    meta refers to the key with the Windows logo on most keyboards for Windows or Linux systems, and to the command key on macOS.

    altGraph refers to the Alt key to the right of the spacebar on keyboards with two Alt keys.

  • event can be: typed, pressed, released

  • and key can be any key available on your keyboard. You can refer to the table presenting the different editor shortcuts to find the values for keys such as Home, Page Up, or the arrow keys.

Empty lines and comments can be added to organize the list and make it easier to read. A comment line starts with a #, and everything after that is ignored by the application.
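The syntax rules above can be turned into a quick check for a definition file before restarting OmegaT. The sketch below assumes the file can be read as a standard Java properties file, which the .properties extension suggests but this manual does not state. The class and method names are hypothetical, and the check only looks at the order of modifiers, event, and key, not at whether the key name is valid.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

class ShortcutSyntaxDemo {
    static final Set<String> MODIFIERS =
            new HashSet<>(Arrays.asList("shift", "ctrl", "meta", "alt", "altGraph"));
    static final Set<String> EVENTS =
            new HashSet<>(Arrays.asList("typed", "pressed", "released"));

    // True for: 0 or more modifiers, then 0 or 1 event, then exactly 1 key.
    static boolean isWellFormed(String shortcut) {
        if (shortcut.trim().isEmpty()) {
            return false;
        }
        String[] tokens = shortcut.trim().split("\\s+");
        int i = 0;
        while (i < tokens.length && MODIFIERS.contains(tokens[i])) {
            i++;
        }
        if (i < tokens.length && EVENTS.contains(tokens[i])) {
            i++;
        }
        return i == tokens.length - 1; // one token must remain: the key
    }

    // Reads "function code=shortcut" lines; lines starting with # are skipped.
    static Properties parse(String fileContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties p = parse("# Project menu\nprojectCloseMenuItem=ctrl shift W\n\n# editMultipleAlternate=\n");
        System.out.println(p.getProperty("projectCloseMenuItem"));  // ctrl shift W
        System.out.println(p.getProperty("editMultipleAlternate")); // null: the line is a comment
        System.out.println(isWellFormed("ctrl shift W"));           // true
        System.out.println(isWellFormed("ctrl shift"));             // false: no key
    }
}
```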

The easiest way to modify the shortcuts is to download copies of the default files to your configuration folder, as noted above, and make the changes you want there.

Example 31. Modifying a shortcut

The default shortcut for closing a project is defined on Windows and Linux as:

projectCloseMenuItem=ctrl shift W

and on macOS as:

projectCloseMenuItem=meta shift W

However, you may want to remove the S key from the shortcut to make it only Ctrl + W (or Command + W on macOS) to match the shortcut you use in other applications.

To do so, modify the MainMenuShortcuts.properties as follows for Windows or Linux:

projectCloseMenuItem=ctrl W

or as follows for macOS:

projectCloseMenuItem=meta W


Example 32. Adding a shortcut

If your language pair calls for the frequent use of alternative translations, you may want to assign a shortcut to that function since it does not have one by default.

The steps below demonstrate how to assign the Alt + X shortcut to the Edit → Create Alternative Translation menu item.

  1. Open the MainMenuShortcuts.properties you have copied to your configuration folder in a text editor.

  2. As shown in the Edit menu table below, the function code for the Create Alternative Translation function is editMultipleAlternate.

    Searching for that code in the file will bring you to the following line:

    # editMultipleAlternate=

  3. The line is currently a comment. Delete the # at the beginning of the line so OmegaT will recognize the shortcut, and add alt X after the = sign at the end of the line:

    editMultipleAlternate=alt X

  4. Save and close the file. The next time you start OmegaT, your new shortcut should be active and displayed next to the name of the function in the menu.


Save the file after you have finished making your changes. If OmegaT is open, you will have to restart it for your changes to take effect.

Your modified or added shortcuts should now be displayed next to the menu items you have changed. They will now be available in OmegaT as long as there are no conflicts with other functions or with system-wide shortcuts.

The next section presents tables with the function codes and corresponding default shortcut for each menu or editor function in OmegaT.

Lists of functions and codes

Menu functions

The functions that can be modified in the MainMenuShortcuts.properties file, along with their default values, are presented in the tables below. Shortcuts in parentheses are alternatives for the same function found in the EditorShortcuts.properties.

Table 14. Project Menu
Menu item Function code Windows/Linux macOS
New... projectNewMenuItem ctrl shift N meta shift N
Download Team Project... projectTeamNewMenuItem    
Open... projectOpenMenuItem ctrl O meta O
Open Recent... projectOpenRecentMenuItem    
Reload projectReloadMenuItem F5
Close projectCloseMenuItem ctrl shift W meta shift W
Save projectSaveMenuItem ctrl S meta S
Add Files... projectImportMenuItem    
Add MediaWiki Page... projectWikiImportMenuItem    
Commit Source Files projectCommitSourceFiles    
Commit Target Files projectCommitTargetFiles    
Create Translated Files projectCompileMenuItem ctrl D meta D
Create Current Translated File projectSingleCompileMenuItem ctrl shift D meta shift D
Open MED Project... projectMedOpenMenuItem    
Create MED Project projectMedCreateMenuItem    
Properties... projectEditMenuItem ctrl E meta E
Source Files... viewFileListMenuItem ctrl L meta L
Access Project Contents/Project Folder projectAccessRootMenuItem    
Access Project Contents/Dictionaries projectAccessDictionaryMenuItem    
Access Project Contents/Glossaries projectAccessGlossaryMenuItem    
Access Project Contents/Source Files projectAccessSourceMenuItem    
Access Project Contents/Target Files projectAccessTargetMenuItem    
Access Project Contents/TMs projectAccessTMMenuItem    
Access Project Contents/Exported TMs projectAccessExportTMMenuItem    
Access Project Contents/Current Source Document projectAccessCurrentSourceDocumentMenuItem    
Access Project Contents/Current Target Document projectAccessCurrentTargetDocumentMenuItem    
Access Project Contents/Writable Glossary projectAccessWriteableGlossaryMenuItem    
Restart projectRestartMenuItem    
Quit projectExitMenuItem ctrl Q meta Q

Table 15. Edit Menu
Menu item Function code Windows/Linux macOS
Undo editUndoMenuItem ctrl Z meta Z
Redo editRedoMenuItem ctrl Y meta Y
Replace With Match or Selection editOverwriteTranslationMenuItem ctrl R meta R
Insert Match or Selection editInsertTranslationMenuItem ctrl I meta I
Replace With Source editOverwriteSourceMenuItem ctrl shift R meta shift R
Insert Source editInsertSourceMenuItem ctrl shift I meta shift I
Select Source Text editSelectSourceMenuItem ctrl shift A meta shift A
Replace with Machine Translation editOverwriteMachineTranslationMenuItem ctrl M meta M
Insert Missing Tags editTagPainterMenuItem ctrl shift T meta shift T
Insert Next Missing Tag editTagNextMissedMenuItem ctrl T meta T
Export Selection editExportSelectionMenuItem ctrl shift C meta shift C
Create Glossary Entry editCreateGlossaryEntryMenuItem ctrl shift G meta shift G
Search... editFindInProjectMenuItem ctrl F meta F
(Call Last Search Window) findInProjectReuseLastWindow ctrl shift F meta shift F
Replace... editReplaceInProjectMenuItem ctrl K meta K
Search Dictionaries editSearchDictionaryMenuItem alt shift D
Switch Case to/lower case lowerCaseMenuItem    
Switch Case to/UPPER CASE upperCaseMenuItem    
Switch Case to/Title Case titleCaseMenuItem    
Switch Case to/Sentence case sentenceCaseMenuItem    
Switch Case to/Cycle cycleSwitchCaseMenuItem shift F3
Select Match/Select Previous Match editSelectFuzzyPrevMenuItem ctrl UP meta UP
Select Match/Select Next Match editSelectFuzzyNextMenuItem ctrl DOWN meta DOWN
Select Match/Select Match #1 editSelectFuzzy1MenuItem ctrl 1 meta 1
Select Match/Select Match #2 editSelectFuzzy2MenuItem ctrl 2 meta 2
Select Match/Select Match #3 editSelectFuzzy3MenuItem ctrl 3 meta 3
Select Match/Select Match #4 editSelectFuzzy4MenuItem ctrl 4 meta 4
Select Match/Select Match #5 editSelectFuzzy5MenuItem ctrl 5 meta 5
Insert Bidi Control Character/Left-to-Right Mark (LRM U+200E) insertCharsLRM    
Insert Bidi Control Character/Right-to-Left Mark (RLM U+200F) insertCharsRLM    
Insert Bidi Control Character/Left-to-Right Embedding (LRE U+202A) insertCharsLRE    
Insert Bidi Control Character/Right-to-Left Embedding (RLE U+202B) insertCharsRLE    
Insert Bidi Control Character/Pop Directional Formatting (PDF U+202C) insertCharsPDF    
Use as Default Translation editMultipleDefault    
Create Alternative Translation editMultipleAlternate    
Remove Translation editRegisterUntranslatedMenuItem ctrl shift X meta shift X
Set Empty Translation editRegisterEmptyMenuItem    
Register Identical Translation editRegisterIdenticalMenuItem ctrl shift S meta shift S

Table 16. Go To Menu
Menu item Function code Windows/Linux macOS
Next Untranslated Segment gotoNextUntranslatedMenuItem ctrl U meta U
Next Translated Segment gotoNextTranslatedMenuItem ctrl shift U meta shift U
Next Segment gotoNextSegmentMenuItem ctrl N (Enter/Tab) meta N (Enter/Tab)
Previous Segment gotoPreviousSegmentMenuItem ctrl P (ctrl Enter / shift Tab) meta P (meta Enter / shift Tab)
Segment number... gotoSegmentMenuItem ctrl J meta J
Next Note gotoNextNoteMenuItem ctrl alt N meta alt N
Previous Note gotoPreviousNoteMenuItem ctrl alt P meta alt P
Next Unique Segment gotoNextUniqueMenuItem ctrl shift K meta shift K
Source of Selected Match gotoMatchSourceSegment ctrl shift M meta shift M
Autopopulated Segments/Next Segment from tm/auto/ gotoNextXAutoMenuItem ctrl alt COMMA meta alt COMMA
Autopopulated Segments/Previous Segment from tm/auto/ gotoPrevXAutoMenuItem ctrl alt shift COMMA meta alt shift COMMA
Autopopulated Segments/Next Segment from tm/enforce/ gotoNextXEnforcedMenuItem ctrl alt PERIOD meta alt PERIOD
Autopopulated Segments/Previous Segment from tm/enforce/ gotoPrevXEnforcedMenuItem ctrl alt shift PERIOD meta alt shift PERIOD
Back in History gotoHistoryBackMenuItem ctrl shift P meta shift P
Forward in History gotoHistoryForwardMenuItem ctrl shift N meta shift N
Notepad gotoNotesPanelMenuItem ctrl alt 9 meta alt 9
Editor gotoEditorPanelMenuItem ctrl alt 0 meta alt 0

Table 17. View Menu
Menu item Function code Windows/Linux macOS
Highlight Translated Segments viewMarkTranslatedSegmentsCheckBoxMenuItem    
Highlight Untranslated Segments viewMarkUntranslatedSegmentsCheckBoxMenuItem    
Display Paragraph Delimitations viewMarkParagraphStartCheckBoxMenuItem    
Display Source Segments viewDisplaySegmentSourceCheckBoxMenuItem    
Highlight Repeated Segments viewMarkNonUniqueSegmentsCheckBoxMenuItem    
Highlight Segments with Notes viewMarkNotedSegmentsCheckBoxMenuItem    
Display Non-Breakable Spaces viewMarkNBSPCheckBoxMenuItem    
Display Whitespace viewMarkWhitespaceCheckBoxMenuItem    
Display Bidirectional Algorithm Control Characters viewMarkBidiCheckBoxMenuItem    
Highlight Auto-Populated Segments viewMarkAutoPopulatedCheckBoxMenuItem    
Mark Glossary Matches viewMarkGlossaryMatchesCheckBoxMenuItem    
Mark Language Checker Issues viewMarkLanguageCheckerCheckBoxMenuItem    
Use Aggressive Font Fallback viewMarkFontFallbackCheckBoxMenuItem    
Display Modification Info/None viewDisplayModificationInfoNoneRadioButtonMenuItem    
Display Modification Info/for Current Segment viewDisplayModificationInfoSelectedRadioButtonMenuItem    
Display Modification Info/for All Segments viewDisplayModificationInfoAllRadioButtonMenuItem    
Restore OmegaT Window viewRestoreGUIMenuItem    

Table 18. Tools Menu
Menu item Function code Windows/Linux macOS
Check Issues... toolsCheckIssuesMenuItem ctrl shift V meta shift V
Check Issues for Current File toolsCheckIssuesCurrentFileMenuItem    
Statistics toolsShowStatisticsStandardMenuItem    
Match Statistics toolsShowStatisticsMatchesMenuItem    
Match Statistics per File toolsShowStatisticsMatchesPerFileMenuItem    
Align Files... toolsAlignFilesMenuItem    

Note

The Scripting item in the Tools menu is an exception.

It is not possible to add a shortcut to call the script editor, or to modify the shortcuts assigned to the scripts.

Table 19. Options Menu
Menu item Function code Windows/Linux macOS
Preferences optionsPreferencesMenuItem    
Machine Translation/Automatically Fetch Translations optionsMTAutoFetchCheckboxMenuItem    
Glossaries/Use Fuzzy Matching optionsGlossaryFuzzyMatchingCheckBoxMenuItem    
Dictionaries/Use Fuzzy Matching optionsDictionaryFuzzyMatchingCheckBoxMenuItem    
Auto-Completion/Show Relevant Suggestions Automatically optionsAutoCompleteShowAutomaticallyItem    
Auto-Completion/History Completion optionsAutoCompleteHistoryCompletionMenuItem    
Auto-Completion/History Prediction optionsAutoCompleteHistoryPredictionMenuItem    
Global File Filters... optionsSetupFileFiltersMenuItem    
Global Segmentation Rules... optionsSentsegMenuItem    
Editor... optionsWorkflowMenuItem    
Access Configuration Folder... optionsAccessConfigDirMenuItem    

Table 20. Help Menu
Menu item Function code Windows/Linux macOS
User Manual... helpContentsMenuItem F1
About... helpAboutMenuItem    
Last Changes... helpLastChangesMenuItem    
Log... helpLogMenuItem    
Check for Updates... helpUpdateCheckMenuItem    

Table 21. Search Window
Menu item Function code Windows/Linux macOS
Jump to Entry in Editor jumpToEntryInEditor ctrl J meta J

Editor functions

The shortcuts that can be modified in the EditorShortcuts.properties file, along with their default values, are presented in the tables below.

Table 22. Editor
Function Function code Windows/Linux macOS
Open Context Menu editorContextMenu CONTEXT_MENU shift ESCAPE
Go to Next Segment editorNextSegment TAB
Go to Previous Segment editorPrevSegment shift TAB
Go to Next Segment (not TAB) editorNextSegmentNotTab ENTER
Go to Previous Segment (not TAB) editorPrevSegmentNotTab ctrl ENTER meta ENTER
Insert Linebreak editorInsertLineBreak shift ENTER
Select All editorSelectAll ctrl A meta A
Delete Previous Token editorDeletePrevToken ctrl BACK_SPACE alt BACK_SPACE
Delete Next Token editorDeleteNextToken ctrl DELETE alt DELETE
Go to First Segment editorFirstSegment ctrl PAGE_UP meta PAGE_UP
Go to Last Segment editorLastSegment ctrl PAGE_DOWN meta PAGE_DOWN
Skip Next Token editorSkipNextToken ctrl RIGHT alt RIGHT
Skip Previous Token editorSkipPrevToken ctrl LEFT alt LEFT
Skip Next Token with Selection editorSkipNextTokenWithSelection ctrl shift RIGHT alt shift RIGHT
Skip Previous Token with Selection editorSkipPrevTokenWithSelection ctrl shift LEFT alt shift LEFT
Toggle Cursor Lock editorToggleCursorLock F2
Toggle Overtype editorToggleOvertype INSERT F3

Table 23. Autocompleter
Function                               Function code                     Windows/Linux     macOS
Open Autocompleter                     autocompleterTrigger              ctrl SPACE        ESCAPE
Open Autocompleter Next View           autocompleterNextView             ctrl SPACE        ctrl DOWN
Open Autocompleter Previous View       autocompleterPrevView             ctrl shift SPACE  ctrl UP
Confirm and Close Autocompleter        autocompleterConfirmAndClose      ENTER
Confirm Autocompleter Without Closing  autocompleterConfirmWithoutClose  INSERT
Close Autocompleter                    autocompleterClose                ESCAPE
Go Up The List                         autocompleterListUp               UP
Go Down The List                       autocompleterListDown             DOWN
Go Up One Page                         autocompleterListPageUp           PAGE_UP
Go Down One Page                       autocompleterListPageDown         PAGE_DOWN
Go Up in Table                         autocompleterTableUp              UP
Go Down in Table                       autocompleterTableDown            DOWN
Go Left in Table                       autocompleterTableLeft            LEFT
Go Right in Table                      autocompleterTableRight           RIGHT
Go Up One Page in Table                autocompleterTablePageUp          PAGE_UP
Go Down One Page in Table              autocompleterTablePageDown        PAGE_DOWN
Go to First Table                      autocompleterTableFirst           ctrl HOME         meta HOME
Go to Last Table                       autocompleterTableLast            ctrl END          meta END
Go to First Row in Table               autocompleterTableFirstInRow      HOME
Go to Last Row in Table                autocompleterTableLastInRow       END
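
The function codes listed in the Editor and Autocompleter tables are the keys used in EditorShortcuts.properties. The following is a minimal Python sketch, not an official tool, that renders a set of overrides as text. It assumes each override is a single line of the form function code, equals sign, shortcut, as in a standard Java properties file; the Customization appendix remains the authority on the exact syntax. The helper name render_shortcut_overrides and the shortcut values in the usage note are hypothetical.

```python
def render_shortcut_overrides(overrides):
    """Render {function_code: shortcut} as properties-style lines.

    Assumption: one "functionCode=shortcut" entry per line, as in a
    standard Java properties file.
    """
    lines = []
    # Sort by function code so the generated text is stable between runs.
    for function_code, shortcut in sorted(overrides.items()):
        lines.append(f"{function_code}={shortcut}")
    return "\n".join(lines) + "\n"
```

For instance, render_shortcut_overrides({"editorToggleOvertype": "F3"}) produces a single line that could be pasted into EditorShortcuts.properties in the configuration folder to use the macOS default on another platform.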

Configuration Folder

The configuration folder stores most of the user's OmegaT options and preferences.

Use Options → Access Configuration Folder to access it directly.

Location

The location of the default configuration folder varies by system (the ~ character represents your home folder):

Linux

~/.omegat

macOS

~/Library/Preferences/OmegaT

Windows

~\AppData\Roaming\OmegaT
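
The default locations above can be resolved from a script. The following is a minimal Python sketch that maps a platform identifier to the documented default path. The function name default_config_dir is hypothetical, and treating every platform other than Windows and macOS like Linux is an assumption of this sketch, not something the manual states.

```python
import sys
from pathlib import Path


def default_config_dir(platform=None, home=None):
    """Return the documented default configuration folder as a string.

    platform uses sys.platform values ("linux", "darwin", "win32").
    Assumption: any other platform is treated like Linux.
    """
    platform = sys.platform if platform is None else platform
    home = str(Path.home()) if home is None else home
    if platform.startswith("win"):
        return home + r"\AppData\Roaming\OmegaT"
    if platform == "darwin":
        return home + "/Library/Preferences/OmegaT"
    return home + "/.omegat"
```

Calling default_config_dir() with no arguments uses the current platform and home folder. It does not account for a configuration folder specified on the command line.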

Note

You can specify a configuration folder other than the default when you start OmegaT from the command line. See the Command line launch how-to for details.

Modified preferences are stored in the configuration folder used by the project. If you do not use the default configuration folder, all modifications made in the preferences will be stored in the specified configuration folder and will not appear when you resume work with the default configuration folder.

Default contents

omegat.prefs

This file includes a number of important user preferences.

Some preferences do not have an equivalent in the user interface. They must be modified manually.

Example 33. Do not automatically display the Source Files list

To prevent the Source Files list window from automatically opening when a project is loaded, find <project_files_show_on_load> and replace true with false:

<project_files_show_on_load>false</project_files_show_on_load>

Note

Only this preference currently requires manual modification.
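
The same manual edit can be scripted. The following is a minimal Python sketch, assuming omegat.prefs is a well-formed XML file in which the <project_files_show_on_load> element appears somewhere below the root; the function name is hypothetical. It is probably safest to back up the file and to run this while OmegaT is closed, since the application may rewrite its preferences. Note that the standard library parser used here does not preserve XML comments.

```python
import xml.etree.ElementTree as ET


def set_show_source_files_on_load(prefs_path, show):
    """Set <project_files_show_on_load> to "true" or "false".

    Returns True if the element was found and the file rewritten,
    False if the element is absent (the file is then left untouched).
    """
    tree = ET.parse(prefs_path)
    changed = False
    # iter() searches the whole tree, so the exact nesting does not matter.
    for element in tree.getroot().iter("project_files_show_on_load"):
        element.text = "true" if show else "false"
        changed = True
    if changed:
        tree.write(prefs_path, encoding="UTF-8", xml_declaration=True)
    return changed
```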

uiLayout.xml

This file describes the overall OmegaT layout.

logs/

This folder contains a number of log files. The most current is OmegaT.log.

These files record various internal state and program event messages generated while OmegaT is running. If OmegaT behaves erratically, include the current log file, or the relevant part of it, in your report.

Use Help → Log... to view the contents of the file.
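
When only the most recent part of the log is needed for a report, it can be extracted with a short script. The following is a minimal Python sketch; the function name tail_log and the default of 200 lines are hypothetical, and reading the file as UTF-8 is an assumption (undecodable bytes are replaced rather than raising an error).

```python
from collections import deque


def tail_log(log_path, max_lines=200):
    """Return the last max_lines lines of a log file as a list of strings."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines in memory.
        return list(deque(handle, maxlen=max_lines))
```

The returned lines keep their line endings, so "".join(tail_log(path)) gives text that can be pasted directly into a report.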

script/

If the applicable functions are used, this folder can contain up to three text files:

selection.txt

This file stores the currently selected text when Edit → Export Selection (C + S + C) is used. The text in the file is replaced each time this function is called.

source.txt

This file contains the original text from the current segment when the Export the segment to a text file preference is enabled. The text in the file is replaced each time a new segment is entered.

target.txt

This file contains the translated text from the current segment when the Export the segment to a text file preference is enabled. The text in the file is replaced each time a new segment is entered.

These three files provide a simple way to access some OmegaT content and process it with local programs such as shell scripts.
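
As an illustration of such processing, the following minimal Python sketch reads whichever of the three files exist in the script/ folder. The function name read_exported_segment is hypothetical, and reading the files as UTF-8 is an assumption.

```python
from pathlib import Path


def read_exported_segment(script_dir):
    """Return the exported texts as {"source": ..., "target": ..., "selection": ...}.

    A value is None when the corresponding file has not been created yet.
    """
    result = {}
    for name in ("source", "target", "selection"):
        path = Path(script_dir) / (name + ".txt")
        result[name] = path.read_text(encoding="utf-8") if path.is_file() else None
    return result
```

A local script could, for example, compare the lengths of the source and target values, or pass the selection to another program. Because the files are overwritten as you work, each call only reflects the content exported most recently.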

Additional contents

EditorShortcuts.properties

This parameter file contains customized editor shortcuts. See the Customization appendix for details.

MainMenuShortcuts.properties

This parameter file contains customized user interface shortcuts. See the Customization appendix for details.

filters.xml

This parameter file contains customized file filters. See the Global File Filters preferences for details.

finder.xml

This parameter file contains customized external search parameters. See the Global External Searches preferences for details.

omegat.autotext

This parameter file contains customized autotext parameters. See the Auto-Completion preferences for details.

repositories.properties

This file contains the login information for your team project repositories.

Warning

The file contents are not encrypted.

See the Set up a team project how-to for details.

segmentation.conf

This parameter file contains customized segmentation parameters. See the Global Segmentation Rules preferences for details.

plugins/

This folder is the standard location for manually installed OmegaT extension plugins. See the Plugins preferences for details.

It is also possible to install plugins in the application plugins/ folder.
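
Since plugins can live in either folder, it can be useful to check both when troubleshooting. The following is a minimal Python sketch that lists the files found in each folder; the function name list_plugin_jars is hypothetical, and it assumes that plugins are packaged as .jar files, possibly inside subfolders.

```python
from pathlib import Path


def list_plugin_jars(*plugin_dirs):
    """Map each folder to the sorted relative paths of the .jar files in it.

    Folders that do not exist are reported with an empty list.
    """
    found = {}
    for folder in map(Path, plugin_dirs):
        jars = folder.rglob("*.jar") if folder.is_dir() else []
        found[str(folder)] = sorted(p.relative_to(folder).as_posix() for p in jars)
    return found
```

A call might pass the plugins/ folder of the configuration folder and the plugins/ folder of the application folder, whose locations depend on your platform and installation.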

spelling/

This folder contains your spelling dictionaries. See the Spellchecker preferences for details.

Application Folder

The application folder contains the OmegaT.jar application and a number of other important files.

Note

The application folder location depends on your platform and on the way you have installed OmegaT. The recommended or default locations are the following:

Windows

C:\Program Files\OmegaT\

Linux

/opt/omegat/

macOS

/Applications/OmegaT.app/Contents/Java/

other platforms

/opt/omegat/

OmegaT-license.txt

OmegaT's distribution license. The GPL version 3.

OmegaT.jar

OmegaT's executable. Used when launching OmegaT from the command line. See the Command line launch chapter for details.
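
A minimal Python sketch of assembling such a launch command follows. It relies on the standard java -jar invocation for executable JAR files. The function name launch_command is hypothetical, and the --config-dir option used to select a non-default configuration folder is an assumption here; the Command line launch chapter lists the options actually supported.

```python
def launch_command(jar_path, config_dir=None):
    """Build the argument list for starting OmegaT from OmegaT.jar.

    Assumption: a non-default configuration folder is passed with
    --config-dir=<path>; check the Command line launch chapter.
    """
    command = ["java", "-jar", jar_path]
    if config_dir is not None:
        command.append(f"--config-dir={config_dir}")
    return command
```

The resulting list can be handed to subprocess.run to start the application, for example with the jar located in the Linux default application folder, /opt/omegat/OmegaT.jar.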

changes.txt

The list of improvements and bug fixes. Check it if you need information on OmegaT's evolution.

doc-license.txt

The documentation distribution license. The GPL version 3.

index.html

The index to the multilingual user manual.

join.html

A link to the support page.

readme.txt

Simple installation and running instructions.

docs/

The documentation folder.

contributors.txt

The list of individual contributors.

en/

Each translated user manual comes in a different language folder.

index.html

The index to the multilingual user manual.

libraries.txt

The list of libraries used by OmegaT.

images/

The folder where images used by OmegaT are stored.

lib/

The folder where the libraries used by OmegaT are stored.

plugins/

The folder where you can install external plugins. The preferred location for manually installed plugins is the plugins/ folder located in the configuration folder. See the Plugins preferences for details.

scripts/

The folder where distributed scripts are located. See the Scripting window for details.