2011-12-08

How to achieve 100% updating of the TM

1) Segmentation mismatching:
Every CAT tool offers settings for segmentation rules. These rules define how the source text is divided into the segments that are stored in the translation memory. The rules stay in effect even if the translator changes the segmentation of individual sentences during translation by splitting or merging segments. As a result, the TM is updated with the split or merged segments rather than with the segments that were originally counted. A subsequent analysis of the source files against the updated TM may therefore fail to recognise the originally counted segments as 100% matches.
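The effect can be illustrated with a small Python sketch. It is a toy model, not the behaviour of any particular CAT tool: the `segment` function, the sample sentences, and the dictionary standing in for the TM are all assumptions made for illustration, and only exact matches are considered.

```python
import re

def segment(text):
    """Naive rule: a segment ends after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

source = "Open the cover. Remove the filter. Clean it with water."

# Segments as the initial analysis counts them.
counted = segment(source)

# During translation the first two segments were merged, so the TM
# stores the merged unit instead of the two counted segments.
tm = {
    "Open the cover. Remove the filter.": "<translation of merged segment>",
    "Clean it with water.": "<translation>",
}

# Re-analysis of the same source against the updated TM (exact lookup only).
report = {seg: ("100%" if seg in tm else "no exact match") for seg in counted}

for seg, result in report.items():
    print(f"{result:>15}  {seg}")
```

Only the untouched third segment comes back as a 100% match. The two merged segments are in the TM, but not in the form the analysis looks for.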

2) Incomplete segmentation:
A similar issue occurs if sentences are not segmented at all, for example because their contents were overwritten manually or were already included in previous segments. Such sentences are not included in the TM update and are therefore recognised as No Matches in subsequent analyses.

Solution:
– Do not change the segmentation of the source text manually during translation.
– Do not split up or merge segments.

– Do not overwrite parts of the source text manually without segmenting them first.
– Never leave text unsegmented. If you already included its content in another segment, segment the superfluous text nonetheless and fill it with a plain space instead of a translation.
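A simple check before delivery is to compare the originally counted segments with the updated TM. The Python sketch below assumes that both can be exported as plain text. The helper `unsegmented`, the sample sentences, and the dictionary used as a TM are hypothetical and do not correspond to any tool's API.

```python
def unsegmented(counted_segments, tm):
    """Return the counted source segments that have no entry in the TM."""
    return [seg for seg in counted_segments if seg not in tm]

counted_segments = [
    "Press the power button.",
    "Press the power button to switch on the device.",
]

# The translator covered both sentences in a single translation and left
# the first sentence unsegmented, so it never reached the TM.
tm = {"Press the power button to switch on the device.": "<translation>"}
missing = unsegmented(counted_segments, tm)
print("Missing from TM:", missing)

# Following the advice above: segment the superfluous text anyway and
# confirm it with a plain space as its target.
tm["Press the power button."] = " "
still_missing = unsegmented(counted_segments, tm)
print("Missing after fix:", still_missing)
```

Once the superfluous sentence has its own entry, even one containing only a space, every counted segment is present in the TM.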

1 comment:

  1. The first problem (segmentation mismatching) depends very much on which translation environment tool is used. With memoQ, for example, the option for "TM-driven segmentation" can be selected, which splits and joins up to three elements in the TM to improve matches. This gives better leverage and enables you to "find" the 100% matches after segments have been split or joined.
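The general idea of joining neighbouring segments before the lookup can be sketched in Python. This is a toy illustration only, not memoQ's actual algorithm: the function `find_joined_match`, the limit of three, the sample data, and the exact-match lookup are assumptions.

```python
def find_joined_match(segments, start, tm, max_join=3):
    """Look up the segment at `start` alone, then joined with up to
    max_join - 1 following segments. Return (count, text) of the first
    exact TM hit, or (0, None) if nothing matches."""
    for n in range(1, max_join + 1):
        if start + n > len(segments):
            break
        candidate = " ".join(segments[start:start + n])
        if candidate in tm:
            return n, candidate
    return 0, None

segments = ["Open the cover.", "Remove the filter.", "Clean it with water."]

# The TM holds the first two sentences as one merged unit.
tm = {
    "Open the cover. Remove the filter.": "<translation of merged segment>",
    "Clean it with water.": "<translation>",
}

merged_hit = find_joined_match(segments, 0, tm)
single_hit = find_joined_match(segments, 2, tm)
no_hit = find_joined_match(segments, 1, tm)
print(merged_hit, single_hit, no_hit)
```

Starting at the first segment, the lookup finds the merged TM entry by joining two source segments, which a plain segment-by-segment lookup would miss.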
