2011-08-12

How to match analysis results

Some of the most common misunderstandings during handoff are related to the volume of translatable text. Depending on the type and version of the CAT tool applied and the workflow process itself, analysis results can vary significantly.
Choosing the appropriate analysis settings helps to achieve the desired results.


1) Matching segmentation rules:
Every CAT tool offers settings for segmentation rules. These rules define the length and structure of the text units that are identified and treated as one segment in the translation memory. For example, SDL Trados uses sentence-based segmentation by default (a full stop marks the end of a segment), whereas Across uses paragraph-based segmentation by default (a paragraph mark marks the end of a segment). As a result, Trados TM segments are much shorter than Across tank segments. Consequently, a default Across analysis against a Trados TM imported into an Across tank will show much lower match values than an analysis adjusted to Trados segmentation rules.
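
The effect of mismatched segmentation can be illustrated with a minimal Python sketch. It is not how Trados or Across work internally: the splitting rules are simplified, only exact matches are counted, and the TM entries, the sample document and all function names are invented for illustration.

```python
import re

def split_sentences(text):
    """Sentence-based segmentation: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def split_paragraphs(text):
    """Paragraph-based segmentation: break only at line ends (paragraph marks)."""
    return [p.strip() for p in text.split("\n") if p.strip()]

def exact_match_rate(segments, tm):
    """Share of segments that have an identical entry in the TM."""
    if not segments:
        return 0.0
    return sum(1 for s in segments if s in tm) / len(segments)

# Hypothetical TM that was built with sentence-based segmentation.
tm = {"Open the cover.", "Remove the filter.", "Close the cover."}

# Hypothetical new document: two paragraphs with two sentences each.
document = "Open the cover. Remove the filter.\nClose the cover. Clean the housing."

sentence_rate = exact_match_rate(split_sentences(document), tm)
paragraph_rate = exact_match_rate(split_paragraphs(document), tm)

print(f"sentence-based:  {sentence_rate:.0%}")   # 3 of 4 segments found in the TM
print(f"paragraph-based: {paragraph_rate:.0%}")  # 0 of 2 segments found in the TM
```

Under these assumptions, the same document analysed against the same TM yields 75% exact matches with sentence-based segmentation and none with paragraph-based segmentation, because no whole paragraph is stored in the TM as a single entry.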


Solution: To bring your analyses closer to your provider's word counts, adjust the segmentation rules of your local CAT tool at the very beginning of a new project so that they match the rules used for the provided TM.


2) Setting penalties:
Every CAT tool allows you to apply penalties for deviating TM entries. These penalties have a significant impact on the achievable leverage. Generally, the more penalties are set, the fewer matches will be achieved. Conversely, setting no penalties at all returns maximum leverage. If all penalties are set to 0%, though, match recognition becomes imprecise: even segments that actually differ in formatting, context or attributes are identified as 100% matches.
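
As a rough illustration of how penalties shift match values, the following Python sketch deducts a configured number of percentage points for each kind of deviation. The scoring model, the penalty values and all names are assumptions made for this example; real CAT tools calculate match values in their own, tool-specific ways.

```python
def penalized_score(base_score, differences, penalties):
    """Deduct the configured penalty (in percentage points) for each deviation found."""
    deduction = sum(penalties.get(d, 0) for d in differences)
    return max(0, base_score - deduction)

# Hypothetical TM hit: the text is identical, but the formatting differs.
base_score = 100
differences = ["formatting"]

# Two hypothetical penalty profiles (values invented for illustration).
no_penalties = {"formatting": 0, "context": 0, "attributes": 0}
with_penalties = {"formatting": 1, "context": 2, "attributes": 2}

score_without = penalized_score(base_score, differences, no_penalties)
score_with = penalized_score(base_score, differences, with_penalties)

print(score_without)  # 100: counted as a 100% match despite the formatting difference
print(score_with)     # 99: drops into the fuzzy-match range
```

In this sketch, the same TM hit is reported as a 100% match when all penalties are 0 and as a 99% fuzzy match once a formatting penalty applies, which is why two parties with different penalty settings can arrive at different analysis results for the same files.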


Solution: To tailor analyses to the required translation task, agree on the penalty settings at the very beginning of a new project.