When should you use machine translation?

Over the past few years, automated translation, also known as machine translation (MT), has become a viable, cost-efficient enhancement to translation workflows. However, human translators remain essential to the translation process.

What tools do translators already use?

Most professional linguists use computer-assisted translation (CAT) tools to manage translation projects. One of the benefits of CAT tools is their ability to create and maintain client-specific translation memories (TMs). Each project is saved in a database of matching source/target segments. When a client submits a new translation request, the content is compared to the content in the TM that was saved from prior jobs. The TM retrieves and auto-populates target text for segments whose translations are already on file.

The value of recycled translations is measured in match percentages. If a sentence was previously translated in full, the match is 100%. For segments with a 100% match against the TM, the translator’s workload is reduced to checking and proofreading in context. Matches between 75% and 99%, called “fuzzy matches,” alert the translator that small but potentially significant changes were made. Segments without a usable match (lower than 75%) need to be translated in full, and these translations become part of the TM for future projects. Long-term clients who use a lot of boilerplate language benefit most from their TMs, which improve consistency and reduce translation costs.

How does Neural Machine Translation work?

Neural Machine Translation (NMT) works alongside CAT tools, especially for content with less than a 75% TM match. Whereas a TM auto-populates content from previously translated texts, NMT uses deep learning methods to analyze a corpus and uses those insights to predict and propose translations for new source content. For non-specialized content, generic NMT engines often produce satisfactory output. For specialized subjects, however, the engines need to be trained for each language pair. This “training” (via a deep learning process) requires a significant volume of pre-existing approved translations, which is why NMT is not equally suited for all language pairs. During training, professional translators carefully review and edit the automated output and feed the corrections back into the system to hone its accuracy. This process requires an investment of time and resources.
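As a rough sketch of how NMT can sit alongside a TM in a pre-translation step, the Python example below routes each segment by its match percentage: TM hits at or above 75% reuse the stored translation, and everything else gets a machine-translated draft for post-editing. All names are hypothetical, and the mt_engine argument is a placeholder for whatever NMT service a workflow actually uses; no real engine API is implied.

```python
from typing import Callable, NamedTuple, Optional

FUZZY_THRESHOLD = 75  # below this, the TM offers no usable match (per the text)


class Draft(NamedTuple):
    source: str
    draft: str
    origin: str      # "tm" or "mt"
    human_task: str  # what the translator still has to do


def pretranslate(
    source: str,
    tm_pct: int,
    tm_target: Optional[str],
    mt_engine: Callable[[str], str],
) -> Draft:
    """Pick a draft for one segment: TM text if the match is usable, else MT."""
    if tm_target is not None and tm_pct >= FUZZY_THRESHOLD:
        task = "proofread in context" if tm_pct == 100 else "edit fuzzy match"
        return Draft(source, tm_target, "tm", task)
    # No usable TM match: propose a machine translation for human post-editing.
    return Draft(source, mt_engine(source), "mt", "post-edit")


def fake_engine(text: str) -> str:
    """Stand-in for a real NMT call; it only tags the text."""
    return f"[MT] {text}"


if __name__ == "__main__":
    print(pretranslate("Hello", 100, "Hallo", fake_engine))
    print(pretranslate("Good morning", 82, "Guten Tag", fake_engine))
    print(pretranslate("See the release notes", 40, None, fake_engine))
```

Every branch ends with a human task, which reflects the point that machine output is a draft rather than a finished translation.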

Even after the system has been trained, there is no guarantee of accuracy and fluency if there is no “human in the loop.” Post-editing of NMT output by human translators is essential to ensure accuracy and fluency.

What impacts NMT quality?

The accuracy and fluency of machine translation output depend on several factors. Beyond the quality of the training corpus, a key factor is the quality of the source content. Unambiguous, plainly written source texts are best suited for automated translation. Source materials created using style guides, glossaries, and controlled language result in better translations.

What is automated translation good for?

Non-specialized NMT engines produce acceptable translations for general content, but neural machine translation will produce the best results when trained to operate for a specific client within a particular industry, working from source material that was created using glossaries and controlled language. Overall, it is better suited for unvarnished technical content.

According to research published by Intent.to in July 2020, most NMT engines do well with English, Spanish, French, Chinese and Russian, but not as well with Finnish, Japanese and Russian. In terms of content, NMT quality tends to be higher for IT-related content with the exception of UI strings, legal services, and telecommunications. NMT translations for professional and business services receive lower quality rankings.

In order to decide whether NMT makes sense for any particular purpose, take these factors into account:

Is the project high-volume with a fast turnaround? If you need millions of words translated in a short period of time, NMT will prove a valuable aid.

Is the material repetitive and frequently updated? Some technical manuals, product descriptions, and software documentation may be candidates for NMT; one-off publications may not be.

Is the content durable or is it ephemeral? Customer feedback, emails, knowledge bases, FAQs, and reviews see a lot of “churn”: they are constantly being updated or replaced. In addition, the quality expectations for user-generated content are much lower than for professionally published content. The quality of the translation also matters less if the translations are to be used for in-house review or research.

For these types of content, the ROI of machine translation may be higher than that of human translation.

What are the drawbacks of NMT?

No matter how good it gets, automatically translated content can’t match professional human translation in its ability to persuade, reassure, or delight the reader. Marketing materials, websites, and any other customer-facing assets require professionally trained, native-language translators with subject matter knowledge. NMT might be used for technical instructions, but when absolute accuracy is crucial, these also require post-editing. If the content is complex and the translation engine is untrained, post-editing can be as labor-intensive (and expensive) as straight translation. What took only 15 minutes to “translate” could take an additional 48 hours of labor to correct. Automated translation has a place between pre-translation project preparation and post-editing by professional linguists.

What about SEO penalties for auto-generated content?

Google confirms that unedited machine translation, even when generated by its own online tool, would be considered auto-generated content and would reduce site ranking accordingly. However, there are many tools that connect with Google Translate and populate the site with optimizable content. It would seem that Google considers this unedited, auto-generated content, but it’s unclear how much of a penalty it would bring. In any case, the best way to rank on foreign-language search engines is by creating a foreign-language domain, subdomain, or subdirectory using human translation or carefully post-edited machine-translated content.

Should I use Neural Machine Translation?

If you are interested in using machine translation, you should do so in consultation with a language partner who understands the effort it takes to customize an NMT engine for a given client and industry. In some ways, neural machine translation has much in common with the cost-saving translation-management technologies we’ve been using for years.

As the demand for translated content increases across the globe, NMT is an exciting development. It significantly increases how much and how fast content can be translated. However, it requires a “human in the loop.” Human translators and editors are necessary to create translations that meet the expectations of the intended audiences. No automated neural translation engine can replace this human dimension of the translation process.