Machine translation at the LDS Church

"For it shall come to pass in that day, that every man shall hear the fulness of the gospel in his own tongue, and in his own language..." - Doctrine and Covenants Section 90, Verse 11

It may come as no surprise that The Church of Jesus Christ of Latter-day Saints maintains an impressive translation program in an effort to facilitate the fulfillment of this prophecy. The average church member may or may not know that the Church's distinguishing scripture, The Book of Mormon, has been translated into 107 languages (and counting). They may or may not be aware that the Church produced 100 million words of translated content in 2013. And they may or may not be aware that the Church actively publishes in over 100 languages. Here's a look at one technology the Church leverages to assist in this monumental task.

The term "machine translation" (MT) usually evokes one of several responses--enthusiasm, skepticism, or scoffing being the most common. If you are unfamiliar with the concept of MT, think Google Translate. Now you understand. Surely you've seen videos like this one (which are really awesome, by the way).

If so, you might be wondering whether MT has any practical application. Indeed, it does. The following information comes from a presentation given by Steve Richardson on February 25, 2014, at the Church Office Building in Salt Lake City, UT, as part of a bi-monthly meeting of the Silicon Slopes Localization group.

Why does the Church invest in machine translation?
Demand for translation of church materials is rapidly increasing as membership continues to grow in non-English speaking locations around the world. The Church has a goal for MT to reduce the time required to translate by half. This, of course, helps reduce cost and increase total capacity. Translators in several languages have successfully achieved this goal.

What languages currently use machine translation?
As of this presentation, the Church supports MT for 19 languages. Not all 19 are in production, but all 19 have MT engines trained on church data that produce results accurate enough to be useful to translators. About 9 are currently used in production, depending on content type. So far, MT is proving most effective for Romance languages, while Germanic, Slavic, Asian, and morphologically rich languages present a more difficult challenge.

What content currently leverages machine translation?
The main Church website is published in 10 languages, including English. MT is applied in each target language. Other content, including Family History content and letters and notices, also uses MT.

What process is followed to generate a finalized translation?
Unlike in traditional translation, linguists involved in this process are not asked to produce translations from scratch. Rather, they are shown a translation generated by MT and asked to correct, if needed, grammar mistakes and any major errors. This is called post editing. Translators receive training to help them learn this new skill. If they can't decide whether to use the MT within 3-5 seconds, they are encouraged to delete it and create an original human translation. On the whole, post editing reduces cost for two reasons. Linguists produce a higher volume in a shorter period of time, and there are fewer review steps in the overall process.
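To make the post editing decision concrete, here is a minimal sketch of the rule described above for a single segment. It is only an illustration, not the Church's actual tooling: the `Segment` class, the `finalize` function, and the `keep_mt` flag are hypothetical names, and the two callbacks stand in for work a human translator does.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Segment:
    source: str      # original English text
    mt_output: str   # draft produced by the MT engine


def finalize(
    segment: Segment,
    keep_mt: Optional[bool],
    post_edit: Callable[[str], str],
    translate_from_scratch: Callable[[str], str],
) -> str:
    """Return the final translation for one segment.

    keep_mt is True when the translator quickly judges the MT draft usable,
    False when they reject it, and None when they could not decide within
    the 3-5 second window (handled the same way as a rejection).
    """
    if keep_mt:
        # Correct grammar mistakes and major errors in the MT draft.
        return post_edit(segment.mt_output)
    # Discard the draft and produce an original human translation.
    return translate_from_scratch(segment.source)


# Toy stand-ins for the human steps, for illustration only.
seg = Segment(source="Hello, world", mt_output="Hola mundo")
edited = finalize(seg, True, lambda mt: mt.replace("Hola mundo", "Hola, mundo"), lambda src: "")
fresh = finalize(seg, None, lambda mt: mt, lambda src: "Hola a todos")
print(edited)
print(fresh)
```

The point of treating "undecided" like "rejected" is that lingering over a doubtful draft costs more time than simply translating the segment fresh.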

How much volume has been produced using this process?
To date, about 4 million words have been produced using MT and post editing.

Machine translation represents only one facet of technology that helps accomplish the work of translation at the Church. I've shared just a brief review of my notes from a presentation that was chock full of more juicy details. If you want to learn more about this topic, or if you have an interest in helping the translation effort of the LDS Church, please leave a comment.
