Thoughts from Localization World Berlin - Machine Translation

July 5, 2009
Kevin Fountoukidis

We were at Localization World in Berlin last week and spent some time learning about the latest trends in the industry. I have a couple of topics I’d like to blog about after this event, and I’ll start with the latest news from the world of machine translation (MT). Many of you might be surprised to hear it, but MT has really improved over the last couple of years. It’s actually possible to include some MT in translation workflows. Of course it’s not possible to simply replace human translators with machine translation, but with certain types of texts and certain clients it is possible to use machine translation and produce quality comparable to human translation. The benefits are obviously lower costs and quicker turnaround times.

In my opinion there is a real opportunity for us to use MT at Argos. To begin with, there are two different types of MT engines: statistical and rule-based. The basic concept of statistical translation is that you feed the engine huge volumes of text that has already been translated, and it learns from the statistics. With enough text, the engine can pick up the patterns of a language by seeing how many different texts have been translated. This is how Google Translate works, and it’s very impressive. It works quite well with basic content and with content that comes in enormous volumes. Of course, Google has more volume than anyone. The second type of MT works with a rule-based engine. Here the rules of the language are defined up front, so rather than learning from statistical probability, the engine produces sentences on the basis of preprogrammed rules. I won’t go into a huge analysis of the differences here. It’s enough to say that with certain content statistical works better (lots of volume), and with other content rule-based performs better (lots of glossaries, highly specific content, lots of TMs).
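
To make the difference a bit more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions: the four-sentence English-German corpus, the hand-written lexicon and all the function names are invented for this example, and real engines are vastly more sophisticated. The statistical side simply counts which target words show up alongside each source word; the rule-based side looks words up in a lexicon somebody wrote by hand.

```python
from collections import Counter, defaultdict

# Hypothetical toy parallel corpus (English, German), invented for illustration.
PARALLEL_CORPUS = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
    ("a book", "ein buch"),
]

# Hypothetical hand-written lexicon standing in for a rule-based engine.
# A real rule-based system would also encode grammar and word-order rules.
RULE_LEXICON = {"the": "das", "a": "ein", "house": "haus", "book": "buch"}


def train_statistical(corpus):
    """Count how often each source word co-occurs with each target word."""
    cooccurrence = defaultdict(Counter)
    for source, target in corpus:
        for s_word in source.split():
            for t_word in target.split():
                cooccurrence[s_word][t_word] += 1
    return cooccurrence


def translate_statistical(sentence, cooccurrence):
    """Pick, word by word, the target word seen most often with the source word."""
    output = []
    for word in sentence.split():
        if word in cooccurrence:
            output.append(cooccurrence[word].most_common(1)[0][0])
        else:
            output.append(word)  # unseen word: leave it untranslated
    return " ".join(output)


def translate_rule_based(sentence, lexicon=RULE_LEXICON):
    """Translate word by word using only the predefined lexicon."""
    return " ".join(lexicon.get(word, word) for word in sentence.split())


if __name__ == "__main__":
    model = train_statistical(PARALLEL_CORPUS)
    print(translate_statistical("a house", model))
    print(translate_rule_based("the book"))
```

The point of the sketch is that the statistical version never sees a dictionary: it only gets the right answer because the corpus contains enough examples, which is why volume matters so much for that approach.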

I would summarize the whole MT thing like this: you need to choose the tool depending on the content you have. Some tools will be better for some content, some tools will be better for some languages, and in many cases it won’t make sense to use MT at all. Since we are interested in exploring this at Argos, we agreed that Systran is probably the best vendor for us to test at the beginning. Theirs is basically a rule-based system, but the latest version incorporates statistical methods as well, so it’s sort of a hybrid now. Systran also has the most experience with Russian, because the company was actually founded to develop MT for the US government during the Cold War. I think Russian would be an excellent language for us to test an MT workflow with, because we have significant volumes into this language.

In addition, MT is a great sales tool. I think I mentioned it to every client I spoke to at Loc World. One of our biggest clients told us all about their success with it. They are able to do FIGS (French, Italian, German, Spanish) documentation at the same quality or higher with MT, and they have managed to cut costs and turnaround time significantly. It’s a huge success. This is real. This client is very quality oriented: you need to score at least an 8 out of 10 for them to accept the translation. Apparently the MT is scoring better than human translations for FIGS. This is because the client has plenty of existing TMs and very well-defined terminology. In addition, they work on their source English so that it’s simplified and easier for the MT engine to translate.

I just want to make one thing clear: MT will never replace human translators. I think in the best-case scenario we’ll only be able to implement it with certain clients, for certain languages, in certain situations. Even if we do manage to implement it, it won’t work automatically. We will need to do plenty of post-editing. You can achieve comparable quality after post-editing, but the real benefit is in the speed of the work. A translator who post-edits machine translation can process 8000-9000 words per day, as opposed to the normal 2000 words per day for most traditional translators. That is a significant increase.
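
To put rough numbers on that, here is a small back-of-the-envelope calculation in Python. The daily word counts are the ones quoted above; the 36,000-word job and the function names are hypothetical, picked only to make the arithmetic easy to follow.

```python
import math

# Daily throughput figures quoted in the post.
TRADITIONAL_WORDS_PER_DAY = 2000
POST_EDIT_WORDS_PER_DAY_LOW = 8000
POST_EDIT_WORDS_PER_DAY_HIGH = 9000


def speedup(post_edit_words, traditional_words=TRADITIONAL_WORDS_PER_DAY):
    """How many times faster post-editing is than traditional translation."""
    return post_edit_words / traditional_words


def days_needed(job_words, words_per_day):
    """Whole working days one translator needs for a job of the given size."""
    return math.ceil(job_words / words_per_day)


if __name__ == "__main__":
    job_words = 36000  # hypothetical job size, for illustration only
    print("Speedup (low): ", speedup(POST_EDIT_WORDS_PER_DAY_LOW))
    print("Speedup (high):", speedup(POST_EDIT_WORDS_PER_DAY_HIGH))
    print("Traditional days:", days_needed(job_words, TRADITIONAL_WORDS_PER_DAY))
    print("Post-edit days (low): ", days_needed(job_words, POST_EDIT_WORDS_PER_DAY_LOW))
    print("Post-edit days (high):", days_needed(job_words, POST_EDIT_WORDS_PER_DAY_HIGH))
```

In other words, the quoted figures work out to roughly a 4x to 4.5x increase in output per translator, before accounting for anything else in the workflow.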

I do find the whole subject interesting, and I think it’s important that Argos is forward thinking when it comes to adopting new technologies. We need to embrace the future, and right now I see lots of opportunities developing in this area. It’s exciting!
