Machine Translation Revisited

by Jost Zetzsche

A while back I spent a week with some "real geeks" at the AMTA (Association for Machine Translation in the Americas) conference in Boston. If I'm not mistaken, I was the only translator among the 150 or so participants. That's probably not too surprising. After all, there is a vast gap between the translator community and the machine translation community.

Don't worry, I have not "sold out" to machine translation, but I would like to propose (again) a somewhat different approach to how we view ourselves as translators and how we view our products (our translation).


My dilemma as a translator, which I think I share with a lot of colleagues, is that I value my work (and expect others to value it as well). In fact, I value it so much that no matter what I translate, be it a marketing text, legal disclaimers, news releases, or user manuals, I try to apply the same kind of excellence. In fact, I even frown at emails from clients that tell me to "really spend every effort" to make a certain translation impeccable because it is part of a bid or some other high-level job. I don't like to be told that because it obviously implies the assumption that I'm not always working on that level.

So why is this a dilemma? Well, first and foremost it's a good thing and really should not be changed. However, what it also does is to somehow muddy the waters as to what purposes different texts have, what audience they are intended for, and what the respective quality requirements are.

Marketing content and literature lose their very purpose and meaning if they are not translated in a way that impacts the user (the reader) far beyond the actual information. In fact, the language in these kinds of text has to be so powerful that it manipulates the user beyond that which he can control (be it through emotions, value propositions, or shopping behavior).

Compare that to a legal text. In this case, information in all its detailed nuances is of the utmost importance. Readability is of secondary concern (in fact, it often seems to me that the lack of readability is a requirement in the source texts that I get to translate), but ambiguities have to be avoided.

For user guides, information is also very important, but readability or stylistic concerns differ, depending on the user type. If it's for engineers or developers, there is less concern about style than there would be if it's for an end user. After all, any communication with end users also carries some marketing message that would be thwarted by terrible writing.

And if there are different kinds of expectations among human users, there are also computers. For instance, most of the vast amount of translated intelligence material is processed by computers. Could you imagine yourself as a translator in that kind of scenario, translating something for no one but a computer to ever "read"?

Quite frankly, it makes no sense to have materials translated by highly qualified human translators when it can be done by computers. But that's the essence of the question: Can it be done by computers? The answer is that often it cannot, but sometimes it can. In a unique project, Microsoft has machine translated tens of thousands of knowledge-base articles into several languages. For an example, go to http://support.microsoft.com/kb/281925/en-us and then click on one of the translation links on the right-hand side. You will see a machine-translated version of the article in the respective language that is preceded by a disclaimer informing the user of possible pitfalls of the translation. The translation is not pretty. But it communicates (most of the time) what otherwise would not have been communicated at all.

So what we need is to develop usage criteria for translation. For the majority of usage criteria, a human translation is of utmost importance. For others it may be computerized translation with human post-editing, and for still others it may be machine translation only. And would this really be desirable? I absolutely think so. I don't want to waste my talents on stuff that a computer can do. And I also know that computers will not take away my job security. They may at some point take away certain kinds of jobs. But there is plenty of interesting material that currently is not being translated because it would be too expensive. That's what I would like to do.

Many people came up to me during the conference and asked what could be done to make machine translation more palatable for translators in appropriate scenarios. I hope I didn't sound too esoteric when I gave them this answer: In a speech of the Dalai Lama that I happened to hear several years ago, he described the meaning of the many spiritual beings in Mahayana, and in particular in Tibetan Buddhism. He said that the very essence of Buddhism is the nothingness. There is nothing. Not even spiritual beings. But how could someone go from a very real perception of the flesh and blood that we live in to the understanding of nothingness? People need stepping stones to gain that understanding. Buddhism's spiritual beings are those stepping stones to bridge the gap between flesh and blood and nothingness.

In one sense this is like our appreciation of technology. How in the world could we ever even think about using machine-translated texts if we don't even appreciate its "lowest form," translation memory? (Of course, this facetious little parable does not make any sense when it comes to the goal: in no way would I want to equate machine translation with the Buddhist Nirvana . . . ).

A while back, I quoted Jaap van der Meer from an article in MultiLingual Computing (http://www.multilingual.com). For those who missed it then, here it is again:

"Disdain on the part of professional translators for the hilarious and stupid MT mistakes gave birth to a new variant of MT called translation memory (TM). TM started off as a lower-level feature of commercial MT systems (...). But the success of TM came with dedicated products such as IBM TM/2 and Trados. The marketing message was tuned in to what the professional translation industry wanted to hear: 'Forget about MT; it doesn't work well. Instead, use our TM product because it leaves you in full control of the process.'

"The message worked well: within a period of 10 to 15 years, TM products have found their way to the workstations of more than 50,000 translators in the world. But the message also caused a 'cognitive disorder' in the translation industry, namely that TM is good and MT is evil, foregoing the fact that TM is just a new variant of MT (...). The damage is done, however, and it will take years to convince the community of business translators that post-editing fuzzy matches from TM databases is, in fact, not different from post-editing fuzzy matches from any other MT system."