Thursday, February 9, 2017

Automatic translation

In a famous summer course that took place at Dartmouth College in 1956, the term artificial intelligence was applied for the first time to all those computer programs that perform tasks traditionally considered exclusively human, such as playing chess and translating from one human language to another. Those attending the course, led by John McCarthy, felt optimistic enough to predict that in ten years those two problems would have been completely solved. Thus, they hoped that by 1966 there would be programs capable of defeating the world chess champion, and others that would translate perfectly between any two human languages.
In March 1961, my uncle, Felipe F. Moreno, then chief of Spanish translators at the headquarters of the International Telecommunication Union (ITU) in Geneva, wrote an article in the ITU magazine on machine translation and how it could affect human translators, which shows that the question was a hot topic at the time. Shortly afterwards, when the deadline announced by the artificial intelligence forerunners was reached with both problems far from solved, it became obvious that they had been overly optimistic.
We know that the goal of writing a program that would defeat the world chess champion was met in 1997, when Deep Blue defeated Garry Kasparov, the champion in that year. The other problem, machine translation, was even more difficult. At the end of the sixties the following anecdote was well-known in the computer-programming world:

To test a pair of automatic translation programs, one from English to Russian and the other from Russian to English, the first program was given the following text from the Gospel (Matthew 26:41): "The spirit is willing, but the flesh is weak." The Russian output was then passed as input to the Russian-English translator, and the result was: "The vodka is good, but the meat is spoiled."
The anecdote is probably apocryphal, but it expresses quite well the problem of machine translation: human languages are ambiguous, which makes translation very difficult. Ambiguity can be syntactic, as in the following examples:
John saw the man on the mountain with a telescope. Who is on the mountain? John, the man, or both? Who has the telescope? John, the man, or the mountain?
Time flies like an arrow. This sentence has three other possible syntactic interpretations, in addition to the usual one. One of them could also be expressed thus: the flies of time are fond of an arrow.
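The count of readings can be checked mechanically. What follows is only a minimal sketch, not a real parser: the lexicon and the grammar rules are assumptions made up for this illustration, chosen so that each word can take every role it plays in the readings above. The function fills a chart in the style of the CYK algorithm and counts how many distinct parse trees cover the whole sentence.

```python
from collections import defaultdict

# Toy lexicon (an assumption of this sketch): every category each word may take.
LEXICON = {
    "time": {"N", "NP", "V"},
    "flies": {"N", "NP", "V"},
    "like": {"V", "P"},
    "an": {"Det"},
    "arrow": {"N"},
}

# Toy binary rules (also assumed): (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),   # declarative sentence
    ("S", "VP", "PP"),   # imperative followed by a comparison
    ("S", "V", "NP"),    # plain imperative
    ("NP", "N", "N"),    # compound noun, e.g. "time flies"
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("VP", "V", "NP"),
    ("VP", "V", "PP"),
    ("PP", "P", "NP"),
]


def count_parses(sentence, start="S"):
    """Count the parse trees of `sentence` rooted in `start` (CYK-style chart)."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    # chart[(i, j, symbol)] = number of trees for `symbol` spanning words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for category in LEXICON[word]:
            chart[(i, i + 1, category)] += 1
    # Build longer spans from shorter ones, trying every split point.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right in RULES:
                    chart[(i, j, parent)] += chart[(i, k, left)] * chart[(k, j, right)]
    return chart[(0, n, start)]


print(count_parses("Time flies like an arrow."))  # prints 4
```

With this toy grammar the sentence gets four parses: the usual reading, the "flies of time" reading, and two imperative readings in which "time" is a verb. A program has no syntactic reason to prefer one of them; choosing the right one requires knowledge about the world.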
Ambiguity can also be semantic, as in these examples:
We can meet at the bank. (A building, or the bank of a river?)
Every man loves a woman. Does every man love the same woman, or does each man love a different one?
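The two readings of the last example correspond to two different orders of the quantifiers "every" and "a". Here is a minimal sketch that evaluates both readings over a tiny invented world; the names and the `loves` relation are hypothetical, chosen only so that the two readings disagree.

```python
def each_man_loves_some_woman(men, women, loves):
    # Reading 1: for every man there is a woman (possibly a different one) he loves.
    return all(any((m, w) in loves for w in women) for m in men)


def one_woman_is_loved_by_every_man(men, women, loves):
    # Reading 2: there is one particular woman whom every man loves.
    return any(all((m, w) in loves for m in men) for w in women)


# Hypothetical world: each man loves a different woman.
men = {"Adam", "Bob"}
women = {"Carol", "Dana"}
loves = {("Adam", "Carol"), ("Bob", "Dana")}

print(each_man_loves_some_woman(men, women, loves))        # prints True
print(one_woman_is_loved_by_every_man(men, women, loves))  # prints False
```

The same English sentence is true under one reading and false under the other, so a translator, human or automatic, has to decide which one was meant whenever the target language forces a choice.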
Another problem with these ambiguities is that they usually differ from one language to another, which makes automatic translation difficult, as the actual meaning of many phrases depends on a very broad context, including general knowledge about the world that computer programs do not have. This is the main reason why research on automatic translation took a long time to produce results.
In the late 1970s, the Japanese government decided to embark on a project that would put their country at the forefront of computing research. Apparently they did not want the Japanese to be considered as efficient copiers of the technology developed by other countries, so for once they wanted to be copied. So they started the fifth generation project, with the following aims:
  • Computer hardware adapted to make it easier to build artificial intelligence applications.
  • Computer software capable of interacting with users in their own language (English or Japanese) and translating correctly between those two languages.
The fifth generation project was meant to last ten years and ended in the early 1990s in complete failure. The supposed fifth generation computers that were built turned out to be ordinary personal computers, equipped with firmware that allowed them to understand the Prolog language. This was not new, as the first personal computers had firmware that enabled them to understand the BASIC language. The major objectives (machine translation and natural language understanding) were not achieved.
The project was successful in the sense that it pushed other countries to launch less ambitious projects, some of which did lead to reasonable results. For example, in the European Union, where the translation of documents between official languages takes a significant proportion of the budget, the EUROTRA project was launched. Its initial objective (correctly translating texts between two natural languages) was finally reduced to a simpler, achievable goal: building tools that would help human translators increase their performance (computer-aided translation).
A modern tool of this type is Google Translate. The translations it provides are often made fun of, with examples like the following:

The Spanish sentence Me daría de tortas (I would kick myself) is translated thus by Google Translate: I would give of cakes.
And the sentence No se anda con chiquitas (he means business) is translated thus: She does not hang out with little girls.
Yes, it's funny, but if you use Google Translate and accept its translations just as they come, you are sorely mistaken. This tool is an aid to the human translator, who must take part in the translation process. The translations offered by Google Translate must be revised and corrected, but even so the tool is very useful, as shown by an example drawn from my own experience: before using Google Translate, it took me two to three months to translate one of my novels from Spanish into English. Since I started using the tool, that time has dropped to about two weeks, for a comparable final quality of the translation. In other words, my productivity as a translator is now at least four times higher.


Manuel Alfonseca
