Machine translation is the translation of text from one language into another by computer software. The software may translate word for word or, depending on the system, produce more sophisticated translations that account for idioms and structural differences between the two languages. It can also be customized to some extent for fields that rely mainly on formulaic language, where machine translation is more successful than it is with informal language or conversation.
The concept of machine translation is often traced to 1629, when René Descartes proposed a universal language in which a single symbol would carry the same meaning across all languages. It was not until 1954, however, that a machine successfully translated a limited number of Russian sentences into English.
Machine translation research was heavily funded for the next decade, until it became clear that the technology had failed to reach its expected potential. Interest revived in the late 1980s as knowledge of computer data processing grew.
Translation, in its simplest explanation, involves understanding the meaning of a source text and providing an accurate delivery of the message in the target language. To achieve this, a translator must be fully conversant with both the source and target languages’ grammatical structures and cultures.
This need for extensive knowledge is where machine translation runs into difficulty. Different forms of machine translation are employed, depending on the needs of the user, to work around it.
The two main categories of machine translation are rule-based and statistical.
Rule-based machine translation is divided into three sub-categories: transfer-based, interlingual and dictionary-based. Transfer-based and interlingual systems differ considerably, but in both the source text is analyzed grammatically and converted into an intermediate representation from which the target text is then produced. In the interlingual method this representation, known as an “interlingua”, is independent of any particular language; in the transfer-based method it depends on the language pair involved. Both methods require a linguist proficient in grammar design to enter comprehensive word meanings and usages, along with rule sets. The result is an approximate translation that a user can understand. Dictionary-based machine translation translates words as they are found in dictionaries. Transfer-based translation is the most popular type of machine translation in use.
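To illustrate the dictionary-based approach, the following is a minimal Python sketch of word-for-word lookup. The tiny Spanish-to-English dictionary and the function name are hypothetical and chosen only for illustration; a real system would use a far larger lexicon and handle inflected word forms.

```python
# Hypothetical toy Spanish-to-English dictionary; entries are illustrative only.
TOY_DICTIONARY = {
    "el": "the",
    "gato": "cat",
    "negro": "black",
    "duerme": "sleeps",
}

def translate_word_for_word(sentence, dictionary):
    """Replace each source word with its dictionary entry, keeping source word order."""
    translated = []
    for word in sentence.lower().split():
        # Words missing from the dictionary are passed through, marked with <...>.
        translated.append(dictionary.get(word, f"<{word}>"))
    return " ".join(translated)

print(translate_word_for_word("El gato negro duerme", TOY_DICTIONARY))
# prints "the cat black sleeps"
```

The output keeps the Spanish word order ("cat black" rather than "black cat"), which shows why word-for-word lookup alone cannot account for structural differences between languages.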
Statistical machine translation produces translations by applying statistical methods to bodies of text representative of a language (called “corpora”). The corpora can be bilingual or multilingual. Translation accuracy depends on the amount of corpus data available, and as yet few suitable corpora exist. When sufficient data is available, the results of statistical machine translation are notable.
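As one possible illustration of how statistics can be drawn from a bilingual corpus, the sketch below estimates word translation probabilities with the expectation-maximization procedure of IBM Model 1, a classic word-alignment model. The three-sentence German-English corpus and all function names are assumptions made for this example; the article does not describe a specific algorithm.

```python
# Hypothetical toy German-English parallel corpus; real systems use far more data.
PARALLEL_CORPUS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]

def train_ibm_model1(sentence_pairs, iterations=10):
    """Estimate P(target word | source word) with IBM Model 1 EM."""
    pairs = [(src.split(), tgt.split()) for src, tgt in sentence_pairs]
    target_vocab = {word for _, tgt in pairs for word in tgt}
    uniform = 1.0 / len(target_vocab)
    # Start with a uniform probability for every co-occurring word pair.
    prob = {(tw, sw): uniform for src, tgt in pairs for tw in tgt for sw in src}
    for _ in range(iterations):
        count = {key: 0.0 for key in prob}
        total = {}
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(prob[(tw, sw)] for sw in src)
                for sw in src:
                    share = prob[(tw, sw)] / norm
                    count[(tw, sw)] += share
                    total[sw] = total.get(sw, 0.0) + share
        # M-step: renormalize the expected counts per source word.
        prob = {(tw, sw): c / total[sw] for (tw, sw), c in count.items()}
    return prob

def most_likely_translation(prob, source_word):
    """Return the target word with the highest estimated probability, or None."""
    candidates = {tw: p for (tw, sw), p in prob.items() if sw == source_word}
    return max(candidates, key=candidates.get) if candidates else None

model = train_ibm_model1(PARALLEL_CORPUS)
for word in ("das", "haus", "buch", "ein"):
    print(word, "->", most_likely_translation(model, word))
```

With only three sentence pairs the estimates are rough, which reflects the point above: accuracy depends on how much corpus data is available.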
A third classification of machine translation is example-based. This type also uses a bilingual corpus and produces a translation by analogy with corresponding examples in that corpus.
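The retrieval step of an example-based system can be sketched as follows. This simplified Python example only looks up the closest stored sentence, using the standard library's `difflib.get_close_matches`; the English-French example pairs, the function name and the similarity cutoff are assumptions, and full example-based systems also recombine fragments of several examples.

```python
import difflib

# Hypothetical bilingual example base (English -> French); illustrative only.
EXAMPLES = {
    "how are you": "comment allez-vous",
    "where is the station": "où est la gare",
    "thank you very much": "merci beaucoup",
}

def translate_by_example(sentence, examples, cutoff=0.6):
    """Return the translation of the stored example closest to the input, if any."""
    # get_close_matches ranks candidates by character-level similarity.
    matches = difflib.get_close_matches(
        sentence.lower(), list(examples), n=1, cutoff=cutoff
    )
    if not matches:
        return None  # no stored example is similar enough
    return examples[matches[0]]

print(translate_by_example("Where is the train station", EXAMPLES))
```

Returning the stored translation unchanged is a deliberate simplification: it shows the matching idea but does not adapt the result to the words that differ from the example.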
The problem of ambiguity in machine translation, in which a word has multiple meanings depending on context, was first raised by Yehoshua Bar-Hillel in the early years of the field. Word sense disambiguation is the process of identifying the correct meaning, or sense, of a word in its context.
Two groups of approaches to resolving ambiguity have developed. The first, “shallow” approaches, has had the most success. Shallow approaches are often used to translate between parallel languages. They apply statistical methods to the words surrounding the ambiguous word to determine its most likely meaning, which leaves a margin for error.
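A shallow approach of this kind can be sketched with a small naive Bayes classifier that scores each sense by the words surrounding the ambiguous word. The labelled example sentences for the English word "bank", the sense labels and the function names are all hypothetical; naive Bayes is used here as one common statistical method, not as the method of any particular system.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled contexts for the ambiguous word "bank"; illustrative only.
TRAINING = [
    ("finance", "deposit money in the bank account"),
    ("finance", "the bank approved the loan"),
    ("river", "fishing on the river bank"),
    ("river", "the bank of the stream was muddy"),
]

def train(examples, ambiguous_word):
    """Count how often each context word appears with each sense."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    for sense, text in examples:
        sense_counts[sense] += 1
        for word in text.lower().split():
            if word != ambiguous_word:
                word_counts[sense][word] += 1
    return sense_counts, word_counts

def disambiguate(sentence, ambiguous_word, sense_counts, word_counts):
    """Pick the sense with the highest naive Bayes score for the context words."""
    vocab = {w for counts in word_counts.values() for w in counts}
    context = [w for w in sentence.lower().split() if w != ambiguous_word]
    total_examples = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total_examples)  # prior probability of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

counts = train(TRAINING, "bank")
print(disambiguate("she took a loan from the bank", "bank", *counts))
print(disambiguate("the muddy bank of the river", "bank", *counts))
```

Because the decision rests only on word counts, a context made up of words the model has rarely or never seen can still be assigned the wrong sense, which is the margin for error noted above.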
The second group, “deep” approaches, presupposes that the machine translation software has the knowledge of the word needed to translate it with the proper meaning and context. To reach the correct outcome, the software would need the ability to conduct research of its own to resolve the ambiguity. Systems with this capacity have not yet been developed.
Machine translation is evolving and easing the onerous task of translation, but a professional translator must still be involved, to varying degrees, to produce an accurate end product.
Based on Machine Translation article by Wikipedia