Google Translate is a multilingual machine translation system developed and provided by Google that translates text, voice, images, or video in real time from one language to another. It offers a web interface, mobile apps for iOS and Android, and an API that developers can use to build browser extensions, apps, and other software. Google Translate supports 133 languages at varying levels of support. The service is free and is used daily by more than 200 million people.
In November 2016, Google introduced its neural machine translation system. According to the company, the system improves translation by analyzing the composition of whole sentences and taking a range of factors into account. It learns over time from user queries, which improves the quality of its translations. Initially, Google enabled the system for English, French, German, Portuguese, Spanish, Chinese, Japanese, and Turkish.
Since December 2016, free text translation has been limited by Google to 5,000 characters, while web page translation has no length limit.
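A client that needs to submit a longer text has to split it first. The following is a minimal sketch of one way to do that, assuming a plain character count is what matters; the function name `split_for_translation` is hypothetical and not part of any Google tool.

```python
def split_for_translation(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Breaks at the last space or newline inside each window so that words
    are not cut in half; falls back to a hard cut when there is none.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        # Look one character past the limit so a break exactly at the
        # boundary is still found.
        window = rest[:limit + 1]
        cut = max(window.rfind(" "), window.rfind("\n"))
        if cut <= 0:
            cut = limit  # no whitespace in the window: hard cut
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks


sample = "word " * 3000            # 15,000 characters
parts = split_for_translation(sample)
print([len(p) for p in parts])
```

Splitting at whitespace keeps words intact, but a sentence may still be divided between two chunks, which can affect translation quality at the boundaries.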

Characteristics of Google Translate
Web interface
For some languages, Google Translate can pronounce the translated text, highlight corresponding words and phrases in the source and target text, and act as a simple dictionary for only one word at a time. If “Detect language” or “Text in an unknown language” is selected, the system can automatically identify the language.
In the web interface, users can suggest alternative translations, for example for technical terms, or correct errors, and these suggestions may be included in future updates to the translation process. If a user enters a URL as the source text, Google Translate produces a hyperlink to a machine translation of the website. For some languages, text can be entered using an on-screen keyboard, handwriting recognition, or a microphone with speech recognition.
Browser integration
Google Translate is available in some browsers as an extension that translates the text of the web pages the user visits.
Several Firefox extensions for Google Translate let users invoke the translation service from within the browser. There are also several Google Gadgets that use Google Translate and can be integrated with iGoogle and other websites.
There is also an extension for Google’s Chrome browser; in February 2010, Google Translate was integrated into the standard Google Chrome browser to automatically translate the web page being viewed.
Mobile device interface of Google Translate
The application supports more than 100 languages, including 50 languages from a photo of the source text, 43 languages from speech in conversation mode, and 27 languages from real-time video in augmented reality mode.
Conversation Mode is a Google Translate interface that allows users to communicate fluently with a person in another language. The interface is available for some languages only.
The “input from camera” feature allows users to take a picture of a document or sign. Google Translate recognizes the text in the image using optical character recognition (OCR) and translates it into the selected target language. Camera input mode is not available for all languages.
The application can also translate text in real time through the mobile device’s camera, using the “Snapshot” option. The speed and quality of this real-time (augmented reality) video translation were further improved by the use of convolutional neural networks.
Android version
Google Translate is available as a free app for the Android operating system. It works much like the browser version and contains two main options: “SMS Translation” and “History”.
The app supports more than 130 languages, and voice input is available for 15 languages. It runs on devices with Android 2.1 and above and can be downloaded by searching for “Google Translate” on Google Play. The app can translate text simply by pointing the device’s camera at it. It also offers a conversation mode that uses Google’s voice input and cloud services to translate a dialogue between two people who speak different languages.
iOS version
There is an HTML5 Google Translate web app for iPhone, iPod Touch, and iPad users. The current Google Translate app is compatible with iPhone, iPad, and iPod Touch devices running iOS 15.0 or higher. It accepts voice input in 15 languages and translates a word or phrase into any of the supported languages. The app can also read the translated text aloud in 100 different languages.
API
In May 2011, Google announced that the Google Translate API for software developers had been deprecated and would stop working on December 1, 2011, citing the high operating cost caused by abuse of the service. Because the API was used on numerous third-party websites, this decision led some developers to criticize Google and question the viability of using Google’s APIs in their products. In response to public pressure, Google announced in June 2011 that the API would remain available as a paid service.
Translation Methodology
Google Translate does not apply grammatical rules; its algorithms are based on statistical analysis rather than traditional rule-based analysis. The original creator of the system, Franz Josef Och, has criticized the effectiveness of rule-based algorithms, highlighting the better performance of systems based on statistical approaches.
Google Translate is based on a method called statistical machine translation, specifically on the results of research conducted by Och with which he won the DARPA contest for speed machine translation in 2003. Och was head of Google’s machine translation group until he left the company to join Human Longevity in July 2014.
Internally, Google Translate often does not translate directly from one language to another (L1 → L2). Instead, it frequently translates first from the source language into English and then from English into the target language (L1 → EN → L2). However, because English, like all human languages, is ambiguous and context-dependent, this method can cause translation errors. For example, translating vous from French to Russian gives vous → you → ты or Вы/вы. If Google were to use an unambiguous artificial language as the intermediary, the result would be vous → you → Вы/вы or tu → thou → ты.
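The loss of information in pivot translation can be illustrated with a toy example. The sketch below uses tiny hand-written lexicons (invented for illustration, not taken from any Google data) to show that the French tu/vous distinction disappears once both words pass through the English word "you".

```python
# Toy lexicons, invented purely for illustration.
FR_TO_EN = {"tu": "you", "vous": "you"}
EN_TO_RU = {"you": ["ты", "вы"]}          # English "you" is ambiguous
FR_TO_RU = {"tu": ["ты"], "vous": ["вы"]}  # hypothetical direct lexicon


def pivot_candidates(word, src_to_pivot, pivot_to_tgt):
    """Translate `word` through a pivot language.

    Returns the pivot word and every target candidate it maps to.
    """
    pivot = src_to_pivot[word]
    return pivot, pivot_to_tgt[pivot]


for word in ("tu", "vous"):
    pivot, candidates = pivot_candidates(word, FR_TO_EN, EN_TO_RU)
    print(f"{word} -> {pivot} -> {candidates}   (direct: {FR_TO_RU[word]})")
```

Both French words yield the same two Russian candidates after the pivot step, whereas the direct lexicon keeps them apart. A real system would use context to choose among candidates, but it cannot recover a distinction that the pivot language does not express.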
When Google Translate translates, it looks for patterns in hundreds of millions of documents to decide which is the best translation. By detecting patterns in documents that were translated by humans, the system makes intelligent decisions about which translation is most appropriate.
The following languages do not have a direct Google translation to or from English. These languages are translated through the indicated intermediate language (which in any case is closely related to the desired language, but is more widely spoken), and then passed through English (in a process comprising three successive translations):
- Belarusian (be ↔ ru ↔ en ↔ other);
- Catalan (ca ↔ es ↔ en ↔ other);
- Haitian Creole (ht ↔ fr ↔ en ↔ other);
- Galician (gl ↔ pt ↔ en ↔ other);
- Slovak (sk ↔ cs ↔ en ↔ other);
- Ukrainian (uk ↔ ru ↔ en ↔ other);
- Urdu (ur ↔ hi ↔ en ↔ other).
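The chains in this list can be sketched as a small routing function. This is an illustrative model only, assuming ISO 639-1 codes (en for English, es for Spanish) and the bridge languages listed above; the names `BRIDGE` and `translation_path` are hypothetical, and the sketch does not describe how Google actually routes requests.

```python
# Bridge languages from the list above (ISO 639-1 codes).
BRIDGE = {
    "be": "ru", "ca": "es", "ht": "fr", "gl": "pt",
    "sk": "cs", "uk": "ru", "ur": "hi",
}


def translation_path(source: str, target: str, pivot: str = "en") -> list[str]:
    """Return the sequence of languages a text would pass through."""
    if source == target:
        return [source]
    # A language and its own bridge language need no detour via English.
    if BRIDGE.get(source) == target or BRIDGE.get(target) == source:
        return [source, target]
    head = [source] + ([BRIDGE[source]] if source in BRIDGE else [])
    tail = ([BRIDGE[target]] if target in BRIDGE else []) + [target]
    path = []
    for code in head + [pivot] + tail:
        if not path or path[-1] != code:  # drop consecutive repeats
            path.append(code)
    return path


print(translation_path("be", "fr"))
print(translation_path("ca", "gl"))
```

Each extra hop is another machine translation step, so errors can accumulate along longer paths such as Catalan to Galician.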
According to Och, a solid foundation for the development of a usable statistical machine translation system for a new language pair requires having a bilingual text corpus (or a parallel collection) of more than 150 to 200 million words, and two monolingual corpora each of more than one billion words. It is then possible to use statistical models from this data to translate between these languages.
To acquire this enormous amount of linguistic data, Google uses, for example, United Nations documents and reports. The UN normally publishes its documents and records in its six official languages, which has produced a large parallel corpus in those six languages.
Google representatives have participated in national conferences in Japan, where Google has requested bilingual data from researchers.
When Google Translate generates a translation, it looks for patterns in hundreds of millions of documents to help decide on the best translation. By detecting patterns in documents that have already been translated by human translators, Google Translate makes smart guesses (using an artificial intelligence system) as to what a proper translation should be.
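The idea of choosing a translation from patterns in human-translated text can be shown with a deliberately simplified sketch. It counts how often each target phrase was paired with a source phrase in a tiny invented corpus and picks the most frequent one. Real statistical systems also rely on word alignment, language models, and a search over whole sentences, none of which is modeled here; all names and data are hypothetical.

```python
from collections import Counter, defaultdict


def build_phrase_table(pairs):
    """Count how often each target phrase was paired with a source phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def best_translation(table, phrase):
    """Return the most frequent translation and its relative frequency."""
    counts = table.get(phrase)
    if not counts:
        return None  # phrase never seen in the corpus
    target, count = counts.most_common(1)[0]
    return target, count / sum(counts.values())


# Invented phrase pairs standing in for human-translated documents.
corpus = [
    ("bonjour", "hello"),
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("merci", "thank you"),
]
table = build_phrase_table(corpus)
print(best_translation(table, "bonjour"))
print(best_translation(table, "bonsoir"))
```

The relative frequency acts as a crude confidence estimate, and a phrase that never appears in the corpus cannot be translated at all, which is one reason large corpora matter for this approach.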
Prior to October 2007, for languages other than Arabic, Chinese and Russian, Google Translate used SYSTRAN, a translation software engine that was used by several other online translation services such as Babel Fish (now discontinued). But since October 2007, Google Translate has used proprietary technology based on statistical machine translation.
Limitations
Google Translate has its limitations. The free service limits the number of paragraphs and the range of technical terms that can be translated. While it can help the reader understand the general content of a text in another language, it does not always deliver accurate translations, and it sometimes simply repeats the source word untranslated.
Google Translate struggles to differentiate between the imperfect and perfect tenses in Romance languages, so habitual and continuous acts in the past often become single historical events. Although this may seem pedantic, it can produce errors that a human translator would have avoided. Knowledge of the subjunctive mood is practically non-existent. In addition, the informal second person is often chosen regardless of context or accepted usage. Because English reference material contains only the single form “you”, it is difficult to translate into a language that has several second-person forms.
Some languages produce better results than others. Google Translate performs well especially when English is the target language and the source language is one of the languages of the European Union, thanks to the large number of documents translated by the European Parliament to which the system has access. A 2010 analysis concluded that translation from French to English is relatively accurate, and analyses conducted in 2011 and 2012 showed that translation from Italian to English is also accurate.
However, if the source text is short, rule-based machine translation often performs better; this effect is particularly evident in translations from Chinese to English. While translations can generally be edited, in Chinese it is not possible to edit whole sentences. Instead, arbitrary sets of characters have to be edited, which leads to incorrect edits.
Texts written in the Greek, Devanagari, Cyrillic, and Arabic scripts can be automatically transliterated from phonetic equivalents written in the Latin alphabet. The browser version of Google Translate offers a phonetic reading option for Japanese-to-English conversion; the same option is not available in the paid API version. When a word is translated from English into any other language, the browser version also shows a New Oxford American Dictionary (NOAD) transcription, which is a diacritical transcription.
For many of the most popular languages, the system has a “text-to-speech” audio function that can read aloud a text of about a dozen words in that language. In the case of pluricentric languages, the accent depends on the region:
- English: in the Americas and most of the Asia-Pacific and West Asia region, the audio uses a General American accent with a female voice; in Europe, Hong Kong, Malaysia, Singapore, Guyana, and all other parts of the world, it uses a British English accent with a female voice; a separate accent is used in Australia, New Zealand, and Norfolk Island.
- Spanish: in the Americas, a Latin American Spanish accent is used; in other parts of the world, a Castilian Spanish accent is used.
- Portuguese: a São Paulo accent is generally used, except in Portugal, where the native accent is used.
For some less widely used languages, the open-source speech synthesizer eSpeak is used; however, its robotic-sounding output can be difficult to understand.
Supported languages on Google Translate
As of June 2022, Google Translate supports the following 133 languages.
- Afrikaans
- Aymara
- Albanian
- German
- Amharic
- Arabic
- Armenian
- Assamese
- Azeri
- Bambara
- Bengali
- Bhojpuri
- Belarusian
- Burmese
- Bosnian
- Bulgarian
- Cambodian
- Kannada
- Catalan
- Cebuano
- Czech
- Chichewa
- Chinese (Simplified)
- Chinese (Traditional)
- Sinhalese
- Korean
- Corsican
- Haitian Creole
- Croatian
- Danish
- Dogri
- Slovak
- Slovenian
- Spanish
- Esperanto
- Estonian
- Ewe
- Basque
- Finnish
- French
- Frisian
- Scottish Gaelic
- Welsh
- Galician
- Georgian
- Greek
- Guarani
- Gujarati
- Hausa
- Hawaiian
- Hebrew
- Hindi
- Hmong
- Hungarian
- Igbo
- Ilocano
- Indonesian
- English
- Irish
- Icelandic
- Italian
- Japanese
- Javanese
- Kazakh
- Kinyarwanda
- Kyrgyz
- Konkani
- Krio
- Kurdish (Kurmanji)
- Kurdish (Sorani)
- Lao
- Latin
- Latvian
- Lingala
- Lithuanian
- Luganda
- Luxembourgish
- Macedonian
- Maithili
- Malayalam
- Malay
- Maldivian
- Malagasy
- Maltese
- Maori
- Marathi
- Meiteilon (Manipuri)
- Mizo
- Mongolian
- Dutch
- Nepali
- Norwegian
- Oriya
- Oromo
- Punjabi
- Pashtun
- Persian
- Polish
- Portuguese
- Quechua
- Romanian
- Russian
- Samoan
- Sanskrit
- Sepedi
- Serbian
- Southern Sotho
- Shona
- Sindhi
- Somali
- Swahili
- Swedish
- Sundanese
- Tagalog
- Thai
- Tamil
- Tatar
- Tajik
- Telugu
- Tigrinya
- Tsonga
- Turkish
- Turkmen
- Twi
- Ukrainian
- Uyghur
- Urdu
- Uzbek
- Vietnamese
- Xhosa
- Yiddish
- Yoruba
- Zulu
Languages in development and beta version of Google Translate
The following languages are not yet supported by Google Translate. However, users can contribute to them through the website so that Google can support them in the future. As of June 2022, 103 languages are in development, of which 9 are in beta.
Beta languages are closer to public release and have an additional contribution option that lets users evaluate up to 4 beta translations by translating an English text of up to 50 characters.
- Acehnese
- Adyghe
- Afar BETA
- Ahirani
- Southern Altai
- Aragonese
- Avar
- Bagheli
- Balochi
- Bangala
- Baoulé
- Bashkir
- Batak Toba
- Betawi
- Bodo BETA
- Breton
- Kashmiri
- Cantonese
- Chhattisgarhi
- Chechen
- Cherokee
- Chiluba
- Chitonga
- Chittagonian
- Chuvash
- Kumyk
- Deccani
- Dholuo
- Dyula
- Dzongkha
- Edo
- Efik
- Esan
- Fon
- Fulfulde BETA
- Gagauz
- Garhwali
- Kalaallisut
- Haryanvi
- Hiligaynon
- Inuktitut
- Isoko
- Khakas
- Kamba
- Kanuri
- Karachay-Balkar
- Karakalpak
- Qashqai
- Kikuyu
- Kokborok
- Lakota
- Luba
- Madurese
- Magahi
- Kedah Malay
- Kelantan Malay
- Marwari
- Mazandarani
- Minangkabau
- Montenegrin
- Mossi
- Navajo
- South Ndebele
- Nepal Bhasa BETA
- Occitan
- Pampanga
- Nigerian Pidgin
- K’iche’
- Rangpuri
- Rajasthani
- Rohingya
- Romansh
- Sadri
- Salt
- Northern Sami
- Samogitian
- Sango
- Santali BETA
- Saraiki BETA
- Serrano
- Tswana
- Shor
- Sicilian
- Congo Swahili
- Suryapuri
- Sylheti
- Tamazight BETA
- Siberian Tatar
- Tibetan BETA
- Tiv
- Tok Pisin
- Tswa
- Khorasani Turkic
- Tuvinian
- Urhobo
- Urrumano
- Varhadi-Nagpuri
- Venda
- Wolof
- Yakut
- Yucatec Maya BETA
- Zazaki
- Zhuang
Open source licenses and components
| Language | WordNet | License |
| Albanian | Albanet | CC-BY 3.0/GPL 3 |
| Arabic | Arabic Wordnet | |
| Catalan | Multilingual Central Repository | CC-BY-3.0 |
| Chinese | Chinese Wordnet | Wordnet |
| Danish | Dannet | No Wordnet |
| Spanish | Multilingual Central Repository | CC-BY-3.0 |
| Finnish | FinnWordnet | Wordnet |
| French | WOLF (WOrdnet Libre du Français) | CeCILL-C |
| Galician | Multilingual Central Repository | CC-BY-3.0 |
| Hebrew | Hebrew Wordnet | Wordnet |
| Hindi | IIT Bombay Wordnet | Indo Wordnet |
| Indonesian | Wordnet Bahasa | MIT |
| English | Princeton Wordnet | Wordnet |
| Italian | MultiWordnet | CC-BY-3.0 |
| Japanese | Japanese Wordnet | Wordnet |
| Javanese | Javanese Wordnet | Wordnet |
| Malay | Wordnet Bahasa | MIT |
| Norwegian | Norwegian Wordnet | Wordnet |
| Persian | Persian Wordnet | Free to Use |
| Polish | plWordnet | Wordnet |
| Portuguese | OpenWN-PT | CC-BY-SA-3.0 |
| Thai | Thai Wordnet | Wordnet |
Reviews
Shortly after launching the translation service, Google won an international competition for English-Arabic and English-Chinese machine translation.
Translation errors and oddities
Because Google Translate relies on statistical matching, the translated text can sometimes contain glaring errors and seemingly meaningless phrases. It may replace common terms with similar but non-equivalent terms in the target language, as happens with Latin translations, or reverse or alter the meaning of the source sentence.
On April 23, 2020, Google announced that it had adopted a new model to reduce the gender bias that arises when translating between two languages, one of which marks gender in terms that are gender-neutral in the other.
