Anyone who has used Google Translate or other machine translation tools knows that the result is often gibberish, especially for uncommon language pairs. For common pairs such as English and Spanish, however, the translation can be right on the mark. You might wonder why there is such a discrepancy. After all, isn’t the machine translator just translating each word in the sentence and stringing the results together? When it comes to Google Translate, the answer is a definite NO.
In fact, the method behind Google Translate is astoundingly complex and innovative. Drawing on the vast computing power behind one of the most powerful search engines on the web, Google scours the internet at lightning speed, looking for the expression in some text that exists alongside its paired translation. Instead of assuming that the text is new and unique, the translator goes by the old adage “there’s nothing new under the sun” and assumes that the translation already exists somewhere on the web.
This method means that Google Translate relies on millions of already translated (ideally, professionally translated) documents from millions of sources. “The corpus it can scan includes all the paper put out since 1957 by the EU in two dozen languages, everything the UN and its agencies have ever done in writing in six official languages, and huge amounts of other material, from the records of international tribunals to company reports and all the articles and books in bilingual form that have been put up on the web by individuals, libraries, booksellers, authors and academic departments.”¹
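To make the idea concrete, here is a minimal sketch of looking up an expression that "exists alongside its paired translation." It is a toy illustration under stated assumptions, not Google's actual implementation: the tiny corpus and the function names `build_translation_memory` and `lookup` are hypothetical, and a real system works on vastly larger collections with far more sophisticated matching.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of human-translated (English, Spanish) pairs.
# A real corpus would hold millions of aligned documents.
PARALLEL_CORPUS = [
    ("good morning", "buenos días"),
    ("good morning", "buenos días"),
    ("good morning", "buen día"),
    ("thank you", "gracias"),
]


def build_translation_memory(pairs):
    """Count how often each target phrase appears alongside each source phrase."""
    memory = defaultdict(Counter)
    for source, target in pairs:
        memory[source.lower()][target] += 1
    return memory


def lookup(memory, expression):
    """Return the translation seen most often for the expression, or None."""
    candidates = memory.get(expression.lower())
    if not candidates:
        return None  # no human translation found: this is a "gap"
    return candidates.most_common(1)[0][0]


memory = build_translation_memory(PARALLEL_CORPUS)
print(lookup(memory, "Good morning"))   # the pairing humans used most often
print(lookup(memory, "see you later"))  # not in the corpus, so None
```

The sketch also hints at why common language pairs fare better: the more paired text there is for a pair, the fewer lookups come back empty.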
This methodology makes Google Translate more of a cultural mirror than a strict machine translator, as it is merely searching for and reflecting the translations that humans themselves have created. While its accuracy is still at an early stage, Google Translate can only improve with time as more and more translated content becomes available on the web. In the meantime, inaccuracies arise wherever there are gaps in the existing translations; Google fills those gaps with what it determines to be the most probable translation based on context. Google’s own translations are also posted on the web and are then scanned by Google Translate, creating a self-reinforcing loop that raises the estimated probability that the original translation was correct, even when it was not.
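The reinforcing loop described above can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that the probability of a translation is estimated as its relative frequency among the observed translations; the phrases and function names are hypothetical and do not reflect how Google actually scores candidates.

```python
from collections import Counter


def translation_probabilities(observed):
    """Estimate each candidate's probability as its relative frequency."""
    counts = Counter(observed)
    total = sum(counts.values())
    return {candidate: count / total for candidate, count in counts.items()}


def most_probable(observed):
    """Pick the candidate that has been observed most often."""
    return Counter(observed).most_common(1)[0][0]


# Hypothetical translations found on the web for one source phrase.
observed = ["translation A", "translation A", "translation B"]
before = translation_probabilities(observed)["translation A"]  # 2/3

# Feedback loop: the machine's own output is published and scanned again.
for _ in range(3):
    observed.append(most_probable(observed))

after = translation_probabilities(observed)["translation A"]  # 5/6
print(f"before: {before:.2f}, after: {after:.2f}")
```

Nothing in the loop checks whether "translation A" is actually correct; its estimated probability rises only because the system keeps reading its own output.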
This system of translation is the most dynamic of all the machine translators on the web. And because Google licenses its API to developers (albeit now for a price), new and innovative technology can be built on top of it.
As the information age continues to move forward, the biggest barrier left to cross is the language gap. There is a push for multilingual websites as the internet becomes more global. Since languages themselves will always be dynamic and subtle in nature, the technology that interprets and translates meaning from text needs to be the same. We are not yet at the point where this technology is perfected, but we are getting there.
¹Paragraph taken from ‘Is That a Fish in Your Ear? Translation and the Meaning of Everything’ by David Bellos, published by Particular.