Google Translate has been a highly useful tool for web users, scholars, and anyone who wants to understand a page written in a language they don’t speak. The utility has over 50 languages at its disposal, and while translation for these languages is at different stages of development (from alpha to ready), there’s no doubt that the options provided are impressive. Five more languages spoken in India are now being added to the translation tool at the “alpha” stage.
These five languages (Bengali, Gujarati, Kannada, Tamil, and Telugu) are all widely spoken in India. As noted in the Official Google Blog, “In India and Bangladesh alone, more than 500 million people speak these five languages.” Beyond letting speakers of these languages translate to and from other languages, the addition gives users around the world access to more content and information produced in them.
These languages pose particular challenges because of several elements of their construction, including:
- The elegant and intricate script used for the languages (in fact, to see the script, users have to install the fonts for each language).
- The “SOV” sentence structure, which orders sentences as “subject object verb” rather than the “subject verb object” order used in English. This increases the risk of mistranslation, since more words have to be rearranged during translation.
- Affixes and other word elements that carry their own meaning, such as a marker for the number of items being described or the tense of a verb.
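To make the word-order point concrete, here is a minimal Python sketch of the rearrangement an SOV-to-SVO translation implies. It is only an illustration: the function name `sov_to_svo` and the English gloss are hypothetical, the clause is assumed to be already split into exactly three parts, and nothing here reflects how Google Translate actually works.

```python
def sov_to_svo(parts):
    """Reorder a [subject, object, verb] clause into [subject, verb, object].

    Assumes the clause has already been split into exactly three parts;
    real sentences need parsing that this toy example does not attempt.
    """
    if len(parts) != 3:
        raise ValueError("expected exactly three parts: subject, object, verb")
    subject, obj, verb = parts
    return [subject, verb, obj]


# English gloss of a clause in SOV order (hypothetical example).
gloss = ["Ravi", "the book", "read"]
print(" ".join(sov_to_svo(gloss)))  # prints "Ravi read the book"
```

Even in this three-part case the object and verb have to swap places. Longer sentences, with clauses nested inside one another, require many more such moves, which is where the added risk of mistranslation comes from.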
Users will be able to translate directly from the script of Bengali, Gujarati, Kannada, Tamil, and Telugu, but transliteration is also available.
[sources include: The Official Google Blog]