Friday, November 22, 2024

Rare European languages among 110 new ones added to Google Translate


The tech giant announced that 110 languages had been added to its Translate feature, including Sicilian, Manx, Breton, and Romani.


Google has announced that 110 languages have been added to its Translate feature, including several endangered European ones.

About a quarter of the new languages come from Africa, the company said, and together the new languages represent more than 614 million speakers.

You can now send a work email in Manx, the Celtic language of the Isle of Man that “almost went extinct with the death of its last native speaker in 1974,” the company said in a post on its blog.

Regional languages from France, including Breton – a Celtic language spoken in Brittany – and Occitan, which is spoken in the country’s south, are also on the list.

Sicilian and Venetian, two Italian dialects from Sicily and Venice respectively, were also added, as was Northern Sámi, a language spoken in the north of Scandinavia.

Romani, a language that has many dialects and is spoken by around 4.6 million people in Europe, is also on the list.

Google’s long-term goal is “to build an AI model that will support the 1,000 most spoken languages,” the company said in 2022. 

How does it work?

Translation of these languages isn’t as straightforward as it first appears. Google uses artificial intelligence (AI) to power the Translate feature.

AI needs a sufficient amount of written data, such as text from books, articles, and websites, something that can be scarce for regional and rare languages.

This data then feeds neural networks, algorithms inspired by the human brain, allowing the system to analyse patterns, context, and linguistic structures in order to give natural-sounding translations.

AI tends to perform better in English due to the abundance of training data available.

Google isn’t the only company to have taken an interest in rare languages. 

In 2022, Microsoft launched the Microsoft Language Bank, highlighting that 40 per cent of the world’s languages are endangered. The Living Tongues Institute for Endangered Languages, meanwhile, partnered with audio company Shure on a campaign called “No Voice Left Behind” to promote the preservation of languages spoken in remote regions.

Stanford also launched its own initiative to preserve “Digitally Disadvantaged Languages”.
