Samsung Research in India is part of a series about the people and innovations behind the democratization of mobile AI
As Samsung continues to pioneer premium mobile AI experiences, we visit Samsung Research centers around the world to learn how Galaxy AI is enabling more users to maximize their potential. Galaxy AI now supports 16 languages, so more people can expand their language capabilities, even when offline, thanks to on-device translation in features such as Live Translate, Interpreter, Note Assist and Browsing Assist. But what does AI language development involve? Last time, we visited Brazil to learn how teams work across cultures and borders to bring Galaxy AI to more people. This time, we’re in India to discover the value of cooperating with local partners.
Hidden inside the Vellore Institute of Technology in Chennai, India, is a lab filled with futuristic audio equipment. One will find mannequins, known in the industry as head and torso simulators, as well as binaural microphones and hearing devices. They are stored in special chambers treated with an advanced sound absorption system, making this lab the first of its kind in India. One might imagine such a facility being used to develop the latest high-end, high-fidelity (Hi-Fi) equipment.
This is where the Vellore Institute of Technology collaborates with Samsung to produce and develop data and insights that power the latest AI models for Galaxy AI’s language capabilities. The facility was developed as part of Samsung SEED (Students Ecosystem for Engineered Data) Labs, an initiative that since 2021 has enabled university staff, students and interns in India to work on projects requested by Samsung. This is just one of several university programs funded by Samsung in which students have the opportunity to work on projects with technical experts from the company.
“As a student, I love being able to work on multiple projects with a well-known and respected company such as Samsung,” says Yashika Ilanchezhiyan, a Samsung SEED student. “I’m given the confidence to learn new skills in a practical way and feel like I’m making a real difference in current and future products.”
“This kind of collaboration is a win-win situation,” says Giridhar Jakki, Head of Language AI at Samsung R&D Institute India - Bangalore (SRI-B). “Thanks to our projects with universities, we are able to access additional expertise and custom datasets. Partnering universities receive investment, financial incentives and expert mentorship from Samsung as a result.”
Lowering Language Barriers
SRI-B has collaborated with teams around the world to develop AI language models for British, Indian and Australian English as well as Thai, Vietnamese and Indonesian. Recently, core engineers from other Samsung Research centers visited Bangalore, India — where the SRI-B team helped ramp up the technology to bring Vietnamese, Thai and Indonesian to Galaxy AI. SRI-B was therefore ideally positioned to develop the Hindi language for Galaxy AI.
“Every language has its challenges,” says Jakki. “But when you consider the end goal of bringing people the ability to communicate in other languages, it’s worth every ounce of effort. We couldn’t wait to bring Hindi to Galaxy AI.”
Developing the Hindi AI model wasn’t simple. The team had to ensure that more than 20 regional dialects, tonal inflections, punctuation and colloquialisms were covered. Additionally, it is common for Hindi speakers to mix English words into their conversations. This required the team to carry out multiple rounds of AI model training with a combination of translated and transliterated data.
“Hindi has a complex phonetic structure that includes retroflex sounds (sounds made by curling the tongue back in the mouth), which are not present in many other languages,” says Jakki. “To build the speech synthesis element of the AI solution, we carefully reviewed data with native linguists to understand all the unique sounds and created a special set of phonemes to support specific dialects of the language.”
Collaborative efforts between Samsung and academic partners were instrumental in developing an AI language model that reflected the cultural nuances of India’s regions. The Vellore Institute of Technology helped secure almost a million lines of segmented and curated audio data on conversational speech, words and commands. Data was a crucial component for a task as critical as incorporating the fourth most spoken language in the world into Galaxy AI. Working with universities ensured Samsung was using the highest quality data.
Global Connections Deliver Big Impacts
This project perfectly encapsulates Samsung’s philosophy of open collaboration and the company’s belief that sharing expertise and perspectives ensures meaningful innovation. In the case of SRI-B, this not only includes working with academia but also sharing insights and best practices with other Samsung research centers around the world.
“I’m extremely proud of what we’ve achieved with the help of our partners,” says Jakki. “AI innovation through collaboration is a big part of what we do. We will continue to better understand, collect and analyze language data so more people can have access to AI tools in the future.”