In India, people speak many different languages, yet the medium of higher education is mainly English. It would be a grand feat if quality lectures available in English could be transcreated into various Indian languages. This idea first occurred to Prof. Rajeev Sangal of the International Institute of Information Technology (IIIT) Hyderabad, who, in discussion with the then Principal Scientific Advisor to the Prime Minister, Prof. K. VijayRaghavan, initiated the process.
This project aims to transcreate about 40,000 videos of lectures from the NPTEL and SWAYAM programmes, first into five Indian languages and then into 13. The pilot project grew out of IIIT Hyderabad, with the Indian Institute of Technology (IIT) Madras and IIT Bombay as partners. Dipti Mishra (IIIT Hyderabad), Pushpak Bhattacharyya (IIT Bombay), and Umesh Srinivasan and Hema A. Murthy (IIT Madras) led the project at their respective institutions.
Transcreating a lecture involves several challenges. The speech first has to be stripped from the video and converted into text, with manual intervention needed to correct errors. The text then has to be translated into the target language. Here, the technical terms have to be identified and a decision made on whether each should be translated or kept as it is. For instance, the word “ratio” might be translated, but the word “calculus” might be kept as it is.
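As a rough illustration of this step, the sketch below marks domain terms before translation. The glossary entries, the <keep> placeholder and the function name are hypothetical examples chosen for this article, not the project’s actual term lists or tools.

```python
# A rough sketch of domain-term handling before machine translation.
# The glossary entries and the <keep> placeholder are illustrative
# assumptions, not the project's actual term lists or software.

# Terms to keep in English rather than translate.
KEEP_AS_IS = {"calculus"}

# Terms with an agreed target-language equivalent (Hindi, as an example).
TRANSLATE_TERMS = {"ratio": "अनुपात"}

def mark_domain_terms(sentence: str) -> str:
    """Protect keep-as-is terms and substitute agreed equivalents."""
    out = []
    for word in sentence.split():
        bare = word.strip(".,").lower()
        if bare in KEEP_AS_IS:
            out.append(f"<keep>{word}</keep>")  # the MT system should leave this alone
        elif bare in TRANSLATE_TERMS:
            out.append(TRANSLATE_TERMS[bare])   # pre-substituted equivalent
        else:
            out.append(word)
    return " ".join(out)

print(mark_domain_terms("The ratio appears again in calculus."))
# -> The अनुपात appears again in <keep>calculus.</keep>
```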
The translation itself is done with machine translation software. After the errors are corrected, the text in the target language is converted into artificially synthesised speech. This speech has to be attached to the video, and while doing so, the video and the speech have to be adjusted so that the lip movements are synchronised and the length of the speech matches in both languages. All of these are non-trivial steps, and they are being carried out by the different institutions.
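To give a sense of the length-matching problem, here is a back-of-the-envelope sketch of how a synthesised segment might be uniformly sped up to fit the original video segment. The durations are made up for illustration, and the project’s actual alignment and lip-sync methods may differ.

```python
# A back-of-the-envelope sketch of the length-matching problem.
# The durations below are invented for illustration; the project's
# actual alignment and lip-sync methods are not shown here.

original_duration = 12.0     # seconds of the lecturer's speech in one video segment
synthesised_duration = 14.4  # seconds of the target-language synthesised speech

# Uniform time-stretch factor that would make the synthesised audio fit the segment.
stretch_factor = original_duration / synthesised_duration   # 0.83: compress the audio
playback_speed = 1 / stretch_factor                         # 1.2x faster playback

print(f"Stretch the synthesised speech by a factor of {stretch_factor:.2f} "
      f"(i.e. play it {playback_speed:.2f}x faster) to match the video segment.")
```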
The process
In the project, every video is first subjected to speech recognition. The extracted text file is cleaned up, with errors removed manually (at IIIT Hyderabad and IIT Bombay). The identification of technical “domain terms” is done at IIIT Hyderabad. This is followed by text-to-text machine translation, carried out at IIIT Hyderabad and IIT Bombay. Manual error correction is then done at IIT Madras and IIIT Hyderabad. From the translated, corrected text, speech is synthesised at IIT Madras, where it is also synchronised with the lip movements.
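The end-to-end flow can be summarised in a short sketch. Every function below is a hypothetical stub that only labels a stage and the institution responsible for it; none of this is the project’s actual software.

```python
# A minimal sketch of the pipeline described above. Every function is a
# hypothetical stub that only labels the stage and the institution
# responsible; none of this is the project's actual software.

def speech_to_text(video_path: str) -> str:
    return "raw lecture transcript"                 # speech recognition on the audio track

def manual_correction(text: str) -> str:
    return text                                     # human post-editing of errors

def tag_domain_terms(text: str) -> str:
    return text                                     # identify technical terms (IIIT Hyderabad)

def machine_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"                       # text-to-text MT (IIIT Hyderabad, IIT Bombay)

def synthesise_speech(text: str, lang: str) -> bytes:
    return text.encode()                            # target-language TTS (IIT Madras)

def attach_and_lip_sync(video_path: str, audio: bytes) -> str:
    return video_path.replace(".mp4", "_transcreated.mp4")  # re-attach audio, sync lips (IIT Madras)

def transcreate(video_path: str, target_language: str) -> str:
    text = manual_correction(speech_to_text(video_path))            # IIIT Hyderabad, IIT Bombay
    translated = manual_correction(                                  # IIT Madras, IIIT Hyderabad
        machine_translate(tag_domain_terms(text), target_language))
    audio = synthesise_speech(translated, target_language)
    return attach_and_lip_sync(video_path, audio)

print(transcreate("lecture01.mp4", "Hindi"))
```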
Challenges and targets
In future, the technology is expected to develop to the extent that the synthesised speech matches the original speaker’s own voice. In the present version, the transcreated voice does not have the emotional ups and downs of the original. Also, while the original speaker might “hum and haw” and make some repetitions or mistakes, the transcreated speech is highly correct and “perfect.” The group is working on how to get the software to retain these human qualities of the original.
Open source
“The project is meant to engage a large number of startups for not only technical lectures, but also general topics,” said Prof. Murthy.
In all, 75 videos have been transcreated into five different Indian languages. Once machine translation is developed for more languages, this can be extended to 13 languages. “The software is indigenous and available in open source, and to startups for commercial purposes at no fee,” says Prof. Murthy.
Future goals include making 100 courses available in 18 Indian languages. The researchers also plan to perform spoken language identification, provide keyword search, and enable transcreation between pairs of Indian languages as well as between Indian languages and Indian English.
This will involve many more institutions across India.
“This is the first time all of us have come together and made an impact,” says Prof. Murthy.