Effective Multilingual Interaction in Mobile Environments


The EMIME project will help to overcome the language barrier by developing a mobile device that performs personalised speech-to-speech translation, such that a user's spoken input in one language is used to produce spoken output in another language, while continuing to sound like the user's voice. Personalisation of systems for cross-lingual spoken communication is an important, but little explored, topic. It is essential for providing more natural interaction and for making the computing device a less obtrusive element when assisting human-human interactions. We will build on recent developments in speech synthesis using hidden Markov models, the same technology used for automatic speech recognition. Using a common statistical modelling framework for automatic speech recognition and speech synthesis will enable the use of common techniques for adaptation and multilinguality. Significant progress will be made towards a unified approach to speech recognition and speech synthesis: this is a very powerful concept, and will open up many new areas of research. In this project, we will explore the use of speaker adaptation across languages so that, by performing automatic speech recognition, we can learn the characteristics of an individual speaker, and then use those characteristics when producing output speech in another language.

Our objectives are to:

  • Personalise speech processing systems by learning individual characteristics of a user's speech and reproducing them in synthesised speech;
  • Introduce a cross-lingual capability such that personal characteristics can be reproduced in a second language not spoken by the user;
  • Develop and better understand the mathematical and theoretical relationship between speech recognition and synthesis;
  • Eliminate the need for human intervention in the process of cross-lingual personalisation;
  • Evaluate our research against state-of-the-art techniques and in a practical mobile application.

  • Status: Completed
  • Project launch: 01 March 2008
  • Project completed: 28 February 2011
Keywords: ICT, language, speech-to-speech translation