Paul Marks, senior technology correspondent
It'll have language teachers the world over ripping up their vocab books: near-real-time speech conversion from one language to another has just become a reality. Microsoft Research has demonstrated not only how to convert spoken English into Mandarin with just a few seconds' delay - but also how to output that Mandarin speech in the vocal style of the original speaker. The technology was demonstrated by Microsoft's research chief Rick Rashid in Tianjin, China, on 25 October - but the news has taken a while to trickle out.
Rashid spoke just eight English sentences into the lab's new speech-recognition, translation and generation system, yet the company reports the Mandarin output wowed a crowd of 2000 students and academics (jump to 7:30 in the video above to hear the output).
The system's advanced capability stems from a blizzard of improvements at all stages of the speech-to-speech process. Software like Nuance's Dragon NaturallySpeaking has quietly blazed the trail for speech recognition in offices - and now products based on it, like Apple's Siri iPhone assistant, can recognise spoken questions and search for answers on the web. Microsoft's Kinect has a speech interface too.
While such systems go wrong a lot - typically erring on one out of every four or five words, says Rashid - they now have a better way to recognise what people are saying. Microsoft's trick is to use a novel neural networking (machine learning) system that reduces word-recognition errors to one in seven or eight. That means the translation engine, Bing Translate, has a far better chance of creating intelligible Mandarin text to feed into the speaking engine.
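To put those figures in perspective, here is a rough back-of-the-envelope sketch of how much the error rate falls. It uses only the ranges Rashid quotes (one word in four or five before, one in seven or eight after); the function name and the pairing of best-case against worst-case rates are illustrative assumptions, not anything Microsoft has published.

```python
from fractions import Fraction


def relative_reduction(before: Fraction, after: Fraction) -> Fraction:
    """Share of the original word errors that the newer system removes."""
    return (before - after) / before


# Error rates quoted in the article (assumed to be per-word rates).
before_low, before_high = Fraction(1, 5), Fraction(1, 4)
after_low, after_high = Fraction(1, 8), Fraction(1, 7)

# Smallest improvement: best old rate against worst new rate.
smallest = relative_reduction(before_low, after_high)
# Largest improvement: worst old rate against best new rate.
largest = relative_reduction(before_high, after_low)

print(f"{float(smallest):.0%} to {float(largest):.0%} fewer word errors")
# prints "29% to 50% fewer word errors"
```

On these assumptions, the translation engine receives text with somewhere between roughly a third and a half fewer misrecognised words than before.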