A new study by the University of East Anglia suggests computers are now better at lip-reading than humans.
The peer-reviewed findings were presented for the first time at the eighth International Conference on Auditory-Visual Speech Processing (AVSP) 2009, held at the University of East Anglia from September 10-13.
A research team from the School of Computing Sciences at UEA compared the performance of a machine-based lip-reading system with that of 19 human lip-readers. They found that the automated system significantly outperformed the human lip-readers, scoring a recognition rate of 80 per cent compared with only 32 per cent for human viewers on the same task.
Furthermore, they found that machines are able to exploit very simple features that represent only the shape of the face, whereas human lip-readers require full video of people speaking.
The study also showed that the dynamics and full appearance of speech gestures are very important, in contrast to the traditional approach to lip-reading training, in which viewers are taught to spot key lip-shapes from static (often drawn) images.
Using a new video-based training system, viewers with very limited training significantly improved their ability to lip-read monosyllabic words, which in itself is a very difficult task. It is hoped this research might lead to novel methods of lip-reading training for the deaf and hard of hearing.
"This pilot study is the first time an automated lip-reading system has been benchmarked against human lip-readers and the results are perhaps surprising," said the study's lead author Sarah Hilder.
"With just four hours of training it helped them improve their lip-reading skills markedly. We hope this research will represent a real technological advance for the deaf community."
Agnes Hoctor, campaigns manager at the RNID, said: "This research confirms how difficult the vital skill of lip-reading is to learn and why RNID is campaigning for people who are deaf or hard of hearing to have improved access to classes. We would welcome the development of video-based or online training resources to supplement the teaching of lip-reading. Hearing loss affects 55 per cent of people over 60 so, with the ageing population, demand to learn lip-reading is only going to increase."
The AVSP conference was held in the UK for the first time since its inception in 1998. The University of East Anglia hosted cutting-edge researchers including psychologists, engineers, scientists and linguists from as far afield as Australia, Canada and Japan.
As part of the conference, delegates took part in a Visual Speech Synthesis Challenge in which a number of visual speech synthesizers, or 'talking heads', battled it out to determine the most intelligible and visually appealing system.
Part of the lip-reading test used to compare the performance of the machine-based lip-reading system and human lip-readers can be downloaded here: www.jtuk.com/training/part1.html
Taken from www.uea.ac.uk/mac/comm.