Technique matches words with actors' facial expressions

IANS New York
Last Updated: Apr 20 2015 | 2:48 PM IST

Does watching a dubbed foreign film often make you grimace because the facial expressions of the actors do not match the words put in their mouths?

This is now likely to change with a new method that analyses an actor's speech motions and suggests new words to reduce the subtle discrepancies between the spoken words and facial expressions.

"This approach, based on 'dynamic visemes,' or facial movements associated with speech sounds, produces far more alternative word sequences than approaches that use conventional visemes, which are static lip shapes associated with sounds," said Sarah Taylor from Disney Research at Pittsburgh.

"The method using dynamic visemes produces many more plausible alternative word sequences that are perceivably better than those produced using a static viseme approaches," Taylor said.

With the new method, the researchers found that the facial movements an actor makes when saying "clean swatches," for instance, are the same as those for such phrases as "likes swats," "then swine," or "need no pots."

Speech redubbing, such as translating movies, television shows and video games into another language, or removing offensive language from a TV show, typically involves meticulous scripting to select words that match lip motions, followed by re-recording by a skilled actor. Automatic speech redubbing of the kind explored in this study remains a largely unexplored area of research.

With conventional static visemes, each lip shape is assumed to represent a small number of different sounds, and the mapping from those units to distinct sounds is incomplete, which the researchers found limited the number of alternative word sequences that could be generated automatically.
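To make the contrast concrete, here is a minimal Python sketch of static-viseme matching. Everything in it, the phoneme-to-viseme table, the tiny lexicon and the function names, is an illustrative invention rather than the researchers' data or code: with static visemes, an alternative exists only when another word's viseme string matches the target's exactly.

```python
# Toy static-viseme matching (illustrative; not the paper's code or data).

# Many phonemes collapse onto one lip shape: a many-to-one, made-up table.
PHONEME_TO_VISEME = {
    "p": "BILABIAL", "b": "BILABIAL", "m": "BILABIAL",
    "f": "LABIODENTAL", "v": "LABIODENTAL",
    "ae": "OPEN", "aa": "OPEN",
    "t": "ALVEOLAR", "d": "ALVEOLAR", "n": "ALVEOLAR",
}

# Tiny illustrative pronunciation lexicon: word -> phoneme sequence.
LEXICON = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "mad": ["m", "ae", "d"],
    "fan": ["f", "ae", "n"],
}

def viseme_string(phonemes):
    """Map a phoneme sequence to its sequence of static lip shapes."""
    return tuple(PHONEME_TO_VISEME[p] for p in phonemes)

def visual_alternatives(word):
    """Words whose viseme string matches the target's exactly."""
    target = viseme_string(LEXICON[word])
    return [w for w, phones in LEXICON.items()
            if w != word and viseme_string(phones) == target]

print(visual_alternatives("pat"))  # ['bat', 'mad'] share the same lip shapes
```

The exact-match requirement is what makes the static approach restrictive: candidates must align with the original line phoneme for phoneme.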

Dynamic visemes represent the set of speech-related lip motions and their mapping to sequences of spoken sounds. The researchers exploited this more general mapping to identify a large number of different word sequences for a given facial movement.
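By contrast, a sequence-level, many-to-many mapping multiplies the candidate space. The sketch below is again a toy illustration under assumed mappings; the dynamic-viseme unit names and their phoneme options are invented for the example:

```python
from itertools import product

# Toy dynamic-viseme model (invented units, not the paper's learned ones):
# each motion unit maps to several possible phoneme subsequences,
# which may differ in length.
DYNAMIC_VISEME_TO_PHONES = {
    "DV1": [("p",), ("b",), ("m",)],
    "DV2": [("ae",), ("aa",), ("ae", "n")],  # a unit may absorb a nasal
    "DV3": [("t",), ("d",), ("t", "s")],
}

def phone_sequences(dv_sequence):
    """Expand a dynamic-viseme sequence into every phoneme sequence
    consistent with it (the Cartesian product of per-unit options)."""
    for combo in product(*(DYNAMIC_VISEME_TO_PHONES[dv] for dv in dv_sequence)):
        yield tuple(p for chunk in combo for p in chunk)

candidates = list(phone_sequences(["DV1", "DV2", "DV3"]))
print(len(candidates))   # 27 candidate phoneme strings from just 3 motion units
print(candidates[:3])    # ('p','ae','t'), ('p','ae','d'), ('p','ae','t','s')
```

Candidate strings like these would then have to be matched against a pronunciation dictionary, and presumably ranked for plausibility, before being offered as alternative word sequences such as the "clean swatches" examples above.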

The findings are scheduled to be presented at ICASSP 2015, the IEEE International Conference on Acoustics, Speech and Signal Processing in Brisbane, Australia, on April 23.

First Published: Apr 20 2015 | 2:40 PM IST
