Microsoft announces text-to-speech avatar tool to create talking videos

Microsoft announced text-to-speech features with vision capabilities that help users create synthetic videos of a 2D photorealistic avatar speaking

3 min read Last Updated : Nov 17 2023 | 3:39 PM IST

Microsoft has announced new text-to-speech features with vision capabilities that let users create talking avatar videos from text input. The feature can also be used to build interactive bots trained on human images.

The text-to-speech avatar system lets customers produce synthetic videos of a 2D photorealistic avatar speaking. The neural text-to-speech avatar model is trained by deep neural networks on human video recording samples. The voice of the avatar is provided by a text-to-speech voice model.

The text-to-speech avatar helps users create more engaging digital interactions and build conversational agents, chatbots, virtual assistants and more.

According to the company, the feature is designed to protect the rights of individuals and society, foster transparent human-computer interaction, and counteract the proliferation of harmful deepfakes and misleading content.

Why did Microsoft build a text-to-speech avatar?

According to Microsoft:

Traditional video production generally takes considerable time and budget: setting up a shooting environment, filming, editing and so on. The Microsoft avatar reduces the dependency on these traditional methods and helps users create videos efficiently. From text input alone, users can build training videos, customer testimonials, product introductions and similar content.

With the release of Azure OpenAI Service and neural text-to-speech, interactive conversation is much more natural than before. The avatar helps create more engaging digital interactions, and users can also use it to build conversational agents, virtual assistants, chatbots and more.

According to the official website, the content generation workflow has three components: a text analyser, a TTS audio synthesiser, and a TTS avatar video synthesiser.

The company offers two separate text-to-speech avatar features at this time. One is a prebuilt text-to-speech avatar and the other is a custom text-to-speech avatar.

According to the company website, “Microsoft offers prebuilt text-to-speech avatars as out of box products on Azure for its subscribers. These avatars can speak different languages and voices based on the text input. Customers can select an avatar from a variety of options and use it to create video content or interactive applications with real-time avatar responses.”

Video content creation through text-to-speech avatar

Start with a talking script for your avatar, written either as plain text or in Speech Synthesis Markup Language (SSML). SSML lets you fine-tune the avatar's voice, including the pronunciation and expression of terms such as brand names, and add gestures such as waving or pointing to an item.
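As a rough illustration of what an SSML talking script can look like, the Python sketch below builds one with the standard library. The `speak`, `voice` and `sub` elements are standard SSML; the helper name `build_ssml`, the sample brand name and the voice name `en-US-JennyNeural` are assumptions chosen for the example, not details from Microsoft's announcement. Gesture markup is left out because the announcement, as reported here, does not spell out its syntax.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # W3C SSML namespace


def build_ssml(segments, voice="en-US-JennyNeural", lang="en-US"):
    """Build an SSML script from plain strings and (written, spoken) pairs.

    A (written, spoken) pair becomes <sub alias="spoken">written</sub>,
    which asks the synthesiser to read a term such as a brand name
    with a custom pronunciation. The voice name is an assumed example.
    """
    speak = ET.Element(
        "speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang}
    )
    voice_el = ET.SubElement(speak, "voice", {"name": voice})
    last = None  # most recently added <sub> element, if any
    for seg in segments:
        if isinstance(seg, tuple):
            written, spoken = seg
            last = ET.SubElement(voice_el, "sub", {"alias": spoken})
            last.text = written
        elif last is None:
            # Plain text before the first <sub> goes into the voice element.
            voice_el.text = (voice_el.text or "") + seg
        else:
            # Plain text after a <sub> is stored as that element's tail.
            last.tail = (last.tail or "") + seg
    return ET.tostring(speak, encoding="unicode")


script = build_ssml(["Welcome to ", ("ACME", "ack me"), " support."])
print(script)
```

A plain-text script needs none of this; the markup only matters when the default pronunciation or delivery has to be adjusted.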

Once your talking script is ready, you can use the Azure TTS 3.1 API to synthesise the video. Besides the SSML input, you can also specify the avatar character, its style and the format of the output video.
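The inputs listed above (script, avatar character, style and video format) can be pictured as a single request body. The Python sketch below only assembles such a body as JSON and makes no network call. Every field name in it, along with the function name `build_avatar_request` and the placeholder character and style values, is hypothetical and chosen for illustration; the actual request schema should be taken from the Azure documentation.

```python
import json


def build_avatar_request(ssml, character, style, video_format="mp4"):
    """Collect the synthesis inputs into one JSON-serialisable dict.

    NOTE: the keys below are illustrative placeholders,
    not the documented Azure request schema.
    """
    if not ssml.strip():
        raise ValueError("the talking script must not be empty")
    return {
        "input": {"kind": "ssml", "content": ssml},
        "avatar": {"character": character, "style": style},
        "output": {"videoFormat": video_format},
    }


request_body = build_avatar_request(
    "<speak>...</speak>",          # SSML talking script (elided here)
    character="example-character",  # hypothetical avatar name
    style="example-style",          # hypothetical avatar style
)
print(json.dumps(request_body, indent=2))
```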

If you want, you can also add content images, videos with text, animations, illustrations, etc. to come up with the final video.

Combine all your assets, including the avatar video, content and optional background music, to compose a rich video experience.

First Published: Nov 17 2023 | 3:39 PM IST