Microsoft VASA-1 AI Model Deepfake
You’ve seen Flawless TrueSync; now check out Microsoft’s VASA-1 AI model, which can generate deepfakes using a single photo and a short speech audio track. The resulting clips feature lip movements that are precisely synchronized with the audio, along with a wide range of facial nuances and natural head motions.



To achieve this level of liveliness, Microsoft developed a holistic facial dynamics and head movement generation model that works in a face latent space built from videos. The current model supports online generation of 512×512 videos at up to 40 FPS with negligible starting latency, paving the way for real-time engagements with lifelike avatars that emulate human conversational behaviors. There’s no word yet on whether VASA-1 will be released to the public.

“Our method exhibits the capability to handle photo and audio inputs that are out of the training distribution. For example, it can handle artistic photos, singing audios, and non-English speech. These types of data were not present in the training set,” said the researchers.
