Microsoft’s AI app VASA-1 makes photographs talk and sing with believable facial expressions

HMN 2024 – Given a single portrait image, a speech audio clip, and optionally a set of other control signals, our approach produces a high-quality, lifelike talking face video of 512×512 resolution at up to 40 FPS. The method is generic and robust, and the generated talking faces can faithfully mimic human facial expressions and head movements, […]
