Comment on Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

luciole@beehaw.org 1 month ago

The actual research page is so awkward. The TLDR at the top goes:

single portrait photo + speech audio = hyper-realistic talking face video

Then a little lower comes the big red warning:

We are exploring visual affective skill generation for virtual, interactive characters, NOT impersonating any person in the real world.

No siree! Big “not what it looks like” vibes.
