Bytedance announces INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations

December 22, 2024

We present INFP, an audio-driven interactive head generation framework for dyadic conversations. Given the dual-track audio of a dyadic conversation and a single portrait image of an arbitrary agent, our framework can dynamically synthesize verbal, non-verbal, and interactive agent videos with lifelike facial expressions and rhythmic head pose movements. Additionally, our framework is lightweight yet powerful, making it practical in instant communication scenarios such as video conferencing. The name INFP denotes that our method is Interactive, Natural, Flash, and Person-generic.

Articles

Article

INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations

Article

Bytedance announces INFP, AI that can make any single image talk and sing from any audio file expressively!

Videos

INFP Demo 1
