Bytedance anounces INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations
BytedanceDecember 22, 2024
We present INFP, an audio-driven interactive head generation framework for dyadic conversations. Given the dual-track audio in dyadic conversations and a single portrait image of arbitrary agent, our framework can dynamically synthesize verbal, non-verbal and interactive agent videos with lifelike facial expressions and rhythmic head pose movements. Additionally, our framework is lightweight yet powerful, making it practical in instant communication scenarios such as the video conferencing. INFP denotes our method is Interactive, Natural, Flash and Person-generic.