DyaDiT: A Multi-Modal Diffusion Transformer for Socially Favorable Dyadic Gesture Generation

Date: 2026-02-27
Fetched: 2026-02-28T01:46:54.710651+00:00

Authors: Yichen Peng, Jyun-Ting Song, Siyeol Jung, Ruofan Liu, Haiyang Liu, Xuangeng Chu, Ruicong Liu, Erwin Wu, Hideki Koike, Kris Kitani

Abstract

DyaDiT is a multi-modal diffusion transformer that generates contextually appropriate human motion from dyadic audio signals by capturing the interaction dynamics between the two speakers.