Yaohui Wang

ORCID: 0009-0002-9487-6187

Affiliations:

Shanghai Artificial Intelligence Laboratory, China
University of Côte d'Azur, Nice, France (PhD 2021)
INRIA, STARS, Sophia-Antipolis, France (former)

According to our database, Yaohui Wang authored at least 62 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

THEval. Evaluation Framework for Talking Head Video Generation.

,

Baptiste Chopin

,

,

Antitza Dantcheva

CoRR, November, 2025

RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling.

CoRR, October, 2025

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models.

CoRR, October, 2025

Vinci: A Real-time Smart Assistant Based on Egocentric Vision-language Model for Portable Devices.

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., September, 2025

CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models.

CoRR, August, 2025

LIA-X: Interpretable Latent Portrait Animator.

,

,

,

François Brémond

,

,

Antitza Dantcheva

CoRR, August, 2025

Consistent and Controllable Image Animation with Motion Linear Diffusion Transformers.

CoRR, August, 2025

GenHOI: Generalizing Text-driven 4D Human-Object Interaction Synthesis for Unseen Objects.

CoRR, June, 2025

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models.

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

Chen Change Loy

,

,

,

,

Int. J. Comput. Vis., May, 2025

Training-free Stylized Text-to-Image Generation with Fast Inference.

CoRR, May, 2025

LEO: Generative Latent Image Animator for Human Video Synthesis.

,

,

,

,

Antitza Dantcheva

,

,

Int. J. Comput. Vis., March, 2025

AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset.

CoRR, March, 2025

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant.

CoRR, March, 2025

Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation.

Baptiste Chopin

,

Tashvik Dhamija

,

,

,

Antitza Dantcheva

CoRR, February, 2025

Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models.

CoRR, January, 2025

Latte: Latent Diffusion Transformer for Video Generation.

Trans. Mach. Learn. Res., 2025

TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision.

,

,

,

,

,

,

Fangyikang Wang

,

,

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Consistent and Controllable Image Animation with Motion Diffusion Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

LIA: Latent Image Animator.

,

,

François Brémond

,

Antitza Dantcheva

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

View-Invariant Skeleton Action Representation Learning via Motion Retargeting.

,

,

Antitza Dantcheva

,

Lorenzo Garattoni

,

Gianpiero Francesca

,

François Brémond

Int. J. Comput. Vis., July, 2024

Uncertainty-aware image inpainting with adaptive feedback network.

Expert Syst. Appl., January, 2024

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model.

CoRR, 2024

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models.

,

,

,

,

,

,

,

Nattapol Chanpaisit

,

,

,

,

,

,

,

,

,

CoRR, 2024

Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models.

CoRR, 2024

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion.

CoRR, 2024

Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

4Diffusion: Multi-view Video Diffusion Model for 4D Generation.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning.

,

,

,

Zhengyang Liang

,

,

,

Maneesh Agrawala

,

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Vlogger: Make Your Dream A Vlog.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SinSR: Diffusion-Based Image Super-Resolution in a Single Step.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VBench: Comprehensive Benchmark Suite for Video Generative Models.

,

,

,

,

,

,

,

,

,

Nattapol Chanpaisit

,

,

,

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

ConditionVideo: Training-Free Condition-Guided Video Generation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Learning Invariance From Generated Variance for Unsupervised Person Re-Identification.

,

,

,

Antitza Dantcheva

,

François Brémond

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation.

CoRR, 2023

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation.

CoRR, 2023

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning.

CoRR, 2023

LEO: Generative Latent Image Animator for Human Video Synthesis.

,

,

,

Antitza Dantcheva

,

,

CoRR, 2023

Long-Term Rhythmic Video Soundtracker.

Proceedings of the International Conference on Machine Learning, 2023

LAC - Latent Action Composition for Skeleton-based Action Segmentation.

,

,

Antitza Dantcheva

,

,

Lorenzo Garattoni

,

Gianpiero Francesca

,

François Brémond

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Self-Supervised Video Representation Learning via Latent Time Navigation.

,

,

,

Antitza Dantcheva

,

Lorenzo Garattoni

,

Gianpiero Francesca

,

François Brémond

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

ViA: View-invariant Skeleton Action Representation Learning via Motion Retargeting.

,

,

Antitza Dantcheva

,

Lorenzo Garattoni

,

Gianpiero Francesca

,

François Brémond

CoRR, 2022

Latent Image Animator: Learning to Animate Images via Latent Space Navigation.

,

,

François Brémond

,

Antitza Dantcheva

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Learning to Generate Human Videos. (Apprendre à Générer des Vidéos de Personnes).

PhD thesis, 2021

InMoDeGAN: Interpretable Motion Decomposition Generative Adversarial Network for Video Generation.

,

François Brémond

,

Antitza Dantcheva

CoRR, 2021

Selective Spatio-Temporal Aggregation Based Pose Refinement System: Towards Understanding Human Activities in Real-World Videos.

,

,

,

Rupayan Mallick

,

,

Gianpiero Francesca

,

François Brémond

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Self-Supervised Video Pose Representation Learning for Occlusion-Robust Action Recognition.

,

,

Antitza Dantcheva

,

Lorenzo Garattoni

,

Gianpiero Francesca

,

François Brémond

Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition, 2021

Emotion Editing in Head Reenactment Videos using Latent Space Manipulation.

Valeriya Strizhkova

,

,

David Anghelone

,

,

Antitza Dantcheva

,

François Brémond

Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition, 2021

Joint Generative and Contrastive Learning for Unsupervised Person Re-Identification.

,

,

,

Antitza Dantcheva

,

François Brémond

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition.

,

,

Antitza Dantcheva

,

Lorenzo Garattoni

,

Gianpiero Francesca

,

François Brémond

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

ImaGINator: Conditional Spatio-Temporal GAN for Video Generation.

,

,

François Brémond

,

Antitza Dantcheva

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

A video is worth more than 1000 lies. Comparing 3DCNN approaches for detecting deepfakes.

,

Antitza Dantcheva

Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition, 2020

G3AN: Disentangling Appearance and Motion for Video Generation.

,

,

François Brémond

,

Antitza Dantcheva

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

G³AN: This video does not exist. Disentangling motion and appearance for video generation.

,

Piotr Tadeusz Bilinski

,

François Brémond

,

Antitza Dantcheva

CoRR, 2019

2018

Comparing Methods for Assessment of Facial Dynamics in Patients with Major Neurocognitive Disorders.

,

Antitza Dantcheva

,

Jean-Claude Broutart

,

Philippe Robert

,

François Brémond

,

Piotr Tadeusz Bilinski

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

From Attribute-Labels to Faces: Face Generation Using a Conditional Generative Adversarial Network.

,

Antitza Dantcheva

,

François Brémond

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

From attributes to faces: a conditional generative network for face generation.

,

Antitza Dantcheva

,

François Brémond

Proceedings of the 2018 International Conference of the Biometrics Special Interest Group, 2018
