Yaowei Li

ORCID: 0009-0006-7325-6439

Affiliations:
  • Peking University, School of ECE, Peng Cheng Laboratory, Beijing, China


According to our database, Yaowei Li authored at least 33 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
4DVD: Cascaded Dense-view Video Diffusion Model for High-quality 4D Content Generation.
Int. J. Comput. Vis., May, 2026

2025
UP-Person: Unified Parameter-Efficient Transfer Learning for Text-Based Person Retrieval.
IEEE Trans. Circuits Syst. Video Technol., December, 2025

GIR-Bench: Versatile Benchmark for Generating Images with Reasoning.
CoRR, October, 2025

ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing.
CoRR, August, 2025

IC-Custom: Diverse Image Customization via In-Context Learning.
CoRR, July, 2025

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing.
CoRR, March, 2025

BlobCtrl: Taming Controllable Blob for Element-level Image Editing.
Proceedings of the SIGGRAPH Asia 2025 Conference Papers, 2025

ClimateIQA: A New Dataset and Benchmark to Advance Vision-Language Models in Meteorology Anomalies Analysis.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.2, 2025

DisPose: Disentangling Pose Guidance for Controllable Human Image Animation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Image Conductor: Precision Control for Interactive Video Synthesis.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
BrushEdit: All-In-One Image Inpainting and Editing.
CoRR, 2024

Vision-Language Models Meet Meteorology: Developing Models for Extreme Weather Events Detection with Heatmaps.
CoRR, 2024

Dance with Labels: Dual-Heterogeneous Label Graph Interaction for Multi-intent Spoken Language Understanding.
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

CLIP-Based Synergistic Knowledge Transfer for Text-Based Person Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2024

KC-Prompt: End-To-End Knowledge-Complementary Prompting for Rehearsal-Free Continual Learning.
Proceedings of the IEEE International Conference on Acoustics, 2024

Towards Multi-modal Sarcasm Detection via Disentangled Multi-grained Multi-modal Distilling.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Soul-Mix: Enhancing Multimodal Machine Translation with Manifold Mixup.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Aligner²: Enhancing Joint Multiple Intent Detection and Slot Filling via Adjustive and Forced Cross-Task Alignment.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Exploiting Auxiliary Caption for Video Grounding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Towards Multi-Intent Spoken Language Understanding via Hierarchical Attention and Optimal Transport.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Generating Templated Caption for Video Grounding.
CoRR, 2023

GhostT5: Generate More Features with Cheap Operations to Improve Textless Spoken Question Answering.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

FC-MTLF: A Fine- and Coarse-grained Multi-Task Learning Framework for Cross-Lingual Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

C²A-SLU: Cross and Contrastive Attention for Improving ASR Robustness in Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SSVMR: Saliency-Based Self-Training for Video-Music Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2023

Accelerating Multiple Intent Detection and Slot Filling via Targeted Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

DAS-CL: Towards Multimodal Machine Translation via Dual-Level Asymmetric Contrastive Learning.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

