Yaowei Li

Orcid: 0009-0006-7325-6439

Affiliations:
  • Peking University, School of ECE, Peng Cheng Laboratory, Beijing, China


According to our database, Yaowei Li authored at least 29 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing.
CoRR, August, 2025

IC-Custom: Diverse Image Customization via In-Context Learning.
CoRR, July, 2025

UP-Person: Unified Parameter-Efficient Transfer Learning for Text-based Person Retrieval.
CoRR, April, 2025

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing.
CoRR, March, 2025

DisPose: Disentangling Pose Guidance for Controllable Human Image Animation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Image Conductor: Precision Control for Interactive Video Synthesis.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
BrushEdit: All-In-One Image Inpainting and Editing.
CoRR, 2024

Vision-Language Models Meet Meteorology: Developing Models for Extreme Weather Events Detection with Heatmaps.
CoRR, 2024

Dance with Labels: Dual-Heterogeneous Label Graph Interaction for Multi-intent Spoken Language Understanding.
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

CLIP-Based Synergistic Knowledge Transfer for Text-based Person Retrieval.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

KC-Prompt: End-To-End Knowledge-Complementary Prompting for Rehearsal-Free Continual Learning.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Towards Multi-modal Sarcasm Detection via Disentangled Multi-grained Multi-modal Distilling.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024

Soul-Mix: Enhancing Multimodal Machine Translation with Manifold Mixup.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Aligner²: Enhancing Joint Multiple Intent Detection and Slot Filling via Adjustive and Forced Cross-Task Alignment.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Exploiting Auxiliary Caption for Video Grounding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Towards Multi-Intent Spoken Language Understanding via Hierarchical Attention and Optimal Transport.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Generating Templated Caption for Video Grounding.
CoRR, 2023

GhostT5: Generate More Features with Cheap Operations to Improve Textless Spoken Question Answering.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

FC-MTLF: A Fine- and Coarse-grained Multi-Task Learning Framework for Cross-Lingual Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

C²A-SLU: Cross and Contrastive Attention for Improving ASR Robustness in Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SSVMR: Saliency-Based Self-Training for Video-Music Retrieval.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Accelerating Multiple Intent Detection and Slot Filling via Targeted Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

DAS-CL: Towards Multimodal Machine Translation via Dual-Level Asymmetric Contrastive Learning.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

