Yaowei Li

Orcid: 0009-0006-7325-6439

Affiliations:

Peking University, School of ECE, Peng Cheng Laboratory, Beijing, China

According to our database¹, Yaowei Li authored at least 29 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of four.

Bibliography

2025

ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing.

CoRR, August, 2025

IC-Custom: Diverse Image Customization via In-Context Learning.

CoRR, July, 2025

UP-Person: Unified Parameter-Efficient Transfer Learning for Text-based Person Retrieval.

CoRR, April, 2025

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing.

CoRR, March, 2025

DisPose: Disentangling Pose Guidance for Controllable Human Image Animation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Image Conductor: Precision Control for Interactive Video Synthesis.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

BrushEdit: All-In-One Image Inpainting and Editing.

CoRR, 2024

Vision-Language Models Meet Meteorology: Developing Models for Extreme Weather Events Detection with Heatmaps.

CoRR, 2024

Dance with Labels: Dual-Heterogeneous Label Graph Interaction for Multi-intent Spoken Language Understanding.

Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation.

Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

CLIP-Based Synergistic Knowledge Transfer for Text-based Person Retrieval.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

KC-Prompt: End-To-End Knowledge-Complementary Prompting for Rehearsal-Free Continual Learning.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Towards Multi-modal Sarcasm Detection via Disentangled Multi-grained Multi-modal Distilling.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024

Soul-Mix: Enhancing Multimodal Machine Translation with Manifold Mixup.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Aligner²: Enhancing Joint Multiple Intent Detection and Slot Filling via Adjustive and Forced Cross-Task Alignment.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Exploiting Auxiliary Caption for Video Grounding.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Towards Multi-Intent Spoken Language Understanding via Hierarchical Attention and Optimal Transport.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Generating Templated Caption for Video Grounding.

CoRR, 2023

GhostT5: Generate More Features with Cheap Operations to Improve Textless Spoken Question Answering.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

FC-MTLF: A Fine- and Coarse-grained Multi-Task Learning Framework for Cross-Lingual Spoken Language Understanding.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

C²A-SLU: Cross and Contrastive Attention for Improving ASR Robustness in Spoken Language Understanding.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Unify, Align and Refine: Multi-Level Semantic Alignment for Radiology Report Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SSVMR: Saliency-Based Self-Training for Video-Music Retrieval.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Accelerating Multiple Intent Detection and Slot Filling via Targeted Knowledge Distillation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

DAS-CL: Towards Multimodal Machine Translation via Dual-Level Asymmetric Contrastive Learning.

Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023
