Chenliang Li

Orcid: 0000-0001-9077-3928

Affiliations:
  • Alibaba DAMO Academy, Hangzhou, China


According to our database, Chenliang Li authored at least 45 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Writing-RL: Advancing Long-form Writing via Adaptive Curriculum Reinforcement Learning.
CoRR, June, 2025

MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding.
CoRR, May, 2025

QwenLong-CPRS: Towards ∞-LLMs with Dynamic Context Optimization.
CoRR, May, 2025

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning.
CoRR, May, 2025

WritingBench: A Comprehensive Benchmark for Generative Writing.
CoRR, March, 2025

MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio.
CoRR, March, 2025

2024
ProFuser: Progressive Fusion of Large Language Models.
CoRR, 2024

RoleInteract: Evaluating the Social Interaction of Role-Playing Agents.
CoRR, 2024

Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection.
CoRR, 2024

mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

SocialBench: Sociality Evaluation of Role-Playing Conversational Agents.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Achieving Human Parity on Visual Question Answering.
ACM Trans. Inf. Syst., 2023

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models.
CoRR, 2023

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding.
CoRR, 2023

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks.
CoRR, 2023

Transforming Visual Scene Graphs to Image Captions.
CoRR, 2023

mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality.
CoRR, 2023

ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human.
CoRR, 2023

Adaptively Clustering Neighbor Elements for Image Captioning.
CoRR, 2023

COPA: Efficient Vision-Language Pre-training through Collaborative Object- and Patch-Text Alignment.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video.
Proceedings of the International Conference on Machine Learning, 2023

Learning Trajectory-Word Alignments for Video-Language Tasks.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Transforming Visual Scene Graphs to Image Captions.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
CoRR, 2022

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
Achieving Human Parity on Visual Question Answering.
CoRR, 2021

Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training.
CoRR, 2021

SemVLP: Vision-Language Pre-training by Aligning Semantics at Multiple Levels.
CoRR, 2021

AliMe DA: A Data Augmentation Framework for Question Answering in Cold-start Scenarios.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

MinD at SemEval-2021 Task 6: Propaganda Detection using Transfer Learning and Multimodal Fusion.
Proceedings of the 15th International Workshop on Semantic Evaluation, 2021

E2E-VLP: End-to-End Vision-Language Pre-training Enhanced by Visual Learning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Addressing Semantic Drift in Generative Question Answering with Auxiliary Extraction.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

StructuralLM: Structural Pre-training for Form Understanding.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

A Unified Pretraining Framework for Passage Ranking and Expansion.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Generating Well-Formed Answers by Machine Reading with Stochastic Selector Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
IDST at TREC 2019 Deep Learning Track: Deep Cascade Ranking with Generation-based Document Expansion and Pre-trained Language Modeling.
Proceedings of the Twenty-Eighth Text REtrieval Conference, 2019

Incorporating External Knowledge into Machine Reading for Generative Question Answering.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

