Kai Chen
ORCID: 0000-0002-6820-2325
Affiliations:
- Shanghai AI Laboratory, Guangzhou, China
- SenseTime Research, Hong Kong
- Chinese University of Hong Kong, MMLab, Hong Kong (PhD 2019)
According to our database, Kai Chen authored at least 186 papers between 2017 and 2025.
Collaborative distances:
Links
Online presence:
- on orcid.org
- on github.com
- on chenkai.site
Bibliography
2025
CoRR, August, 2025
CoRR, August, 2025
InternBootcamp Technical Report: Boosting LLM Reasoning with Verifiable Task Scaling.
CoRR, August, 2025
CoRR, August, 2025
IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards.
CoRR, August, 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward.
CoRR, August, 2025
Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning.
CoRR, August, 2025
IEEE Trans. Pattern Anal. Mach. Intell., July, 2025
Smooth Reading: Bridging the Gap of Recurrent LLM to Self-Attention LLM on Long-Context Tasks.
CoRR, July, 2025
CoRR, July, 2025
MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation.
CoRR, July, 2025
CoRR, July, 2025
CoRR, July, 2025
CoRR, July, 2025
CoRR, June, 2025
Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2025
IEEE Trans. Pattern Anal. Mach. Intell., May, 2025
MedSentry: Understanding and Mitigating Safety Risks in Medical LLM Multi-Agent Systems.
CoRR, May, 2025
CoRR, May, 2025
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space.
CoRR, April, 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.
CoRR, April, 2025
CoRR, March, 2025
CoRR, March, 2025
CoRR, March, 2025
MDTeamGPT: A Self-Evolving LLM-based Multi-Agent Framework for Multi-Disciplinary Team Medical Consultation.
CoRR, March, 2025
CoRR, February, 2025
CoRR, February, 2025
CoRR, February, 2025
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation.
CoRR, January, 2025
CoRR, January, 2025
CoRR, January, 2025
CoRR, January, 2025
Quantum circuit mapping based on discrete particle swarm optimization and deep reinforcement learning.
Swarm Evol. Comput., 2025
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
A Self-Evolving Framework for Multi-Agent Medical Consultation Based on Large Language Models.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Prompting Large Language Models to Tackle the Full Software Development Lifecycle: A Case Study.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
2024
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024
Trans. Mach. Learn. Res., 2024
IEEE Geosci. Remote. Sens. Lett., 2024
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions.
CoRR, 2024
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.
CoRR, 2024
CoRR, 2024
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems.
CoRR, 2024
CoRR, 2024
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models.
CoRR, 2024
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024
CoRR, 2024
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset For Large-Scale Speech Generation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Scaling Behavior for Large Language Models regarding Numeral Systems: An Example using Pythia.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
A Task Is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting.
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
CoRR, 2023
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE Trans. Pattern Anal. Mach. Intell., 2022
CoRR, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Proceedings of the Computer Vision - ECCV 2020, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
2017
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017