Yan Gao

ORCID: 0009-0004-5960-1684

Affiliations:
  • Xiaohongshu Inc., Beijing, China
  • Alibaba Group, Beijing, China
  • Chinese Academy of Sciences, Institute of Computing Technology, State Key Laboratory of Computer Architecture, Beijing, China


According to our database, Yan Gao authored at least 62 papers between 2010 and 2026.

Bibliography

2026
Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation.
CoRR, May, 2026

UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems.
CoRR, May, 2026

Knowledge-Graph Paths as Intermediate Supervision for Self-Evolving Search Agents.
CoRR, May, 2026

Focus on the Core: Empowering Diffusion Large Language Models by Self-Contrast.
CoRR, May, 2026

From a Social Cognitive Perspective: Context-Aware Visual Social Relationship Recognition.
IEEE Trans. Neural Networks Learn. Syst., April, 2026

SPARD: Self-Paced Curriculum for RL Alignment via Integrating Reward Dynamics and Data Utility.
CoRR, April, 2026

PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training.
CoRR, April, 2026

Aligning Large Language Models with Searcher Preferences.
CoRR, March, 2026

Self-Compression of Chain-of-Thought via Multi-Agent Reinforcement Learning.
CoRR, January, 2026

JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG.
CoRR, January, 2026

Anti-Length Shift: Dynamic Outlier Truncation for Training Efficient Reasoning Models.
CoRR, January, 2026

2025
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle.
CoRR, December, 2025

Towards Fine-Grained Code-Switch Speech Translation with Semantic Space Alignment.
CoRR, November, 2025

TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework.
CoRR, November, 2025

RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios.
CoRR, September, 2025

Decomposed Reasoning with Reinforcement Learning for Relevance Assessment in UGC Platforms.
CoRR, August, 2025

RedOne: Revealing Domain-specific LLM Post-Training in Social Networking Services.
CoRR, July, 2025

Plan Your Travel and Travel with Your Plan: Wide-Horizon Planning and Evaluation via LLM.
CoRR, June, 2025

PaRT: Enhancing Proactive Social Chatbots with Personalized Real-Time Retrieval.
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025

Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions.
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025

RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

MoDification: Mixture of Depths Made Easy.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

NoteLLM-2: Multimodal Large Representation Models for Recommendation.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Think-Search-Patch: A Retrieval-Augmented Reasoning Framework for Repository-Level Code Repair.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

ZigZagKV: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Towards the Law of Capacity Gap in Distilling Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition.
Int. J. Comput. Vis., November, 2024

TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model.
ACM Trans. Knowl. Discov. Data, August, 2024

ScalingNote: Scaling up Retrievers with Large Language Models for Real-World Dense Retrieval.
CoRR, 2024

Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents.
CoRR, 2024

NoteLLM-2: Multimodal Large Representation Models for Recommendation.
CoRR, 2024

From Image to Video, what do we need in multimodal LLMs?
CoRR, 2024

Agent Group Chat: An Interactive Group Chat Simulacra For Better Eliciting Collective Emergent Behavior.
CoRR, 2024

NoteLLM: A Retrievable Large Language Model for Note Recommendation.
CoRR, 2024

NoteLLM: A Retrievable Large Language Model for Note Recommendation.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, 2024

Vript: A Video Is Worth Thousands of Words.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Are Mixture-of-Modality-Experts Transformers Robust to Missing Modality During Training and Inferring?
Proceedings of the Intelligent Information Processing XII, 2024

Caseg: Clip-Based Action Segmentation With Learnable Text Prompt.
Proceedings of the IEEE International Conference on Image Processing, 2024

DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Efficient k-Nearest-Neighbor Machine Translation with Dynamic Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Towards the Law of Capacity Gap in Distilling Language Models.
CoRR, 2023

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation.
CoRR, 2023

MVP-SEG: Multi-view Prompt Learning for Open-Vocabulary Semantic Segmentation.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

2INER: Instructive and In-Context Learning on Few-Shot Named Entity Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

OvarNet: Towards Open-Vocabulary Object Attribute Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Occluded Video Instance Segmentation: A Benchmark.
Int. J. Comput. Vis., 2022

Unified QA-aware Knowledge Graph Generation Based on Multi-modal Modeling.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

NFormer: Robust Person Re-identification with Neighbor Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Occluded Video Instance Segmentation.
CoRR, 2021

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Decoupled IoU Regression for Object Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020
1st Place Solutions for OpenImage2019 - Object Detection and Instance Segmentation.
CoRR, 2020

2019
Utilizing the Instability in Weakly Supervised Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

2010
Chinese Word Sense Induction based on Hierarchical Clustering Algorithm.
Proceedings of the CIPS-SIGHAN Joint Conference on Chinese Language Processing, 2010

