Ge Zhang

ORCID: 0000-0002-0064-2906

Affiliations:

ByteDance Inc.
01.AI (former)
University of Waterloo, Canada (PhD)
Beijing Academy of Artificial Intelligence, China (former)
University of Michigan, Ann Arbor, MI, USA (former)

According to our database, Ge Zhang authored at least 201 papers between 2020 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

TraceGraph: Shared Decision Landscapes for Diagnosing and Improving Agent Trajectories.

CoRR, May, 2026

Balanced Aggregation: Understanding and Fixing Aggregation Bias in GRPO.

CoRR, May, 2026

In-Place Test-Time Training.

CoRR, April, 2026

Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation.

CoRR, April, 2026

Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining.

CoRR, March, 2026

$OneMillion-Bench: How Far are Language Agents from Human Experts?

CoRR, March, 2026

CoTJudger: A Graph-Driven Framework for Automatic Evaluation of Chain-of-Thought Efficiency and Redundancy in LRMs.

CoRR, March, 2026

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization.

CoRR, February, 2026

VeRA: Verified Reasoning Data Augmentation at Scale.

CoRR, February, 2026

WorldTravel: A Realistic Multimodal Travel-Planning Benchmark with Tightly Coupled Constraints.

CoRR, February, 2026

The Optimal Token Baseline: Variance Reduction for Long-Horizon LLM-RL.

CoRR, February, 2026

Context Forcing: Consistent Autoregressive Video Generation with Long Context.

CoRR, February, 2026

BABE: Biology Arena BEnchmark.

CoRR, February, 2026

TabularMath: Evaluating Computational Extrapolation in Tabular Learning via Program-Verified Synthesis.

CoRR, February, 2026

Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities.

CoRR, January, 2026

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation.

CoRR, January, 2026

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning.

CoRR, January, 2026

MMTableBench: A Multi-level Multimodal Benchmark for Reasoning and Layout Complexity in Table QA.

Proceedings of the ACM Web Conference 2026, 2026

MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values.

Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026

PRISM: Probabilistic Reward Model with Inherent Structural Modeling.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements.

CoRR, December, 2025

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space.

CoRR, December, 2025

Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning.

CoRR, December, 2025

CodeSimpleQA: Scaling Factuality in Code Large Language Models.

CoRR, December, 2025

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents.

CoRR, December, 2025

AutoMV: An Automatic Multi-Agent System for Music Video Generation.

CoRR, December, 2025

From Code Foundation Models to Agents and Applications: A Comprehensive Survey and Practical Guide to Code Intelligence.

CoRR, November, 2025

DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains.

CoRR, November, 2025

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs.

CoRR, November, 2025

LPFQA: A Long-Tail Professional Forum-based Benchmark for LLM Evaluation.

CoRR, November, 2025

RLoop: An Self-Improving Framework for Reinforcement Learning with Iterative Policy Initialization.

CoRR, November, 2025

MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity.

CoRR, November, 2025

Scaling Latent Reasoning via Looped Language Models.

CoRR, October, 2025

Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures.

CoRR, October, 2025

A²FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning.

CoRR, October, 2025

ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems.

CoRR, October, 2025

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs.

CoRR, October, 2025

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation.

CoRR, September, 2025

Towards Personalized Deep Research: Benchmarks and Evaluations.

CoRR, September, 2025

VideoScore2: Think before You Score in Generative Video Evaluation.

CoRR, September, 2025

FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning.

CoRR, September, 2025

Reverse-Engineered Reasoning for Open-Ended Generation.

CoRR, September, 2025

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling.

CoRR, August, 2025

M3TQA: Massively Multilingual Multitask Table Question Answering.

CoRR, August, 2025

MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents.

CoRR, August, 2025

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.

CoRR, August, 2025

FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction.

CoRR, August, 2025

VeriGUI: Verifiable Long-Chain GUI Dataset.

CoRR, August, 2025

Efficient Agents: Building Effective Agents While Reducing Cost.

CoRR, August, 2025

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference.

CoRR, August, 2025

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving.

CoRR, July, 2025

IFEvalCode: Controlled Code Generation.

CoRR, July, 2025

Multilingual Multimodal Software Developer for Code Generation.

CoRR, July, 2025

First Return, Entropy-Eliciting Explore.

CoRR, July, 2025

A Systematic Analysis of Hybrid Linear Attention.

CoRR, July, 2025

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving.

CoRR, July, 2025

A Survey on Latent Reasoning.

CoRR, July, 2025

MSCFF-Net: multi-scale context feature fusion network for polyp segmentation.

Multim. Syst., June, 2025

OAgents: An Empirical Study of Building Effective Agents.

CoRR, June, 2025

Scaling Test-time Compute for LLM Agents.

CoRR, June, 2025

SciDA: Scientific Dynamic Assessor of LLMs.

CoRR, June, 2025

TaskCraft: Automated Generation of Agentic Tasks.

CoRR, June, 2025

ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding.

CoRR, May, 2025

P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark.

CoRR, May, 2025

General-Reasoner: Advancing LLM Reasoning Across All Domains.

CoRR, May, 2025

VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation.

CoRR, May, 2025

KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation.

CoRR, May, 2025

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models.

CoRR, May, 2025

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs.

CoRR, April, 2025

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs.

CoRR, April, 2025

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models.

CoRR, March, 2025

A Comprehensive Survey on Long Context Language Modeling.

CoRR, March, 2025

FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis.

CoRR, March, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation.

CoRR, March, 2025

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

CoRR, February, 2025

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models.

CoRR, February, 2025

Audio-FLAN: A Preliminary Release.

CoRR, February, 2025

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models.

CoRR, February, 2025

CryptoX : Compositional Reasoning Evaluation of Large Language Models.

CoRR, February, 2025

BitsAI-CR: Automated Code Review via LLM in Practice.

CoRR, January, 2025

Aligning Instruction Tuning with Pre-training.

CoRR, January, 2025

Generating Symbolic World Models via Test-time Scaling of Large Language Models.

Trans. Mach. Learn. Res., 2025

Long-context LLMs Struggle with Long In-context Learning.

Trans. Mach. Learn. Res., 2025

BitsAI-CR: Automated Code Review via LLM in Practice.

Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, 2025

Overview of the NLPCC 2025 Shared Task: Gender Bias Mitigation Challenge.

Proceedings of the Natural Language Processing and Chinese Computing, 2025

KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

FlexWorld: Progressively Expanding 3D Scenes for Flexible-View Exploration.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MuPT: A Generative Symbolic Music Pretrained Transformer.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

McEval: Massively Multilingual Code Evaluation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

VAMBA: Understanding Hour-Long Videos with Hybrid Mamba-Transformers.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

OAgents: An Empirical Study of Building Effective Agents.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

MIO: A Foundation Model on Multimodal Tokens.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

LIME: Less Is More for MLLM Evaluation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model's Reasoning Path Aggregation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Can MLLMs Understand the Deep Implication Behind Chinese Images?

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation.

Trans. Mach. Learn. Res., 2024

TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks.

Trans. Mach. Learn. Res., 2024

KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model's Reasoning Path Aggregation.

CoRR, 2024

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey.

CoRR, 2024

PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos.

CoRR, 2024

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision.

CoRR, 2024

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models.

CoRR, 2024

MdEval: Massively Multilingual Code Debugging.

CoRR, 2024

M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation.

CoRR, 2024

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions.

CoRR, 2024

Can MLLMs Understand the Deep Implication Behind Chinese Images?

CoRR, 2024

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model.

CoRR, 2024

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models.

CoRR, 2024

ING-VP: MLLMs cannot Play Easy Vision-based Games Yet.

CoRR, 2024

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.

CoRR, 2024

MIO: A Foundation Model on Multimodal Tokens.

CoRR, 2024

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models.

CoRR, 2024

OmniBench: Towards The Future of Universal Omni-Language Models.

CoRR, 2024

LIME: Less Is More for MLLM Evaluation.

CoRR, 2024

Towards a Unified View of Preference Learning for Large Language Models: A Survey.

CoRR, 2024

FuzzCoder: Byte-level Fuzzing Test via Large Language Model.

CoRR, 2024

Foundation Models for Music: A Survey.

CoRR, 2024

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering.

CoRR, 2024

I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm.

CoRR, 2024

MMRA: A Benchmark for Multi-granularity Multi-image Relational Association.

CoRR, 2024

LongIns: A Challenging Long-context Instruction-based Exam for LLMs.

CoRR, 2024

GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models.

CoRR, 2024

PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents.

CoRR, 2024

McEval: Massively Multilingual Code Evaluation.

CoRR, 2024

VCR: Visual Caption Restoration.

CoRR, 2024

II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models.

CoRR, 2024

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series.

CoRR, 2024

MAmmoTH2: Scaling Instructions from the Web.

CoRR, 2024

MuPT: A Generative Symbolic Music Pretrained Transformer.

CoRR, 2024

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model.

CoRR, 2024

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models.

CoRR, 2024

The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis.

CoRR, 2024

COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning.

CoRR, 2024

Yi: Open Foundation Models by 01.AI.

CoRR, 2024

DEEP-ICL: Definition-Enriched Experts for Language Model In-Context Learning.

CoRR, 2024

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding.

CoRR, 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM.

CoRR, 2024

CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for Boosting Metaphor Generation.

CoRR, 2024

CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models.

CoRR, 2024

MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces.

CoRR, 2024

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation.

CoRR, 2024

Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction.

CoRR, 2024

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark.

CoRR, 2024

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models.

CoRR, 2024

Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation.

CoRR, 2024

Overview of the NLPCC 2024 Shared Task on Chinese Metaphor Generation.

Proceedings of the Natural Language Processing and Chinese Computing, 2024

MAmmoTH2: Scaling Instructions from the Web.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

DDK: Distilling Domain Knowledge for Efficient Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

RoleAgent: Building, Interacting, and Benchmarking High-quality Role-Playing Agents from Scripts.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

ComposerX: Multi-Agent Symbolic Music Composition With LLMs.

Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

AutoAgents: A Framework for Automatic Agent Generation.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Massive Editing for Large Language Models via Meta Learning.

Chenmien Tan

Ge Zhang

Jie Fu

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers.

Proceedings of the Computer Vision - ECCV 2024, 2024

MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for Boosting Metaphor Generation.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

E2-LLM: Efficient and Extreme Length Extension of Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Align on the Fly: Adapting Chatbot Behavior to Established Norms.

CoRR, 2023

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.

CoRR, 2023

TPDM: Selectively Removing Positional Information for Zero-shot Translation via Token-Level Position Disentangle Module.

Xingran Chen

Ge Zhang

Jie Fu

CoRR, 2023

Interactive Natural Language Processing.

CoRR, 2023

Chinese Open Instruction Generalist: A Preliminary Release.

CoRR, 2023

CORGI-PM: A Chinese Corpus For Gender Bias Probing and Mitigation.

CoRR, 2023

MARBLE: Music Audio Representation Benchmark for Universal Evaluation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LyricWhiz: Robust Multilingual Zero-Shot Lyrics Transcription by Whispering to ChatGPT.

Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

On the Effectiveness of Speech Self-Supervised Learning for Music.

Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

2022

MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning.

CoRR, 2022

HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022

1Cademy @ Causal News Corpus 2022: Leveraging Self-Training in Causality Classification of Socio-Political Event Data.

Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, 2022

1Cademy @ Causal News Corpus 2022: Enhance Causal Span Detection via Beam-Search-based Position Selector.

Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, 2022

2020

Diverse Melody Generation from Chinese Lyrics via Mutual Information Maximization.

CoRR, 2020
