Wenhao Huang

Affiliations:
  • 01.AI, Beijing, China
  • Peking University, Key Laboratory of Machine Perception, Beijing, China (former)
  • Beijing University of Posts and Telecommunications, School of Software Engineering, Beijing, China (former)


According to our database, Wenhao Huang authored at least 103 papers between 2010 and 2025.

Bibliography

2025
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents.
CoRR, August, 2025

FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction.
CoRR, August, 2025

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference.
CoRR, August, 2025

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving.
CoRR, July, 2025

First Return, Entropy-Eliciting Explore.
CoRR, July, 2025

A Systematic Analysis of Hybrid Linear Attention.
CoRR, July, 2025

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization.
CoRR, July, 2025

DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World.
CoRR, June, 2025

SciDA: Scientific Dynamic Assessor of LLMs.
CoRR, June, 2025

ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding.
CoRR, May, 2025

MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation.
CoRR, May, 2025

P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark.
CoRR, May, 2025

KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation.
CoRR, May, 2025

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models.
CoRR, May, 2025

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs.
CoRR, April, 2025

Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning.
CoRR, April, 2025

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values.
CoRR, April, 2025

A Comprehensive Survey on Long Context Language Modeling.
CoRR, March, 2025

FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis.
CoRR, March, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation.
CoRR, March, 2025

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models.
CoRR, February, 2025

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines.
CoRR, February, 2025

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models.
CoRR, February, 2025

Aligning Instruction Tuning with Pre-training.
CoRR, January, 2025

COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Steering Protein Family Design through Profile Bayesian Flow.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LIME: Less Is More for MLLM Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Can MLLMs Understand the Deep Implication Behind Chinese Images?
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation.
Trans. Mach. Learn. Res., 2024

TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks.
Trans. Mach. Learn. Res., 2024

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions.
CoRR, 2024

Can MLLMs Understand the Deep Implication Behind Chinese Images?
CoRR, 2024

PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment.
CoRR, 2024

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model.
CoRR, 2024

PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness.
CoRR, 2024

ING-VP: MLLMs cannot Play Easy Vision-based Games Yet.
CoRR, 2024

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.
CoRR, 2024

MIO: A Foundation Model on Multimodal Tokens.
CoRR, 2024

OmniBench: Towards The Future of Universal Omni-Language Models.
CoRR, 2024

LIME: Less Is More for MLLM Evaluation.
CoRR, 2024

Foundation Models for Music: A Survey.
CoRR, 2024

I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm.
CoRR, 2024

MMRA: A Benchmark for Multi-granularity Multi-image Relational Association.
CoRR, 2024

DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion.
CoRR, 2024

LongIns: A Challenging Long-context Instruction-based Exam for LLMs.
CoRR, 2024

GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models.
CoRR, 2024

PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents.
CoRR, 2024

II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models.
CoRR, 2024

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series.
CoRR, 2024

COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning.
CoRR, 2024

Yi: Open Foundation Models by 01.AI.
CoRR, 2024

m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers.
CoRR, 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM.
CoRR, 2024

LLM Agents for Psychology: A Study on Gamified Assessments.
CoRR, 2024

GenDec: A robust generative Question-decomposition method for Multi-hop reasoning.
CoRR, 2024

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark.
CoRR, 2024

Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation.
CoRR, 2024

II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

CMDAG: A Chinese Metaphor Dataset with Annotated Grounds as CoT for Boosting Metaphor Generation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024


PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
LLaSM: Large Language and Speech Model.
CoRR, 2023

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.
CoRR, 2023

Chinese Open Instruction Generalist: A Preliminary Release.
CoRR, 2023

2022
PEMP: Leveraging Physics Properties to Enhance Molecular Property Prediction.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2020
Real-time Transportation Prediction Correction using Reconstruction Error in Deep Learning.
ACM Trans. Knowl. Discov. Data, 2020

2019
Text Assisted Insight Ranking Using Context-Aware Memory Network.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Learning Personalized End-to-End Goal-Oriented Dialog.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2015
Mining Dependencies Considering Time Lag in Spatio-Temporal Traffic Data.
Proceedings of the Web-Age Information Management - 16th International Conference, 2015

Traffic Flow Decomposition and Prediction Based on Robust Principal Component Analysis.
Proceedings of the IEEE 18th International Conference on Intelligent Transportation Systems, 2015

Hybrid Multi-metric K-Nearest Neighbor Regression for Traffic Flow Prediction.
Proceedings of the IEEE 18th International Conference on Intelligent Transportation Systems, 2015

Probabilistic dynamic causal model for temporal data.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Improving deep neural network ensembles using reconstruction error.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Learning Common Metrics for Homogenous Tasks in Traffic Flow Prediction.
Proceedings of the 14th IEEE International Conference on Machine Learning and Applications, 2015

Incorporating temporal smoothness and group structure in learning with incomplete data.
Proceedings of the 12th International Conference on Fuzzy Systems and Knowledge Discovery, 2015

Short-term traffic flow forecasting: Multi-metric KNN with related station discovery.
Proceedings of the 12th International Conference on Fuzzy Systems and Knowledge Discovery, 2015

2014
Deep Architecture for Traffic Flow Prediction: Deep Belief Networks With Multitask Learning.
IEEE Trans. Intell. Transp. Syst., 2014

A Spatial-temporal Topic Segmentation Model for Human Mobile Behavior.
Proceedings of the Web-Age Information Management - 15th International Conference, 2014

Metric-Based Multi-Task Grouping Neural Network for Traffic Flow Forecasting.
Proceedings of the Advances in Neural Networks - ISNN 2014, 2014

Dynamic boosting in deep learning using reconstruction error.
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Deep process neural network for temporal deep learning.
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Traffic zone division using mobile billing data.
Proceedings of the 11th International Conference on Fuzzy Systems and Knowledge Discovery, 2014

2013
Adaptive Weight Optimization for Classification of Imbalanced Data.
Proceedings of the Intelligence Science and Big Data Engineering, 2013

Cost sensitive GPS-based activity recognition.
Proceedings of the 10th International Conference on Fuzzy Systems and Knowledge Discovery, 2013

Hierarchical destination prediction based on GPS history.
Proceedings of the 10th International Conference on Fuzzy Systems and Knowledge Discovery, 2013

Automated urban location annotation on mobile records.
Proceedings of the 10th International Conference on Fuzzy Systems and Knowledge Discovery, 2013

Deep Architecture for Traffic Flow Prediction.
Proceedings of the Advanced Data Mining and Applications - 9th International Conference, 2013

2011
Discrete Trajectory Prediction on Mobile Data.
Proceedings of the Web Technologies and Applications - 13th Asia-Pacific Web Conference, 2011

2010
Systematic Improvement of Monte-Carlo Tree Search with Self-generated Neural-Networks Controllers.
Proceedings of the Learning and Intelligent Optimization, 4th International Conference, 2010

Anchor Points Seeking of Large Urban Crowd Based on the Mobile Billing Data.
Proceedings of the Advanced Data Mining and Applications - 6th International Conference, 2010

