Hao Liang

Affiliations:
  • Peking University, Center for Data Science, Beijing, China


According to our database, Hao Liang authored at least 38 papers between 2024 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Rethinking Text-to-SQL: Dynamic Multi-turn SQL Interaction for Real-world Database Exploration.
CoRR, October, 2025

Jarvis: Towards Personalized AI Assistant via Personal KV-Cache Retrieval.
CoRR, October, 2025

CapGeo: A Caption-Assisted Approach to Geometric Reasoning.
CoRR, October, 2025

DARO: Difficulty-Aware Reweighting Policy Optimization.
CoRR, October, 2025

LongCat-Flash-Thinking Technical Report.
CoRR, September, 2025

Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge.
CoRR, September, 2025

Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models.
CoRR, June, 2025

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions.
CoRR, June, 2025

LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning.
CoRR, June, 2025

UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens.
CoRR, May, 2025

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts.
CoRR, May, 2025

Let's Verify Math Questions Step by Step.
CoRR, May, 2025

Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning.
CoRR, May, 2025

Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization.
CoRR, March, 2025

Evaluating and Predicting Distorted Human Body Parts for Generated Images.
CoRR, March, 2025

MathClean: A Benchmark for Synthetic Mathematical Data Cleaning.
CoRR, February, 2025

MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification.
CoRR, February, 2025

Baichuan-Omni-1.5 Technical Report.
CoRR, January, 2025

Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

PAS: Plug-and-Play Prompt Augmentation System.
Proceedings of the 41st IEEE International Conference on Data Engineering, 2025

Training Data Distribution Estimation for Optimized Pre-training Data Management.
Proceedings of the 41st IEEE International Conference on Data Engineering, 2025

CFBench: A Comprehensive Constraints-Following Benchmark for LLMs.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
MC-LLaVA: Multi-Concept Personalized Vision-Language Model.
CoRR, 2024

EVQAScore: Efficient Video Question Answering Data Evaluation.
CoRR, 2024

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction.
CoRR, 2024

Baichuan Alignment Technical Report.
CoRR, 2024

Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models.
CoRR, 2024

BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search.
CoRR, 2024

Data Proportion Detection for Optimized Data Management for Large Language Models.
CoRR, 2024

MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark.
CoRR, 2024

CFBench: A Comprehensive Constraints-Following Benchmark for LLMs.
CoRR, 2024

Synth-Empathy: Towards High-Quality Synthetic Empathy Data.
CoRR, 2024

SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models.
CoRR, 2024

PAS: Data-Efficient Plug-and-Play Prompt Augmentation System.
CoRR, 2024

KeyVideoLLM: Towards Large-scale Video Keyframe Selection.
CoRR, 2024

Efficient-Empathy: Towards Efficient and Effective Selection of Empathy Data.
CoRR, 2024

A Survey of Multimodal Large Language Model from A Data-centric Perspective.
CoRR, 2024

