Haoli Bai

According to our database, Haoli Bai authored at least 68 papers between 2016 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
QuantClaw: Precision Where It Matters for OpenClaw.
CoRR, April, 2026

GRPO-VPS: Enhancing Group Relative Policy Optimization with Verifiable Process Supervision for Effective Reasoning.
CoRR, April, 2026

REAgent: Requirement-Driven LLM Agents for Software Issue Resolution.
CoRR, April, 2026

Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes.
CoRR, March, 2026

BATQuant: Outlier-resilient MXFP4 Quantization via Learnable Block-wise Optimization.
CoRR, March, 2026

Stabilizing Reinforcement Learning for Diffusion Language Models.
CoRR, March, 2026

FastCode: Fast and Cost-Efficient Code Understanding and Reasoning.
CoRR, March, 2026

Merging Beyond: Streaming LLM Updates via Activation-Guided Rotations.
CoRR, February, 2026

OVD: On-policy Verbal Distillation.
CoRR, January, 2026

From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation.
CoRR, January, 2026

What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study.
CoRR, January, 2026

Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats.
CoRR, January, 2026

SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving.
CoRR, January, 2026

The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs.
Trans. Mach. Learn. Res., 2026

2025
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone.
CoRR, December, 2025

Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents.
CoRR, December, 2025

InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search.
CoRR, December, 2025

Dr.Mi-Bench: A Modular-integrated Benchmark for Scientific Deep Research Agent.
CoRR, December, 2025

HERMES: Towards Efficient and Verifiable Mathematical Reasoning in LLMs.
CoRR, November, 2025

E³-Pruner: Towards Efficient, Economical, and Effective Layer Pruning for Large Language Models.
CoRR, November, 2025

ATTS: Asynchronous Test-Time Scaling via Conformal Prediction.
CoRR, September, 2025

MMSearch-Plus: A Simple Yet Challenging Benchmark for Multimodal Browsing Agents.
CoRR, August, 2025

Think Before You Talk: Enhancing Meaningful Dialogue Generation in Full-Duplex Speech Language Models with Planning-Inspired Text Guidance.
CoRR, August, 2025

Fourier-VLM: Compressing Vision Tokens in the Frequency Domain for Large Vision-Language Models.
CoRR, August, 2025

The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs.
CoRR, July, 2025

FreqKV: Frequency Domain Key-Value Compression for Efficient Context Window Extension.
CoRR, May, 2025

Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models.
CoRR, April, 2025

A Simple Linear Patch Revives Layer-Pruned Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

TreeKV: Smooth Key-Value Cache Compression with Tree Structures.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

FlatQuant: Flatness Matters for LLM Quantization.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

WeightedKV: Attention Scores Weighted Key-Value Cache Merging for Large Language Models.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Faster and Better LLMs via Latency-Aware Test-Time Scaling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Efficient Inference for Large Language Models - Algorithm, Model, and System.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions.
CoRR, 2024

S2D: Sorted Speculative Decoding For More Efficient Deployment of Nested Large Language Models.
CoRR, 2024

Visually Guided Generative Text-Layout Pre-training for Document Intelligence.
CoRR, 2024

Visually Guided Generative Text-Layout Pre-training for Document Intelligence.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Plug-and-Play: An Efficient Post-training Pruning Method for Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-Wise Pruning Error Metric.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Structured Pruning for Efficient Generative Pre-trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding.
CoRR, 2022

Towards Efficient Post-training Quantization of Pre-trained Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Dynamically Pruning Segformer for Efficient Semantic Segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Gradient estimation of information measures in deep learning.
Knowl. Based Syst., 2021

Discrete Auto-regressive Variational Attention Models for Text Modeling.
Proceedings of the International Joint Conference on Neural Networks, 2021

DAP-BERT: Differentiable Architecture Pruning of BERT.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

BinaryBERT: Pushing the Limit of BERT Quantization.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Structured pruning of recurrent neural networks through neuron selection.
Neural Networks, 2020

DART: Domain-Adversarial Residual-Transfer networks for unsupervised cross-domain image classification.
Neural Networks, 2020

Bayesian Automatic Model Compression.
IEEE J. Sel. Top. Signal Process., 2020

BinaryBERT: Pushing the Limit of BERT Quantization.
CoRR, 2020

Discrete Variational Attention Models for Language Generation.
CoRR, 2020

Efficient Bitwidth Search for Practical Mixed Precision Neural Network.
CoRR, 2020

Revisiting Parameter Sharing for Automatic Neural Channel Number Search.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

TranSlider: Transfer Ensemble Learning from Exploitation to Exploration.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

M-NAS: Meta Neural Architecture Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

RTN: Reparameterized Ternary Network.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Few Shot Network Compression via Cross Distillation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Variational Random Function Model for Network Modeling.
IEEE Trans. Neural Networks Learn. Syst., 2019

Structured Pruning of Recurrent Neural Networks through Neuron Selection.
CoRR, 2019

2018
Structured Inference for Recurrent Hidden Semi-markov Model.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Neural Relational Topic Models for Scientific Article Analysis.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

2017
Stochastic Sequential Neural Networks with Structured Inference.
CoRR, 2017

Learning from semantically dependent multi-tasks.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

2016
Hierarchical Probabilistic Matrix Factorization with Network Topology for Multi-relational Social Network.
Proceedings of The 8th Asian Conference on Machine Learning, 2016

