Huazheng Wang

Orcid: 0000-0003-3918-6925

According to our database, Huazheng Wang authored at least 88 papers between 2015 and 2026.

Bibliography

2026
Speculative Pipeline Decoding: Higher-Accuracy and Zero-Bubble Speculation via Pipeline Parallelism.
CoRR, May, 2026

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs.
CoRR, May, 2026

MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning.
CoRR, May, 2026

Towards Affordable Energy: A Gymnasium Environment for Electric Utility Demand-Response Programs.
CoRR, May, 2026

EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales.
CoRR, May, 2026

Beyond Static Bias: Adaptive Multi-Fidelity Bandits with Improving Proxies.
CoRR, May, 2026

Embodied LLM Agents Learn to Cooperate in Organized Teams.
IEEE Trans. Comput. Soc. Syst., April, 2026

When Can You Poison Rewards? A Tight Characterization of Reward Poisoning in Linear MDPs.
CoRR, April, 2026

Density-aware Soft Context Compression with Semi-Dynamic Compression Ratio.
CoRR, March, 2026

Live-Evo: Online Evolution of Agentic Memory from Continuous Feedback.
CoRR, February, 2026

Enhancing MLLMs for Online Understanding in Video Services via Preference Optimization.
IEEE Trans. Serv. Comput., 2026

LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking.
Trans. Mach. Learn. Res., 2026

A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence.
Trans. Mach. Learn. Res., 2026

Erasing Without Remembering: Implicit Knowledge Forgetting in Large Language Models.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Example Quality Matters: Multi-Aspects Example Augmentation for Private Library Programming.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Bridging the Tokenizer Gap: Semantics and Distribution-aware Knowledge Transfer for Unbiased Cross-Tokenizer Distillation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Sliding Window Attention Adaptation.
CoRR, December, 2025

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence.
CoRR, July, 2025

Divide, Optimize, Merge: Fine-Grained LLM Agent Optimization at Scale.
CoRR, May, 2025

Erasing Without Remembering: Safeguarding Knowledge Forgetting in Large Language Models.
CoRR, February, 2025

Fair Online Influence Maximization.
Trans. Mach. Learn. Res., 2025

Hard Work Does Not Always Pay Off: On the Robustness of NAS to Data Poisoning.
Trans. Mach. Learn. Res., 2025

Do LVLMs Truly Understand Video Anomalies? Revealing Hallucination via Co-Occurrence Patterns.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Design-Based Bandits Under Network Interference: Trade-Off Between Regret and Statistical Inference.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Evaluating and Mitigating Object Hallucination in Large Vision-Language Models: Can They Still See Removed Objects?
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Provably Efficient Algorithm for Best Scoring Rule Identification in Online Principal-Agent Information Acquisition.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Mitigating Object Hallucination in Large Vision-Language Models via Visual Attention Direct Preference Optimization.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Unveiling Internal Reasoning Modes in LLMs: A Deep Dive into Latent Reasoning vs. Factual Shortcuts with Attribute Rate Ratio.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

The Ranking Blind Spot: Decision Hijacking in LLM-based Text Ranking.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Divide, Optimize, Merge: Scalable Fine-Grained Generative Optimization for LLM Agents.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

SimpleDoc: Multi-Modal Document Understanding with Dual-Cue Page Retrieval and Iterative Refinement.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

The Threat of PROMPTS in Large Language Models: A System and User Prompt Perspective.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Efficient and Robust Reinforcement Learning from Human Feedback.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

FCOM: A Federated Collaborative Online Monitoring Framework via Representation Learning.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Adversarial Attacks on Online Learning to Rank with Stochastic Click Models.
Trans. Mach. Learn. Res., 2024

Memory-Augmented Agent Training for Business Document Understanding.
CoRR, 2024

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning.
CoRR, 2024

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling.
CoRR, 2024

Contractual Reinforcement Learning: Pulling Arms with Invisible Hands.
CoRR, 2024

Hard Work Does Not Always Pay Off: Poisoning Attacks on Neural Architecture Search.
CoRR, 2024

AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks.
CoRR, 2024

Pure Exploration in Asynchronous Federated Bandits.
Proceedings of the Uncertainty in Artificial Intelligence, 2024

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MDR: Model-Specific Demonstration Retrieval at Inference Time for In-Context Learning.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Video Anomaly Detection via Progressive Learning of Multiple Proxy Tasks.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Conversational Dueling Bandits in Generalized Linear Models.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Adversarial Attacks on Combinatorial Multi-Armed Bandits.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SSS: Editing Factual Knowledge in Language Models towards Semantic Sparse Space.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Tree Search-Based Evolutionary Bandits for Protein Sequence Optimization.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Multi-Agent Join.
CoRR, 2023

Aligning Agent Policy with Externalities: Reward Design via Bilevel RL.
CoRR, 2023

Online Modeling and Monitoring of Dependent Processes under Resource Constraints.
CoRR, 2023

Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems.
CoRR, 2023

Machine Learning for Synthetic Data Generation: a Review.
CoRR, 2023

Incentivizing Exploration in Linear Contextual Bandits under Information Gap.
Proceedings of the 17th ACM Conference on Recommender Systems, 2023

Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP.
Proceedings of the International Conference on Machine Learning, 2023

Learning Kernelized Contextual Bandits in a Distributed and Asynchronous Environment.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

How Does Diffusion Influence Pretrained Language Models on Out-of-Distribution Data?
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

2022
Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization.
CoRR, 2022

Dynamic Global Sensitivity for Differentially Private Contextual Bandits.
Proceedings of the RecSys '22: Sixteenth ACM Conference on Recommender Systems, Seattle, WA, USA, September 18, 2022

Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Communication Efficient Distributed Learning for Kernelized Contextual Bandits.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

When Are Linear Stochastic Bandits Attackable?
Proceedings of the International Conference on Machine Learning, 2022

2021
Unbiased Learning to Rank: Online or Offline?
ACM Trans. Inf. Syst., 2021

Incentivizing Exploration in Linear Bandits under Information Gap.
CoRR, 2021

PairRank: Online Pairwise Learning to Rank by Divide-and-Conquer.
Proceedings of the WWW '21: The Web Conference 2021, 2021

Interactive Information Retrieval with Bandit Feedback.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

2020
A Smoothed Analysis of Online Lasso for the Sparse Linear Contextual Bandit Problem.
CoRR, 2020

Global and Local Differential Privacy for Collaborative Bandits.
Proceedings of the RecSys 2020: Fourteenth ACM Conference on Recommender Systems, 2020

Learning by Exploration: New Challenges in Real-World Environments.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

GeLaiGeLai: A visual platform for analysis of Classical Chinese Poetry based on Knowledge Graph.
Proceedings of the 2020 IEEE International Conference on Knowledge Graph, 2020

Incentivized Exploration for Multi-Armed Bandits under Reward Drift.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Dynamic Ensemble of Contextual Bandits to Satisfy Users' Changing Interests.
Proceedings of the World Wide Web Conference, 2019

Variance Reduction in Gradient Exploration for Online Learning to Rank.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Factorization Bandits for Online Influence Maximization.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

Adversarial Domain Adaptation for Machine Reading Comprehension.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Efficient Exploration of Gradient Space for Online Learning to Rank.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

2017
Factorization Bandits for Interactive Recommendation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Contextual Bandits in a Collaborative Environment.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Solving Verbal Questions in IQ Test by Knowledge-Powered Word Embedding.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Learning Hidden Features for Contextual Bandits.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

2015
Solving Verbal Comprehension Questions in IQ Test by Knowledge-Powered Word Embedding.
CoRR, 2015
