Mengdi Wang

ORCID: 0000-0002-2101-9507

According to our database, Mengdi Wang authored at least 162 papers between 2014 and 2025.

Bibliography

2025
Physics Supernova: AI Agent Matches Elite Gold Medalists at IPhO 2025.
CoRR, September, 2025

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence.
CoRR, July, 2025

AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes.
CoRR, June, 2025

Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models.
CoRR, June, 2025

Toward a Theory of Agents as Tool-Use Decision-Makers.
CoRR, June, 2025

Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution.
CoRR, May, 2025

On Path to Multimodal Historical Reasoning: HistBench and HistAgent.
CoRR, May, 2025

OTC: Optimal Tool Calls via Reinforcement Learning.
CoRR, April, 2025

NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models.
CoRR, April, 2025

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations.
CoRR, February, 2025

Deep Reinforcement Learning for Efficient and Fair Allocation of Healthcare Resources.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Efficient Reinforcement Learning With Impaired Observability: Learning to Act With Delayed and Missing State Observations.
IEEE Trans. Inf. Theory, October, 2024

Redefining the Game: MVAE-DFDPnet's Low-Dimensional Embeddings for Superior Drug-Protein Interaction Predictions.
IEEE J. Biomed. Health Informatics, July, 2024

Teamwork Reinforcement Learning With Concave Utilities.
IEEE Trans. Mob. Comput., May, 2024

Boosting the Convergence of Reinforcement Learning-Based Auto-Pruning Using Historical Data.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February, 2024

Adversarial Attacks on Online Learning to Rank with Stochastic Click Models.
Trans. Mach. Learn. Res., 2024

Author Correction: A 5′ UTR language model for decoding untranslated regions of mRNA and function predictions.
Nat. Mach. Intell., 2024

A 5′ UTR language model for decoding untranslated regions of mRNA and function predictions.
Nat. Mach. Intell., 2024

On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control.
J. Mach. Learn. Res., 2024

LIAR: Leveraging Alignment (Best-of-N) to Jailbreak LLMs in Seconds.
CoRR, 2024

AIME: AI System Optimization via Multiple LLM Evaluators.
CoRR, 2024

Relative-Translation Invariant Wasserstein Distance.
CoRR, 2024

SAIL: Self-Improving Efficient Online Alignment of Large Language Models.
CoRR, 2024

AI Risk Management Should Incorporate Both Safety and Security.
CoRR, 2024

Diffusion Model for Data-Driven Black-Box Optimization.
CoRR, 2024

Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory.
CoRR, 2024

Regularized DeepIV with Model Selection.
CoRR, 2024

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences.
CoRR, 2024

Deep reinforcement learning identifies personalized intermittent androgen deprivation therapy for prostate cancer.
Briefings Bioinform., 2024

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Fast Best-of-N Decoding via Speculative Rejection.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Offline Multitask Representation Learning for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Gradient Guidance for Diffusion Models: An Optimization Perspective.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Transfer Q-star: Principled Decoding for LLM Alignment.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

A Theoretical Perspective for Speculative Decoding Algorithm.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Conversational Dueling Bandits in Generalized Linear Models.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning? A Theoretical Perspective.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Information-Directed Pessimism for Offline Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

MaxMin-RLHF: Alignment with Diverse Human Preferences.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Tree Search-Based Evolutionary Bandits for Protein Sequence Optimization.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Visual Adversarial Examples Jailbreak Aligned Large Language Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Primal-Dual First-Order Methods for Affinely Constrained Multi-block Saddle Point Problems.
SIAM J. Optim., June, 2023

1×N Pattern for Pruning Convolutional Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Learning Good State and Action Representations for Markov Decision Process via Tensor Decomposition.
J. Mach. Learn. Res., 2023

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning.
J. Mach. Learn. Res., 2023

Is Inverse Reinforcement Learning Harder than Standard Reinforcement Learning?
CoRR, 2023

Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks.
CoRR, 2023

Federated Multi-Level Optimization over Decentralized Networks.
CoRR, 2023

Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds.
CoRR, 2023

Deep Reinforcement Learning for Efficient and Fair Allocation of Health Care Resources.
CoRR, 2023

Aligning Agent Policy with Externalities: Reward Design via Bilevel RL.
CoRR, 2023

Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems.
CoRR, 2023

Scaling In-Context Demonstrations with Structured Attention.
CoRR, 2023

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks.
CoRR, 2023

Visual Adversarial Examples Jailbreak Large Language Models.
CoRR, 2023

Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data.
CoRR, 2023

Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Efficient RL with Impaired Observability: Learning to Act with Delayed and Missing State Observations.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Deep Reinforcement Learning for Cost-Effective Medical Diagnosis.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Representation Learning for Low-rank General-sum Markov Games.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Provable Benefits of Representational Transfer in Reinforcement Learning.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Byzantine-Robust Online and Offline Distributed Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Energy system digitization in the era of AI: A three-layered approach toward carbon neutrality.
Patterns, 2022

Learning Markov Models Via Low-Rank Optimization.
Oper. Res., 2022

Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP.
CoRR, 2022

Energy System Digitization in the Era of AI: A Three-Layered Approach towards Carbon Neutrality.
CoRR, 2022

Representation Learning for General-sum Low-rank Markov Games.
CoRR, 2022

Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization.
CoRR, 2022

Communication Efficient Distributed Learning for Kernelized Contextual Bandits.
CoRR, 2022

Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks.
CoRR, 2022

Parameter-Efficient Sparsity for Large Language Models Fine-Tuning.
CoRR, 2022

Offline stochastic shortest path: Learning, evaluation and towards optimality.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach.
Proceedings of the International Conference on Machine Learning, 2022

Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Neural Bandits for Protein Sequence Optimization.
Proceedings of the 56th Annual Conference on Information Sciences and Systems, 2022

Multi-Agent Reinforcement Learning with General Utilities via Decentralized Shadow Reward Actor-Critic.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Cautious Reinforcement Learning via Distributional Risk in the Dual Domain.
IEEE J. Sel. Areas Inf. Theory, 2021

Voting-Based Multiagent Reinforcement Learning for Intelligent IoT.
IEEE Internet Things J., 2021

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient.
CoRR, 2021

MARL with General Utilities via Decentralized Shadow Reward Actor-Critic.
CoRR, 2021

1×N Block Pattern for Network Sparsity.
CoRR, 2021

Bootstrapping Statistical Inference for Off-Policy Evaluation.
CoRR, 2021

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Good State and Action Representations via Tensor Decomposition.
Proceedings of the IEEE International Symposium on Information Theory, 2021

Bootstrapping Fitted Q-Evaluation for Off-Policy Inference.
Proceedings of the 38th International Conference on Machine Learning, 2021

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient.
Proceedings of the 38th International Conference on Machine Learning, 2021

Towards Compact CNNs via Collaborative Compression.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Intermittent Communications in Decentralized Shadow Reward Actor-Critic.
Proceedings of the 2021 60th IEEE Conference on Decision and Control (CDC), 2021

Beyond Cumulative Returns via Reinforcement Learning over State-Action Occupancy Measures.
Proceedings of the 2021 American Control Conference, 2021

Generalization Bounds for Stochastic Saddle Point Problems.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Online Sparse Reinforcement Learning.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Spectral State Compression of Markov Processes.
IEEE Trans. Inf. Theory, 2020

Adaptive Low-Nonnegative-Rank Approximation for State Aggregation of Markov Chains.
SIAM J. Matrix Anal. Appl., 2020

A Single Timescale Stochastic Approximation Method for Nested Stochastic Optimization.
SIAM J. Optim., 2020

Randomized Linear Programming Solves the Markov Decision Problem in Nearly Linear (Sometimes Sublinear) Time.
Math. Oper. Res., 2020

Bridging Exploration and General Function Approximation in Reinforcement Learning: Provably Efficient Kernel and Neural Value Iterations.
CoRR, 2020

Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation.
CoRR, 2020

Variational Policy Gradient Method for Reinforcement Learning with General Utilities.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Generalized Leverage Score Sampling for Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Fast Training of Deep Learning Models over Multiple GPUs.
Proceedings of the Middleware '20: 21st International Middleware Conference, 2020

Model-Based Reinforcement Learning with Value-Targeted Regression.
Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control, 2020

Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound.
Proceedings of the 37th International Conference on Machine Learning, 2020

Model-Based Reinforcement Learning with Value-Targeted Regression.
Proceedings of the 37th International Conference on Machine Learning, 2020

A History-Based Auto-Tuning Framework for Fast and High-Performance DNN Design on GPU.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Improved Sample Complexity for Stochastic Compositional Variance Reduced Gradient.
Proceedings of the 2020 American Control Conference, 2020

Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Sketching Transformed Matrices with Applications to Natural Language Processing.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Multilevel Stochastic Gradient Methods for Nested Composition Optimization.
SIAM J. Optim., 2019

Blessing of massive scale: spatial graphical model estimation with a total cardinality constraint approach.
Math. Program., 2019

Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python.
J. Mach. Learn. Res., 2019

Approximation Hardness for A Class of Sparse Optimization Problems.
J. Mach. Learn. Res., 2019

Continuous Control with Contexts, Provably.
CoRR, 2019

Voting-Based Multi-Agent Reinforcement Learning.
CoRR, 2019

Feature-Based Q-Learning for Two-Player Stochastic Games.
CoRR, 2019

Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound.
CoRR, 2019

Sample-Optimal Parametric Q-Learning with Linear Transition Models.
CoRR, 2019

Online Factorization and Partition of Complex Networks by Random Walk.
Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

Learning low-dimensional state embeddings and metastable clusters from time series data.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Maximum Likelihood Tensor Decomposition of Markov Decision Process.
Proceedings of the IEEE International Symposium on Information Theory, 2019

Characterizing Deep Learning Training Workloads on Alibaba-PAI.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Sample-Optimal Parametric Q-Learning Using Linearly Additive Features.
Proceedings of the 36th International Conference on Machine Learning, 2019

Learning to Control in Metric Space with Optimal Regret.
Proceedings of the 57th Annual Allerton Conference on Communication, Control, and Computing, 2019

2018
Near-optimal stochastic approximation for online principal component estimation.
Math. Program., 2018

Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks.
CoRR, 2018

State Aggregation Learning from Markov Transition Data.
CoRR, 2018

Diffusion Approximations for Online Principal Component Estimation and Global Convergence.
CoRR, 2018

Improved Oracle Complexity for Stochastic Compositional Variance Reduced Gradient.
CoRR, 2018

State Compression of Markov Processes via Empirical Low-Rank Estimation.
CoRR, 2018

Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes.
Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018

Near-Optimal Time and Sample Complexities for Solving Markov Decision Processes with a Generative Model.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Estimation of Markov Chain via Rank-constrained Likelihood.
Proceedings of the 35th International Conference on Machine Learning, 2018

Efficient Deep Learning Inference Based on Model Compression.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Vanishing Price of Decentralization in Large Coordinative Nonconvex Optimization.
SIAM J. Optim., 2017

Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions.
Math. Program., 2017

Dynamic Factorization and Partition of Complex Networks.
CoRR, 2017

Diffusion Approximations for Online Principal Component Estimation and Global Convergence.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions.
Proceedings of the 34th International Conference on Machine Learning, 2017

Finite-sum Composition Optimization via Variance Reduced Gradient Descent.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
Stochastic First-Order Methods with Random Constraint Projection.
SIAM J. Optim., 2016

A stochastic compositional gradient method using Markov samples.
Proceedings of the Winter Simulation Conference, 2016

An online primal-dual method for discounted Markov decision processes.
Proceedings of the 55th IEEE Conference on Decision and Control, 2016

2015
Incremental constraint projection methods for variational inequalities.
Math. Program., 2015

A Distributed Tracking Algorithm for Reconstruction of Graph Signals.
IEEE J. Sel. Top. Signal Process., 2015

Random Multi-Constraint Projection: Stochastic Gradient Methods for Convex Optimization with Many Constraints.
CoRR, 2015

Averaging random projection: A fast online solution for large-scale constrained stochastic optimization.
Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

2014
Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems.
Math. Oper. Res., 2014

Multi-task nonconvex optimization with total budget constraint: A distributed algorithm using Monte Carlo estimates.
Proceedings of the 19th International Conference on Digital Signal Processing, 2014

Learning distributed jointly sparse systems by collaborative LMS.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014
