Simon S. Du

CoRR, March, 2026

Improving Clinical Diagnosis with Counterfactual Multi-Agent Reasoning.

CoRR, March, 2026

Cold-Start Personalization via Training-Free Priors from Structured World Models.

CoRR, February, 2026

2025

Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback.

CoRR, December, 2025

Understanding the Gain from Data Filtering in Multimodal Contrastive Learning.

Divyansh Pareek

Sewoong Oh

CoRR, December, 2025

ThetaEvolve: Test-time Learning on Open Problems.

CoRR, November, 2025

Convergence Dynamics of Over-Parameterized Score Matching for a Single Gaussian.

CoRR, November, 2025

Global Convergence of Four-Layer Matrix Factorization under Random Initialization.

CoRR, November, 2025

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments.

CoRR, November, 2025

Personalized Reasoning: Just-In-Time Personalization and Why LLMs Fail At It.

CoRR, October, 2025

Spurious Rewards: Rethinking Training Signals in RLVR.

CoRR, June, 2025

Policy-Based Trajectory Clustering in Offline Reinforcement Learning.

Hao Hu

Xinqi Wang

CoRR, June, 2025

Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models.

CoRR, June, 2025

Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixtures.

CoRR, June, 2025

Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO.

CoRR, May, 2025

Reinforcement Learning for Reasoning in Large Language Models with One Training Example.

CoRR, April, 2025

Improving Human-AI Coordination through Adversarial Training and Generative Models.

CoRR, April, 2025

LoRe: Personalizing LLMs via Low-Rank Reward Modeling.

CoRR, April, 2025

Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback.

CoRR, March, 2025

A Minimalist Example of Edge-of-Stability and Progressive Sharpening.

CoRR, March, 2025

Towards Understanding the Benefit of Multitask Representation Learning in Decision Process.

CoRR, March, 2025

SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters.

CoRR, February, 2025

Reinforcement Learning for Reasoning in Large Language Models with One Training Example.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval.

Siting Li

Xiang Gao

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

The Crucial Role of Samplers in Online Direct Preference Optimization.

Ruizhe Shi

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Anytime Acceleration of Gradient Descent.

Proceedings of the Thirty Eighth Annual Conference on Learning Theory, 2025

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination.

Proceedings of the 47th Annual Meeting of the Cognitive Science Society, 2025

Offline Multi-task Transfer RL with Representational Penalization.

Avinandan Bose

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder.

Siting Li

Pang Wei Koh

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning.

J. Data-centric Mach. Learn. Res., 2024

Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration.

CoRR, 2024

On Erroneous Agreements of CLIP Image Embeddings.

Siting Li

Pang Wei Koh

CoRR, 2024

Transformers are Efficient Compilers, Provably.

CoRR, 2024

Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques.

CoRR, 2024

Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning.

CoRR, 2024

Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models.

Weihang Xu

CoRR, 2024

Transferable Reinforcement Learning via Generalized Occupancy Models.

CoRR, 2024

Refined Sample Complexity for Markov Games with Independent Linear Function Approximation.

Yan Dai

CoRR, 2024

Variance Alignment Score: A Simple But Tough-to-Beat Data Selection Method for Multimodal Contrastive Learning.

CoRR, 2024

Distributional Successor Features Enable Zero-Shot Policy Optimization.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models.

Weihang Xu

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Decoding-Time Language Model Alignment with Multiple Objectives.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Understanding the Gains from Repeated Self-Distillation.

Divyansh Pareek

Sewoong Oh

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Learning to Cooperate with Humans using Generative Agents.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Learning Optimal Tax Design in Nonatomic Congestion Games.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Rethinking Transformers in Solving POMDPs.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Horizon-Free Regret for Linear Markov Decision Processes.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization.

Nuoya Xiong

Lijun Ding

Proceedings of the Twelfth International Conference on Learning Representations, 2024

JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Optimal Multi-Distribution Learning.

Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

Settling the sample complexity of online reinforcement learning.

Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

Refined Sample Complexity for Markov Games with Independent Linear Function Approximation (Extended Abstract).

Yan Dai

Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

Reflect-RL: Two-Player Online RL Fine-Tuning for LMs.

Beibin Li

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Integrating the traffic science with representation learning for city-wide network congestion prediction.

Inf. Fusion, November, 2023

Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization.

Trans. Mach. Learn. Res., 2023

Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning.

Trans. Mach. Learn. Res., 2023

LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning.

CoRR, 2023

Optimal Extragradient-Based Algorithms for Stochastic Variational Inequalities with Separable Structure.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Reduction-based Framework for Sequential Decision Making with Delayed Feedback.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Active representation learning for general task space with applications in robotics.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments.

Proceedings of the International Conference on Machine Learning, 2023

Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes.

Ruosong Wang

Proceedings of the International Conference on Machine Learning, 2023

On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness.

Proceedings of the International Conference on Machine Learning, 2023

Improved Active Multi-Task Representation Learning via Lasso.

Proceedings of the International Conference on Machine Learning, 2023

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing.

Proceedings of the International Conference on Machine Learning, 2023

Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Variance-Aware Sparse Linear Bandits.

Yan Dai

Ruosong Wang

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron.

Weihang Xu

Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation.

Kaiqing Zhang

Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Blessing of Class Diversity in Pre-training.

Yulai Zhao

Jianshu Chen

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

Understanding the acceleration phenomenon via high-resolution differential equations.

Math. Program., 2022

Horizon-Free Reinforcement Learning for Latent Markov Decision Processes.

Ruosong Wang

CoRR, 2022

Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization.

CoRR, 2022

Provable General Function Class Representation Learning in Multitask Bandits and MDPs.

CoRR, 2022

Nearly Minimax Algorithms for Linear Bandits with Shared Representation.

CoRR, 2022

Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems.

CoRR, 2022

TransFollower: Long-Sequence Car-Following Trajectory Prediction through Transformer.

CoRR, 2022

When is Offline Two-Player Zero-Sum Markov Game Solvable?

CoRR, 2022

Near-Optimal Randomized Exploration for Tabular Markov Decision Processes.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

On Gap-dependent Bounds for Offline Reinforcement Learning.

Xinqi Wang

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Provable General Function Class Representation Learning in Multitask Bandits and MDP.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning in Congestion Games with Bandit Feedback.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

When are Offline Two-Player Zero-Sum Markov Games Solvable?

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Nearly Optimal Policy Optimization with Stable at Any Time Guarantee.

Proceedings of the International Conference on Machine Learning, 2022

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes.

Proceedings of the International Conference on Machine Learning, 2022

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach.

Proceedings of the International Conference on Machine Learning, 2022

Active Multi-Task Representation Learning.

Yifang Chen

Kevin Jamieson

Proceedings of the International Conference on Machine Learning, 2022

Near-Optimal Algorithms for Autonomous Exploration and Multi-Goal Stochastic Shortest Path.

Haoyuan Cai

Tengyu Ma

Proceedings of the International Conference on Machine Learning, 2022

Denoised MDPs: Learning World Models Better Than the World Itself.

Proceedings of the International Conference on Machine Learning, 2022

A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Provable Adaptation across Multiway Domains via Representation Learning.

Zhili Feng

Shaobo Han

Proceedings of the Tenth International Conference on Learning Representations, 2022

Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies.

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Gap-Dependent Bounds for Two-Player Markov Games.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Near-Linear Time Local Polynomial Nonparametric Estimation with Box Kernels.

Yi Wu

INFORMS J. Comput., 2021

A Benchmark for Low-Switching-Cost Reinforcement Learning.

CoRR, 2021

Towards Demystifying Representation Learning with Non-contrastive Self-supervision.

CoRR, 2021

A Unified Framework for Conservative Exploration.

CoRR, 2021

On the Power of Multitask Representation Learning in Linear MDP.

Rui Lu

Gao Huang

CoRR, 2021

Randomized Exploration is Near-Optimal for Tabular MDP.

Zhihan Xiong

Ruoqi Shen

CoRR, 2021

Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games.

CoRR, 2021

Variance-Aware Confidence Set: Variance-Dependent Bound for Linear Bandits and Horizon-Free Bound for Linear Mixture MDP.

CoRR, 2021

A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost.

CoRR, 2021

When is particle filtering efficient for planning in partially observed linear dynamical systems?

Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization.

Tian Ye

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Nearly Horizon-Free Offline Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Corruption Robust Active Learning.

Yifang Chen

Kevin Jamieson

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Near Optimal Reward-Free Reinforcement Learning.

Proceedings of the 38th International Conference on Machine Learning, 2021

On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP.

Proceedings of the 38th International Conference on Machine Learning, 2021

Bilinear Classes: A Structural Framework for Provable Generalization in RL.

Proceedings of the 38th International Conference on Machine Learning, 2021

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning.

Yifang Chen

Kevin Jamieson

Proceedings of the 38th International Conference on Machine Learning, 2021

Impact of Representation Learning in Linear Bandits.

Proceedings of the 9th International Conference on Learning Representations, 2021

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks.

Ken-ichi Kawarabayashi

Stefanie Jegelka

Proceedings of the 9th International Conference on Learning Representations, 2021

Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization.

Proceedings of the 9th International Conference on Learning Representations, 2021

Few-Shot Learning via Learning the Representation, Provably.

Proceedings of the 9th International Conference on Learning Representations, 2021

Optimism in Reinforcement Learning with Generalized Linear Function Approximation.

Proceedings of the 9th International Conference on Learning Representations, 2021

Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon.

Proceedings of the Conference on Learning Theory, 2021

Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap.

Haike Xu

Tengyu Ma

Proceedings of the Conference on Learning Theory, 2021

Q-learning with Logarithmic Regret.

Kunhe Yang

Lin F. Yang

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

On Stationary-Point Hitting Time and Ergodicity of Stochastic Gradient Langevin Dynamics.

Xi Chen

Xin T. Tong

J. Mach. Learn. Res., 2020

Provable Benefits of Representation Learning in Linear Bandits.

CoRR, 2020

Nearly Minimax Optimal Reward-free Reinforcement Learning.

CoRR, 2020

When is Particle Filtering Efficient for POMDP Sequential Planning?

CoRR, 2020

Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon Reinforcement Learning?

CoRR, 2020

Provably Efficient Exploration for RL with Unsupervised Learning.

CoRR, 2020

Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity.

CoRR, 2020

Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Planning with General Objective Functions: Going Beyond Total Rewards.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On Reward-Free Reinforcement Learning with Linear Function Approximation.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Is Long Horizon RL More Difficult Than Short Horizon RL?

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Agnostic Q-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Provable Representation Learning for Imitation Learning via Bi-level Optimization.

Proceedings of the 37th International Conference on Machine Learning, 2020

What Can Neural Networks Reason About?

Ken-ichi Kawarabayashi

Stefanie Jegelka

Proceedings of the 8th International Conference on Learning Representations, 2020

Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?

Proceedings of the 8th International Conference on Learning Representations, 2020

Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks.

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Gradient Descent for Non-convex Problems in Modern Machine Learning.

PhD thesis, 2019

Enhanced Convolutional Neural Tangent Kernels.

CoRR, 2019

Continuous Control with Contexts, Provably.

CoRR, 2019

Dual Sequential Monte Carlo: Tunneling Filtering and Planning in Continuous POMDPs.

CoRR, 2019

Hitting Time of Stochastic Gradient Langevin Dynamics to Stationary Points: A Direct Analysis.

Xi Chen

Xin T. Tong

CoRR, 2019

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network.

Xiaoxia Wu

Rachel A. Ward

CoRR, 2019

Acceleration via Symplectic Discretization of High-Resolution Differential Equations.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Towards Understanding the Importance of Shortcut Connections in Residual Networks.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

On Exact Computation with an Infinitely Wide Neural Net.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Gradient Descent Finds Global Minima of Deep Neural Networks.

Proceedings of the 36th International Conference on Machine Learning, 2019

Provably efficient RL with Rich Observations via Latent State Decoding.

Proceedings of the 36th International Conference on Machine Learning, 2019

Width Provably Matters in Optimization for Deep Linear Neural Networks.

Wei Hu

Proceedings of the 36th International Conference on Machine Learning, 2019

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks.

Proceedings of the 36th International Conference on Machine Learning, 2019

Gradient Descent Provably Optimizes Over-parameterized Neural Networks.

Proceedings of the 7th International Conference on Learning Representations, 2019

Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity.

Wei Hu

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018

Robust Nonparametric Regression under Huber's ε-contamination Model.

Pradeep Ravikumar

CoRR, 2018

How Many Samples are Needed to Learn a Convolutional Neural Network?

Xiyu Zhai

Ruslan Salakhutdinov

CoRR, 2018

Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps.

Surbhi Goel

CoRR, 2018

Near-Linear Time Local Polynomial Nonparametric Estimation.

Yi Wu

CoRR, 2018

How Many Samples are Needed to Estimate a Convolutional Neural Network?

Xiyu Zhai

Ruslan Salakhutdinov

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced.

Wei Hu

Jason D. Lee

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow.

Xiao Zhang

Quanquan Gu

Proceedings of the 35th International Conference on Machine Learning, 2018

Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms.

Proceedings of the 35th International Conference on Machine Learning, 2018

Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima.

Proceedings of the 35th International Conference on Machine Learning, 2018

On the Power of Over-parametrization in Neural Networks with Quadratic Activation.

Jason D. Lee

Proceedings of the 35th International Conference on Machine Learning, 2018

When is a Convolutional Filter Easy to Learn?

Jason D. Lee

Yuandong Tian

Proceedings of the 6th International Conference on Learning Representations, 2018

Stochastic Zeroth-order Optimization in High Dimensions.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017

Computationally Efficient Robust Estimation of Sparse Functionals.

CoRR, 2017

On the Power of Truncated SVD for General High-rank Matrix Estimation Problems.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Hypothesis Transfer Learning via Transformation Functions.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Gradient Descent Can Take Exponential Time to Escape Saddle Points.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Stochastic Variance Reduction Methods for Policy Evaluation.

Proceedings of the 34th International Conference on Machine Learning, 2017

High-Throughput Robotic Phenotyping of Energy Sorghum Crops.

Srinivasan Vijayarangan

Dimitrios Apostolopoulos

David Wettergreen

Proceedings of the Field and Service Robotics, 2017

Computationally Efficient Robust Sparse Estimation in High Dimensions.

Jerry Li

Proceedings of the 30th Conference on Learning Theory, 2017

2016

Transformation Function Based Methods for Model Shift.

CoRR, 2016

Efficient Nonparametric Smoothness Estimation.

Shashank Singh