Jason D. Lee

Orcid: 0000-0003-0064-7800

Affiliations:

Stanford University, Institute of Computational and Mathematical Engineering

According to our database, Jason D. Lee authored at least 163 papers between 2007 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

Horizon-Free Regret for Linear Markov Decision Processes.

,

,

,

CoRR, 2024

Computational-Statistical Gaps in Gaussian Single-Index Models.

,

Loucas Pillaud-Vivien

,

,

CoRR, 2024

How Well Can Transformers Emulate In-context Newton's Method?

Angeliki Giannou

,

,

,

Dimitris Papailiopoulos

,

CoRR, 2024

How Transformers Learn Causal Structure with Gradient Descent.

,

,

CoRR, 2024

LoRA Training in the NTK Regime has No Spurious Local Minima.

,

,

CoRR, 2024

Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark.

,

,

,

,

,

,

,

,

,

,

,

,

CoRR, 2024

BitDelta: Your Fine-Tune May Only Be Worth One Bit.

,

,

,

,

,

,

CoRR, 2024

An Information-Theoretic Analysis of In-Context Learning.

,

,

,

Benjamin Van Roy

CoRR, 2024

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads.

,

,

,

,

,

,

CoRR, 2024

2023

Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence.

,

,

,

,

,

SIAM J. Optim., June, 2023

Towards Optimal Statistical Watermarking.

,

,

,

,

,

Michael I. Jordan

CoRR, 2023

Optimal Multi-Distribution Learning.

,

,

,

,

CoRR, 2023

Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking.

,

,

,

,

,

CoRR, 2023

Learning Hierarchical Polynomials with Three-Layer Neural Networks.

,

,

CoRR, 2023

Provably Efficient CVaR RL in Low-rank MDPs.

,

,

,

,

,

,

CoRR, 2023

REST: Retrieval-Based Speculative Decoding.

,

,

,

,

CoRR, 2023

Settling the Sample Complexity of Online Reinforcement Learning.

,

,

,

CoRR, 2023

Teaching Arithmetic to Small Transformers.

,

Kartik Sreenivasan

,

,

,

Dimitris Papailiopoulos

CoRR, 2023

Scaling In-Context Demonstrations with Structured Attention.

,

,

,

CoRR, 2023

Solving Robust MDPs through No-Regret Dynamics.

Etash Kumar Guha

,

CoRR, 2023

How to Query Human Feedback Efficiently in RL?

,

Masatoshi Uehara

,

,

CoRR, 2023

Reward Collapse in Aligning Large Language Models.

,

,

,

CoRR, 2023

Provable Offline Reinforcement Learning with Human Feedback.

,

Masatoshi Uehara

,

,

,

CoRR, 2023

Refined Value-Based Offline RL under Realizability and Partial Coverage.

Masatoshi Uehara

,

,

,

CoRR, 2023

Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and Optimal Algorithms.

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability.

,

Vladimir Braverman

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage.

Masatoshi Uehara

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks.

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fine-Tuning Language Models with Just Forward Passes.

Sadhika Malladi

,

,

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning.

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Regret Guarantees for Online Deep Control.

,

,

,

Proceedings of the Learning for Dynamics and Control Conference, 2023

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings.

Masatoshi Uehara

,

,

,

,

Proceedings of the International Conference on Machine Learning, 2023

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing.

,

,

,

Simon Shaolei Du

,

Proceedings of the International Conference on Machine Learning, 2023

Looped Transformers as Programmable Computers.

Angeliki Giannou

,

Shashank Rajput

,

,

,

,

Dimitris Papailiopoulos

Proceedings of the International Conference on Machine Learning, 2023

Efficient displacement convex optimization with particle gradient descent.

Hadi Daneshmand

,

,

Proceedings of the International Conference on Machine Learning, 2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning.

,

,

,

Proceedings of the International Conference on Machine Learning, 2023

PAC Reinforcement Learning for Predictive State Representations.

,

Masatoshi Uehara

,

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games.

,

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Can We Find Nash Equilibria at a Linear Rate in Markov Games?

,

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability.

,

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Provably Efficient Reinforcement Learning via Surprise Bound.

,

,

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Optimal Sample Complexity Bounds for Non-convex Optimization under Kurdyka-Lojasiewicz Condition.

,

,

,

,

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Provable Hierarchy-Based Meta-Reinforcement Learning.

,

,

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

Reconstructing Training Data from Model Gradient, Provably.

,

,

CoRR, 2022

Neural Networks can Learn Representations with Gradient Descent.

,

,

Mahdi Soltanolkotabi

CoRR, 2022

Nearly Minimax Algorithms for Linear Bandits with Shared Representation.

,

,

,

CoRR, 2022

Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems.

Masatoshi Uehara

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias.

,

,

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent.

Christopher De Sa

,

,

,

,

Karthik Sridharan

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials.

,

,

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Towards General Function Approximation in Zero-Sum Markov Games.

,

,

,

Proceedings of the Tenth International Conference on Learning Representations, 2022

Competitive Multi-Agent Reinforcement Learning with Self-Supervised Representation.

,

,

,

H. Vincent Poor

Proceedings of the IEEE International Conference on Acoustics, 2022

Offline Reinforcement Learning with Realizability and Single-policy Concentrability.

,

,

,

,

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK., 2022

Optimization-Based Separations for Neural Networks.

,

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK., 2022

Neural Networks can Learn Representations with Gradient Descent.

Alexandru Damian

,

,

Mahdi Soltanolkotabi

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK., 2022

Provably Efficient Policy Optimization for Two-Player Zero-Sum Markov Games.

,

,

,

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Linearized ADMM Converges to Second-Order Stationary Points for Non-Convex Problems.

,

,

Meisam Razaviyayn

,

IEEE Trans. Signal Process., 2021

On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift.

,

,

,

J. Mach. Learn. Res., 2021

Provable Regret Bounds for Deep Online Learning and Control.

,

,

,

CoRR, 2021

A Short Note on the Relationship of Information Gain and Eluder Dimension.

,

,

,

CoRR, 2021

MUSBO: Model-based Uncertainty Regularized and Sample Efficient Batch Optimization for Deployment Constrained Reinforcement Learning.

,

,

,

H. Vincent Poor

CoRR, 2021

Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games.

,

,

,

CoRR, 2021

Predicting What You Already Know Helps: Provable Self-Supervised Learning.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization.

,

,

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Going Beyond Linear RL: Sample Efficient Neural Function Approximation.

,

,

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Label Noise SGD Provably Prefers Flat Global Minimizers.

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

How Fine-Tuning Allows for Effective Meta-Learning.

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Near-Optimal Linear Regression under Distribution Shift.

,

,

Proceedings of the 38th International Conference on Machine Learning, 2021

Bilinear Classes: A Structural Framework for Provable Generalization in RL.

,

,

,

,

,

,

Proceedings of the 38th International Conference on Machine Learning, 2021

A Theory of Label Propagation for Subpopulation Shift.

,

,

,

Proceedings of the 38th International Conference on Machine Learning, 2021

How Important is the Train-Validation Split in Meta-Learning?

,

,

,

,

,

,

,

Proceedings of the 38th International Conference on Machine Learning, 2021

Impact of Representation Learning in Linear Bandits.

,

,

,

Simon Shaolei Du

Proceedings of the 9th International Conference on Learning Representations, 2021

Few-Shot Learning via Learning the Representation, Provably.

Simon Shaolei Du

,

,

,

,

Proceedings of the 9th International Conference on Learning Representations, 2021

Shape Matters: Understanding the Implicit Bias of the Noise Covariance.

Jeff Z. HaoChen

,

,

,

Proceedings of the Conference on Learning Theory, 2021

Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks.

,

,

,

Proceedings of the Conference on Learning Theory, 2021

2020

Stochastic Subgradient Method Converges on Tame Functions.

,

Dmitriy Drusvyatskiy

,

,

Found. Comput. Math., 2020

Provable Benefits of Representation Learning in Linear Bandits.

,

,

,

CoRR, 2020

Distributed Estimation for Principal Component Analysis: a Gap-free Approach.

,

,

,

CoRR, 2020

Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting.

,

,

,

,

CoRR, 2020

Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity.

,

,

,

CoRR, 2020

Beyond Lazy Training for Over-parameterized Tensor Decomposition.

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot.

,

,

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy.

Edward Moroshko

,

Blake E. Woodworth

,

Suriya Gunasekar

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Generalized Leverage Score Sampling for Neural Networks.

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters.

,

,

,

H. Vincent Poor

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

How to Characterize The Landscape of Overparameterized Convolutional Neural Networks.

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Agnostic $Q$-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Towards Understanding Hierarchical Learning: Benefits of Neural Representations.

,

,

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Optimal transport mapping via input convex neural networks.

Ashok Vardhan Makkuva

,

Amirhossein Taghvaei

,

,

Proceedings of the 37th International Conference on Machine Learning, 2020

SGD Learns One-Layer Networks in WGANs.

,

,

,

Constantinos Daskalakis

Proceedings of the 37th International Conference on Machine Learning, 2020

Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks.

,

Proceedings of the 8th International Conference on Learning Representations, 2020

Kernel and Rich Regimes in Overparametrized Models.

Blake E. Woodworth

,

Suriya Gunasekar

,

,

Edward Moroshko

,

,

,

,

Proceedings of the Conference on Learning Theory, 2020

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes.

,

,

,

Proceedings of the Conference on Learning Theory, 2020

2019

Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks.

Mahdi Soltanolkotabi

,

,

IEEE Trans. Inf. Theory, 2019

First-order methods almost always avoid strict saddle points.

,

Ioannis Panageas

,

Georgios Piliouras

,

,

Michael I. Jordan

,

Math. Program., 2019

When Does Non-Orthogonal Tensor Decomposition Have No Spurious Local Minima?

,

Sina Baharlouei

,

Meisam Razaviyayn

,

CoRR, 2019

Incremental Methods for Weakly Convex Optimization.

,

,

Anthony Man-Cho So

,

CoRR, 2019

Convergence of Adversarial Training in Overparametrized Networks.

,

,

,

,

,

CoRR, 2019

Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods.

,

,

,

,

Meisam Razaviyayn

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Convergence of Adversarial Training in Overparametrized Neural Networks.

,

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Temporal-Difference Learning Converges to Global Optima.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models.

Mor Shpigel Nacson

,

Suriya Gunasekar

,

,

,

Proceedings of the 36th International Conference on Machine Learning, 2019

Gradient Descent Finds Global Minima of Deep Neural Networks.

,

,

,

,

Proceedings of the 36th International Conference on Machine Learning, 2019

Convergence of Gradient Descent on Separable Data.

Mor Shpigel Nacson

,

,

Suriya Gunasekar

,

Pedro Henrique Pamplona Savarese

,

,

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018

Solving Non-Convex Non-Concave Min-Max Games Under Polyak-Łojasiewicz Condition.

,

Meisam Razaviyayn

,

CoRR, 2018

On the Margin Theory of Feedforward Neural Networks.

,

,

,

CoRR, 2018

Provably Correct Automatic Subdifferentiation for Qualified Programs.

,

CoRR, 2018

Convergence of Gradient Descent on Separable Data.

Mor Shpigel Nacson

,

,

Suriya Gunasekar

,

,

CoRR, 2018

Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solutions for Nonconvex Distributed Optimization.

,

,

Meisam Razaviyayn

CoRR, 2018

Solving Approximate Wasserstein GANs to Stationarity.

,

,

Meisam Razaviyayn

,

CoRR, 2018

On the Convergence and Robustness of Training GANs with Regularized Optimal Transport.

,

,

Meisam Razaviyayn

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Adding One Neuron Can Eliminate All Bad Local Minima.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Provably Correct Automatic Sub-Differentiation for Qualified Programs.

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Implicit Bias of Gradient Descent on Linear Convolutional Networks.

Suriya Gunasekar

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced.

,

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks.

,

Meisam Razaviyayn

,

Proceedings of the 35th International Conference on Machine Learning, 2018

Characterizing Implicit Bias in Terms of Optimization Geometry.

Suriya Gunasekar

,

,

,

Proceedings of the 35th International Conference on Machine Learning, 2018

Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima.

,

,

,

,

Barnabás Póczos

Proceedings of the 35th International Conference on Machine Learning, 2018

On the Power of Over-parametrization in Neural Networks with Quadratic Activation.

,

Proceedings of the 35th International Conference on Machine Learning, 2018

No Spurious Local Minima in a Two Hidden Unit ReLU Network.

,

,

Proceedings of the 6th International Conference on Learning Representations, 2018

When is a Convolutional Filter Easy to Learn?

,

,

Proceedings of the 6th International Conference on Learning Representations, 2018

Learning One-hidden-layer Neural Networks with Landscape Design.

,

,

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Communication-efficient Sparse Regression.

,

,

,

Jonathan E. Taylor

J. Mach. Learn. Res., 2017

Distributed Stochastic Variance Reduced Gradient Methods by Sampling Extra Data with Replacement.

,

,

,

J. Mach. Learn. Res., 2017

First-order Methods Almost Always Avoid Saddle Points.

,

Ioannis Panageas

,

Georgios Piliouras

,

,

Michael I. Jordan

,

CoRR, 2017

An inexact subsampled proximal Newton-type method for large-scale machine learning.

,

,

,

CoRR, 2017

A Flexible Framework for Hypothesis Testing in High-dimensions.

,

CoRR, 2017

Gradient Descent Can Take Exponential Time to Escape Saddle Points.

,

,

,

Michael I. Jordan

,

,

Barnabás Póczos

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

On the Learnability of Fully-Connected Neural Networks.

,

,

Martin J. Wainwright

,

Michael I. Jordan

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-dimensional Data.

,

,

Mehrdad Mahdavi

,

,

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Black-box Importance Sampling.

,

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016

Gradient Descent Converges to Minimizers.

,

,

Michael I. Jordan

,

CoRR, 2016

Communication-efficient distributed statistical learning.

Michael I. Jordan

,

,

CoRR, 2016

Matrix Completion has No Spurious Local Minimum.

,

,

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

L1-regularized Neural Networks are Improperly Learnable in Polynomial Time.

,

,

Michael I. Jordan

Proceedings of the 33rd International Conference on Machine Learning, 2016

A Kernelized Stein Discrepancy for Goodness-of-fit Tests.

,

,

Michael I. Jordan

Proceedings of the 33rd International Conference on Machine Learning, 2016

Gradient Descent Only Converges to Minimizers.

,

,

Michael I. Jordan

,

Proceedings of the 29th Conference on Learning Theory, 2016

2015

Matrix completion and low-rank SVD via fast alternating least squares.

,

,

,

J. Mach. Learn. Res., 2015

Learning Halfspaces and Neural Networks with Random Initialization.

,

,

Martin J. Wainwright

,

Michael I. Jordan

CoRR, 2015

ℓ₁-regularized Neural Networks are Improperly Learnable in Polynomial Time.

,

,

Michael I. Jordan

CoRR, 2015

Communication-efficient sparse regression: a one-shot approach.

,

,

,

Jonathan E. Taylor

CoRR, 2015

Distributed Stochastic Variance Reduced Gradient Methods.

,

,

CoRR, 2015

Selective Inference and Learning Mixed Graphical Models.

CoRR, 2015

Evaluating the statistical significance of biclusters.

,

,

Jonathan E. Taylor

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2014

Proximal Newton-Type Methods for Minimizing Composite Functions.

,

,

Michael A. Saunders

SIAM J. Optim., 2014

Exact Post Model Selection Inference for Marginal Screening.

,

Jonathan E. Taylor

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Scalable Methods for Nonnegative Matrix Factorizations of Near-separable Tall-and-skinny Matrices.

Austin R. Benson

,

,

,

David F. Gleich

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013

On model selection consistency of regularized M-estimators.

,

,

Jonathan E. Taylor

CoRR, 2013

On model selection consistency of penalized M-estimators: a geometric theory.

,

,

Jonathan E. Taylor

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Using multiple samples to learn mixture models.

,

Ran Gilad-Bachrach

,

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Structure Learning of Mixed Graphical Models.

,

Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, 2013

2012

Proximal Newton-type Methods for Minimizing Convex Objective Functions in Composite Form.

,

,

Michael A. Saunders

CoRR, 2012

Learning Mixed Graphical Models.

,

CoRR, 2012

Proximal Newton-type methods for convex optimization.

,

,

Michael A. Saunders

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

2010

Practical Large-Scale Optimization for Max-norm Regularization.

,

,

Ruslan Salakhutdinov

,

,

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

2009

A distributed concurrent on-line test scheduling protocol for many-core NoC-based systems.

,

Rabi N. Mahapatra

,

Praveen Bhojwani

Proceedings of the 27th International Conference on Computer Design, 2009

2008

An On-Demand Test Triggering Mechanism for NoC-Based Safety-Critical Systems.

,

,

Praveen Bhojwani

,

Rabi N. Mahapatra

Proceedings of the 9th International Symposium on Quality of Electronic Design (ISQED 2008), 2008

In-field NoC-based SoC testing with distributed test vector storage.

,

Rabi N. Mahapatra

Proceedings of the 26th International Conference on Computer Design, 2008

2007

SAPP: scalable and adaptable peak power management in nocs.

Praveen Bhojwani

,

,

Rabi N. Mahapatra

Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007

A Safety Analysis Framework for COTS Microprocessors in Safety-Critical Applications.

,

Praveen Bhojwani

,

Rabi N. Mahapatra

Proceedings of the Tenth IEEE International Symposium on High Assurance Systems Engineering (HASE 2007), 2007
