Niladri S. Chatterji

Saminul Haque

Tatsunori Hashimoto

Trans. Mach. Learn. Res., 2023

Random Feature Amplification: Feature Learning and Generalization in Neural Networks.

Spencer Frei

J. Mach. Learn. Res., 2023

Deep linear networks can benignly overfit when shallow ones do.

J. Mach. Learn. Res., 2023

2022

The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks.

J. Mach. Learn. Res., 2022

Foolish Crowds Support Benign Overfitting.

Niladri Shekhar Chatterji

J. Mach. Learn. Res., 2022

Is Importance Weighting Incompatible with Interpolating Classifiers?

Ke Alexander Wang

Saminul Haque

Tatsunori Hashimoto

Proceedings of the Tenth International Conference on Learning Representations, 2022

Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data.

Spencer Frei

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

2021

Why do Gradient Methods Work in Optimization and Sampling?

PhD thesis, 2021

When Does Gradient Descent with Logistic Loss Find Interpolating Two-Layer Networks?

J. Mach. Learn. Res., 2021

Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime.

J. Mach. Learn. Res., 2021

On the Opportunities and Risks of Foundation Models.

et al.

CoRR, 2021

On the Theory of Reinforcement Learning with Once-per-Episode Feedback.

Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations?

Proceedings of the Conference on Learning Theory, 2021

2020

Oracle lower bounds for stochastic gradient sampling algorithms.

CoRR, 2020

The intriguing role of module criticality in the generalization of deep networks.

Behnam Neyshabur

Hanie Sedghi

Proceedings of the 8th International Conference on Learning Representations, 2020

OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits.

Vidya Muthukumar

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Langevin Monte Carlo without smoothness.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Is There an Analog of Nesterov Acceleration for MCMC?

CoRR, 2019

Online learning with kernel losses.

Aldo Pacchiano

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

Sharp Convergence Rates for Langevin Dynamics in the Nonconvex Setting.

CoRR, 2018

On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo.

Proceedings of the 35th International Conference on Machine Learning, 2018

Underdamped Langevin MCMC: A non-asymptotic analysis.

Proceedings of the Conference on Learning Theory, 2018

2017

Alternating minimization for dictionary learning with random initialization.
