Yair Carmon

Orcid: 0000-0001-5731-8640

According to our database, Yair Carmon authored at least 53 papers between 2008 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

The Sample Complexity of Parameter-Free Stochastic Convex Optimization.

CoRR, June, 2025

An Analytical Model for Overparameterized Learning Under Class Imbalance.

Eliav Mor

Yair Carmon

Trans. Mach. Learn. Res., 2025

Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining.

Mikey Shechter

Yair Carmon

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Convergence of Clipped SGD on Convex (L₀, L₁)-Smooth Functions.

Ofir Gaash

Kfir Y. Levy

Yair Carmon

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Extracting Dual Solutions via Primal Optimizers.

Proceedings of the 16th Innovations in Theoretical Computer Science Conference, 2025

2024

Language models scale reliably with over-training and on downstream tasks.

CoRR, 2024

A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions.

Proceedings of the 2024 ACM-SIAM Symposium on Discrete Algorithms, 2024

Resolving Discrepancies in Compute-Optimal Scaling of Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

DataComp-LM: In search of the next generation of training sets for language models.

Khyathi Raghavi Chandu

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Accelerated Parameter-Free Stochastic Optimization.

Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

The Price of Adaptivity in Stochastic Convex Optimization.

Yair Carmon

Oliver Hinder

Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023

Lower bounds for non-convex stochastic optimization.

Math. Program., May, 2023

DataComp: In search of the next generation of multimodal datasets.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond.

Proceedings of the International Conference on Machine Learning, 2023

DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule.

Maor Ivgi

Oliver Hinder

Yair Carmon

Proceedings of the International Conference on Machine Learning, 2023

Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

ReSQueing Parallel and Private Stochastic Convex Optimization.

Proceedings of the 64th IEEE Annual Symposium on Foundations of Computer Science, 2023

2022

Malign Overfitting: Interpolation Can Provably Preclude Invariance.

CoRR, 2022

Optimal and Adaptive Monteiro-Svaiter Acceleration.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Distributionally Robust Optimization via Ball Oracle Acceleration.

Yair Carmon

Danielle Hausler

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time.

Raphael Gontijo Lopes

Proceedings of the International Conference on Machine Learning, 2022

RECAPP: Crafting a More Efficient Catalyst for Convex Optimization.

Proceedings of the International Conference on Machine Learning, 2022

Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments.

Maor Ivgi

Yair Carmon

Jonathan Berant

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Making SGD Parameter-Free.

Yair Carmon

Oliver Hinder

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

2021

Lower bounds for finding stationary points II: first-order methods.

Math. Program., 2021

Stochastic Bias-Reduced Gradient Methods.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Never Go Full Batch (in Stochastic Convex Optimization).

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization.

Proceedings of the 38th International Conference on Machine Learning, 2021

Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss.

Proceedings of the Conference on Learning Theory, 2021

2020

First-Order Methods for Nonconvex Quadratic Minimization.

Yair Carmon

John C. Duchi

SIAM Rev., 2020

Lower bounds for finding stationary points I.

Math. Program., 2020

Large-Scale Methods for Distributionally Robust Optimization.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Acceleration with a Ball Optimization Oracle.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Coordinate Methods for Matrix Games.

Proceedings of the 61st IEEE Annual Symposium on Foundations of Computer Science, 2020

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations.

Proceedings of the Conference on Learning Theory, 2020

2019

Gradient Descent Finds the Cubic-Regularized Nonconvex Newton Step.

Yair Carmon

John C. Duchi

SIAM J. Optim., 2019

Unlabeled Data Improves Adversarial Robustness.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Variance Reduction for Matrix Games.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Rank-1 Sketch for Matrix Multiplicative Weights.

Proceedings of the Conference on Learning Theory, 2019

2018

Accelerated Methods for NonConvex Optimization.

SIAM J. Optim., 2018

Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems.

Yair Carmon

John C. Duchi

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

2017

"Convex Until Proven Guilty": Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions.

Proceedings of the 34th International Conference on Machine Learning, 2017

2016

Information, Estimation, and Lookahead in the Gaussian Channel.

IEEE Trans. Signal Process., 2016

No bad local minima: Data independent training error guarantees for multilayer neural networks.

Daniel Soudry

Yair Carmon

CoRR, 2016

Accelerated Methods for Non-Convex Optimization.

CoRR, 2016

Gradient Descent Efficiently Finds the Cubic-Regularized Non-Convex Newton Step.

Yair Carmon

John C. Duchi

CoRR, 2016

2015

Comparison of the Achievable Rates in OFDM and Single Carrier Modulation with I.I.D. Inputs.

Yair Carmon

Shlomo Shamai

Tsachy Weissman

IEEE Trans. Inf. Theory, 2015

Lower Bounds and Approximations for the Information Rate of the ISI Channel.

Yair Carmon

Shlomo Shamai

IEEE Trans. Inf. Theory, 2015

2013

The role of lookahead in estimation under Gaussian noise.

Proceedings of the 2013 IEEE International Symposium on Information Theory, 2013

2012

On information, estimation and lookahead.

Proceedings of the 50th Annual Allerton Conference on Communication, 2012

2009

Markov decision processes with exponentially representable discounting.

Yair Carmon

Adam Shwartz

Oper. Res. Lett., 2009

Partial Similarity of Shapes Using a Statistical Significance Measure.

Alexander M. Bronstein

Michael M. Bronstein

Yair Carmon

Ron Kimmel

IPSJ Trans. Comput. Vis. Appl., 2009

2008

Eventually-stationary policies for Markov decision models with non-constant discounting.

Yair Carmon

Adam Shwartz

Proceedings of the 3rd International ICST Conference on Performance Evaluation Methodologies and Tools, 2008
