Yair Carmon

Orcid: 0000-0001-5731-8640

According to our database, Yair Carmon authored at least 45 papers between 2008 and 2024.

Bibliography

2024
Language models scale reliably with over-training and on downstream tasks.
CoRR, 2024

The Price of Adaptivity in Stochastic Convex Optimization.
CoRR, 2024

A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions.
Proceedings of the 2024 ACM-SIAM Symposium on Discrete Algorithms, 2024

2023
Lower bounds for non-convex stochastic optimization.
Math. Program., May, 2023


Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond.
Proceedings of the International Conference on Machine Learning, 2023

DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule.
Proceedings of the International Conference on Machine Learning, 2023

Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

ReSQueing Parallel and Private Stochastic Convex Optimization.
Proceedings of the 64th IEEE Annual Symposium on Foundations of Computer Science, 2023

2022
Malign Overfitting: Interpolation Can Provably Preclude Invariance.
CoRR, 2022

Optimal and Adaptive Monteiro-Svaiter Acceleration.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Distributionally Robust Optimization via Ball Oracle Acceleration.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time.
Proceedings of the International Conference on Machine Learning, 2022

RECAPP: Crafting a More Efficient Catalyst for Convex Optimization.
Proceedings of the International Conference on Machine Learning, 2022

Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Making SGD Parameter-Free.
Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

2021
Lower bounds for finding stationary points II: first-order methods.
Math. Program., 2021

Stochastic Bias-Reduced Gradient Methods.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Never Go Full Batch (in Stochastic Convex Optimization).
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization.
Proceedings of the 38th International Conference on Machine Learning, 2021

Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss.
Proceedings of the Conference on Learning Theory, 2021

2020
First-Order Methods for Nonconvex Quadratic Minimization.
SIAM Rev., 2020

Lower bounds for finding stationary points I.
Math. Program., 2020

Large-Scale Methods for Distributionally Robust Optimization.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Acceleration with a Ball Optimization Oracle.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Coordinate Methods for Matrix Games.
Proceedings of the 61st IEEE Annual Symposium on Foundations of Computer Science, 2020

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations.
Proceedings of the Conference on Learning Theory, 2020

2019
Gradient Descent Finds the Cubic-Regularized Nonconvex Newton Step.
SIAM J. Optim., 2019

Unlabeled Data Improves Adversarial Robustness.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Variance Reduction for Matrix Games.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Rank-1 Sketch for Matrix Multiplicative Weights.
Proceedings of the Conference on Learning Theory, 2019

2018
Accelerated Methods for NonConvex Optimization.
SIAM J. Optim., 2018

Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

2017
"Convex Until Proven Guilty": Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions.
Proceedings of the 34th International Conference on Machine Learning, 2017

2016
Information, Estimation, and Lookahead in the Gaussian Channel.
IEEE Trans. Signal Process., 2016

No bad local minima: Data independent training error guarantees for multilayer neural networks.
CoRR, 2016

Accelerated Methods for Non-Convex Optimization.
CoRR, 2016

Gradient Descent Efficiently Finds the Cubic-Regularized Non-Convex Newton Step.
CoRR, 2016

2015
Comparison of the Achievable Rates in OFDM and Single Carrier Modulation with I.I.D. Inputs.
IEEE Trans. Inf. Theory, 2015

Lower Bounds and Approximations for the Information Rate of the ISI Channel.
IEEE Trans. Inf. Theory, 2015

2013
The role of lookahead in estimation under Gaussian noise.
Proceedings of the 2013 IEEE International Symposium on Information Theory, 2013

2012
On information, estimation and lookahead.
Proceedings of the 50th Annual Allerton Conference on Communication, 2012

2009
Markov decision processes with exponentially representable discounting.
Oper. Res. Lett., 2009

Partial Similarity of Shapes Using a Statistical Significance Measure.
IPSJ Trans. Comput. Vis. Appl., 2009

2008
Eventually-stationary policies for Markov decision models with non-constant discounting.
Proceedings of the 3rd International ICST Conference on Performance Evaluation Methodologies and Tools, 2008

