Sebastian U. Stich

According to our database, Sebastian U. Stich authored at least 78 papers between 2009 and 2024.

Bibliography

2024
Non-Convex Stochastic Composite Optimization with Polyak Momentum.
CoRR, 2024

2023
Stochastic distributed learning with gradient quantization and double-variance reduction.
Optim. Methods Softw., January, 2023

EControl: Fast Distributed Optimization with Compression and Error Control.
CoRR, 2023

Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes.
CoRR, 2023

Synthetic data shuffling accelerates the convergence of federated learning under data heterogeneity.
CoRR, 2023

Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates.
CoRR, 2023

Shuffle SGD is Always Better than SGD: Improved Analysis of SGD with Arbitrary Data Orders.
CoRR, 2023

Decentralized Gradient Tracking with Local Steps.
CoRR, 2023

Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction.
Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023

Special Properties of Gradient Descent with Large Learning Rates.
Proceedings of the 40th International Conference on Machine Learning, 2023

Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees.
Proceedings of the 40th International Conference on Machine Learning, 2023

On the Effectiveness of Partial Variance Reduction in Federated Learning with Heterogeneous Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Partial Variance Reduction improves Non-Convex Federated learning on heterogeneous data.
CoRR, 2022

On Avoiding Local Minima Using Gradient Descent With Large Learning Rates.
CoRR, 2022

Data-heterogeneity-aware Mixing for Decentralized Learning.
CoRR, 2022

Tackling benign nonconvexity with smoothing and stochastic gradients.
CoRR, 2022

Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods.
CoRR, 2022

Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning.
Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022

Decentralized Local Stochastic Extra-Gradient for Variational Inequalities.
Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022

ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training.
Proceedings of the 39th International Conference on Machine Learning, 2022

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
Proceedings of the 39th International Conference on Machine Learning, 2022

Masked Training of Neural Networks with Partial Gradients.
Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, 2022

2021
Advances and Open Problems in Federated Learning.
Found. Trends Mach. Learn., 2021

The Peril of Popular Deep Learning Uncertainty Estimation Methods.
CoRR, 2021

Linear Speedup in Personalized Collaborative Learning.
CoRR, 2021

On Second-order Optimization Methods for Federated Learning.
CoRR, 2021

A Field Guide to Federated Optimization.
CoRR, 2021

Simultaneous Training of Partially Masked Neural Networks.
CoRR, 2021

RelaySum for Decentralized Deep Learning on Heterogeneous Data.
Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021

An Improved Analysis of Gradient Tracking for Decentralized Machine Learning.
Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021

Breaking the centralized barrier for cross-device federated learning.
Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021

Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data.
Proceedings of the 38th International Conference on Machine Learning, 2021

Consensus Control for Decentralized Deep Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Taming GANs with Lookahead-Minmax.
Proceedings of the 9th International Conference on Learning Representations, 2021

Semantic Perturbations with Normalizing Flows for Improved Generalization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021

Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free!
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

LENA: Communication-Efficient Distributed Learning with Self-Triggered Gradient Uploads.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
On Communication Compression for Distributed Optimization on Heterogeneous Data.
CoRR, 2020

Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning.
CoRR, 2020

Analysis of SGD with Biased Gradient Estimators.
CoRR, 2020

Ensemble Distillation for Robust Model Fusion in Federated Learning.
Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020

Is Local SGD Better than Minibatch SGD?
Proceedings of the 37th International Conference on Machine Learning, 2020

Extrapolation for Large-batch Training in Deep Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates.
Proceedings of the 37th International Conference on Machine Learning, 2020

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Don't Use Large Mini-batches, Use Local SGD.
Proceedings of the 8th International Conference on Learning Representations, 2020

Dynamic Model Pruning with Feedback.
Proceedings of the 8th International Conference on Learning Representations, 2020

Decentralized Deep Learning with Arbitrary Communication Compression.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Advances and Open Problems in Federated Learning.
CoRR, 2019

SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning.
CoRR, 2019

The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication.
CoRR, 2019

Unified Optimal Analysis of the (Stochastic) Gradient Method.
CoRR, 2019

Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication.
Proceedings of the 36th International Conference on Machine Learning, 2019

Error Feedback Fixes SignSGD and other Gradient Compression Schemes.
Proceedings of the 36th International Conference on Machine Learning, 2019

Local SGD Converges Fast and Communicates Little.
Proceedings of the 7th International Conference on Learning Representations, 2019

Efficient Greedy Coordinate Descent for Composite Problems.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Don't Use Large Mini-Batches, Use Local SGD.
CoRR, 2018

Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients.
CoRR, 2018

SVRG meets SAGA: k-SVRG - A Tale of Limited Memory.
CoRR, 2018

Revisiting First-Order Convex Optimization Over Linear Spaces.
CoRR, 2018

Sparsified SGD with Memory.
Advances in Neural Information Processing Systems 31 (NeurIPS 2018), 2018

Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization.
Advances in Neural Information Processing Systems 31 (NeurIPS 2018), 2018

On Matching Pursuit and Coordinate Descent.
Proceedings of the 35th International Conference on Machine Learning, 2018

Adaptive balancing of gradient and update computation times using global geometry and approximate subproblems.
Proceedings of the 21st International Conference on Artificial Intelligence and Statistics, 2018

2017
Efficiency of the Accelerated Coordinate Descent Method on Structured Optimization Problems.
SIAM J. Optim., 2017

On the existence of ordinary triangles.
Comput. Geom., 2017

Safe Adaptive Importance Sampling.
Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017

Approximate Steepest Coordinate Descent.
Proceedings of the 34th International Conference on Machine Learning, 2017

2016
On Two Continuum Armed Bandit Problems in High Dimensions.
Theory Comput. Syst., 2016

Variable metric random pursuit.
Math. Program., 2016

2014
Convex Optimization with Random Pursuit.
PhD thesis, 2014

On low complexity Acceleration Techniques for Randomized Optimization: Supplementary Online Material.
CoRR, 2014

On Low Complexity Acceleration Techniques for Randomized Optimization.
Parallel Problem Solving from Nature - PPSN XIII, 2014

2013
Optimization of Convex Functions with Random Pursuit.
SIAM J. Optim., 2013

Stochastic continuum armed bandit problem of few linear parameters in high dimensions.
CoRR, 2013

2012
On Spectral Invariance of Randomized Hessian and Covariance Matrix Adaptation Schemes.
Parallel Problem Solving from Nature - PPSN XII, 2012

2009
On Two Problems Regarding the Hamiltonian Cycle Game.
Electron. J. Comb., 2009

