Sebastian U. Stich

According to our database, Sebastian U. Stich authored at least 78 papers between 2009 and 2024.

Bibliography

2024
Non-Convex Stochastic Composite Optimization with Polyak Momentum.
CoRR, 2024

2023
Stochastic distributed learning with gradient quantization and double-variance reduction.
Optim. Methods Softw., January, 2023

EControl: Fast Distributed Optimization with Compression and Error Control.
CoRR, 2023

Locally Adaptive Federated Learning via Stochastic Polyak Stepsizes.
CoRR, 2023

Synthetic data shuffling accelerates the convergence of federated learning under data heterogeneity.
CoRR, 2023

Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates.
CoRR, 2023

Shuffle SGD is Always Better than SGD: Improved Analysis of SGD with Arbitrary Data Orders.
CoRR, 2023

Decentralized Gradient Tracking with Local Steps.
CoRR, 2023

Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction.
Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023

Special Properties of Gradient Descent with Large Learning Rates.
Proceedings of the 40th International Conference on Machine Learning, 2023

Revisiting Gradient Clipping: Stochastic bias and tight convergence guarantees.
Proceedings of the 40th International Conference on Machine Learning, 2023

On the Effectiveness of Partial Variance Reduction in Federated Learning with Heterogeneous Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Partial Variance Reduction improves Non-Convex Federated learning on heterogeneous data.
CoRR, 2022

On Avoiding Local Minima Using Gradient Descent With Large Learning Rates.
CoRR, 2022

Data-heterogeneity-aware Mixing for Decentralized Learning.
CoRR, 2022

Tackling benign nonconvexity with smoothing and stochastic gradients.
CoRR, 2022

Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods.
CoRR, 2022

Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning.
Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022

Decentralized Local Stochastic Extra-Gradient for Variational Inequalities.
Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022

ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training.
Proceedings of the 39th International Conference on Machine Learning, 2022

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
Proceedings of the 39th International Conference on Machine Learning, 2022

Masked Training of Neural Networks with Partial Gradients.
Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, 2022

2021
Advances and Open Problems in Federated Learning.
Found. Trends Mach. Learn., 2021

The Peril of Popular Deep Learning Uncertainty Estimation Methods.
CoRR, 2021

Linear Speedup in Personalized Collaborative Learning.
CoRR, 2021

On Second-order Optimization Methods for Federated Learning.
CoRR, 2021

A Field Guide to Federated Optimization.
CoRR, 2021

Simultaneous Training of Partially Masked Neural Networks.
CoRR, 2021

RelaySum for Decentralized Deep Learning on Heterogeneous Data.
Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021

An Improved Analysis of Gradient Tracking for Decentralized Machine Learning.
Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021

Breaking the centralized barrier for cross-device federated learning.
Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021

Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data.
Proceedings of the 38th International Conference on Machine Learning, 2021

Consensus Control for Decentralized Deep Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Taming GANs with Lookahead-Minmax.
Proceedings of the 9th International Conference on Learning Representations, 2021

Semantic Perturbations with Normalizing Flows for Improved Generalization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021

Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free!
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

LENA: Communication-Efficient Distributed Learning with Self-Triggered Gradient Uploads.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
On Communication Compression for Distributed Optimization on Heterogeneous Data.
CoRR, 2020

Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning.
CoRR, 2020

Analysis of SGD with Biased Gradient Estimators.
CoRR, 2020

Ensemble Distillation for Robust Model Fusion in Federated Learning.
Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020

Is Local SGD Better than Minibatch SGD?
Proceedings of the 37th International Conference on Machine Learning, 2020

Extrapolation for Large-batch Training in Deep Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates.
Proceedings of the 37th International Conference on Machine Learning, 2020

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Don't Use Large Mini-batches, Use Local SGD.
Proceedings of the 8th International Conference on Learning Representations, 2020

Dynamic Model Pruning with Feedback.
Proceedings of the 8th International Conference on Learning Representations, 2020

Decentralized Deep Learning with Arbitrary Communication Compression.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Advances and Open Problems in Federated Learning.
CoRR, 2019

SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning.
CoRR, 2019

The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication.
CoRR, 2019

Unified Optimal Analysis of the (Stochastic) Gradient Method.
CoRR, 2019

Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication.
Proceedings of the 36th International Conference on Machine Learning, 2019

Error Feedback Fixes SignSGD and other Gradient Compression Schemes.
Proceedings of the 36th International Conference on Machine Learning, 2019

Local SGD Converges Fast and Communicates Little.
Proceedings of the 7th International Conference on Learning Representations, 2019

Efficient Greedy Coordinate Descent for Composite Problems.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Don't Use Large Mini-Batches, Use Local SGD.
CoRR, 2018

Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients.
CoRR, 2018

SVRG meets SAGA: k-SVRG - A Tale of Limited Memory.
CoRR, 2018

Revisiting First-Order Convex Optimization Over Linear Spaces.
CoRR, 2018

Sparsified SGD with Memory.
Advances in Neural Information Processing Systems 31 (NeurIPS 2018), 2018

Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization.
Advances in Neural Information Processing Systems 31 (NeurIPS 2018), 2018

On Matching Pursuit and Coordinate Descent.
Proceedings of the 35th International Conference on Machine Learning, 2018

Adaptive balancing of gradient and update computation times using global geometry and approximate subproblems.
Proceedings of the 21st International Conference on Artificial Intelligence and Statistics, 2018

2017
Efficiency of the Accelerated Coordinate Descent Method on Structured Optimization Problems.
SIAM J. Optim., 2017

On the existence of ordinary triangles.
Comput. Geom., 2017

Safe Adaptive Importance Sampling.
Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017

Approximate Steepest Coordinate Descent.
Proceedings of the 34th International Conference on Machine Learning, 2017

2016
On Two Continuum Armed Bandit Problems in High Dimensions.
Theory Comput. Syst., 2016

Variable metric random pursuit.
Math. Program., 2016

2014
Convex Optimization with Random Pursuit.
PhD thesis, 2014

On low complexity Acceleration Techniques for Randomized Optimization: Supplementary Online Material.
CoRR, 2014

On Low Complexity Acceleration Techniques for Randomized Optimization.
Parallel Problem Solving from Nature - PPSN XIII, 2014

2013
Optimization of Convex Functions with Random Pursuit.
SIAM J. Optim., 2013

Stochastic continuum armed bandit problem of few linear parameters in high dimensions.
CoRR, 2013

2012
On Spectral Invariance of Randomized Hessian and Covariance Matrix Adaptation Schemes.
Parallel Problem Solving from Nature - PPSN XII, 2012

2009
On Two Problems Regarding the Hamiltonian Cycle Game.
Electron. J. Comb., 2009

