Kun Yuan

CoRR, September, 2025

From PowerSGD to PowerSGD+: Low-Rank Gradient Compression for Distributed Optimization with Convergence Guarantees.

Shengping Xie

Chuyan Chen

CoRR, September, 2025

Greedy Low-Rank Gradient Compression for Distributed Learning with Convergence Guarantees.

CoRR, July, 2025

BEVHeight++: Toward Robust Visual Centric 3D Object Detection.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2025

Efficient Long-Context LLM Inference via KV Cache Clustering.

CoRR, June, 2025

TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network.

CoRR, June, 2025

Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity.

CoRR, March, 2025

A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models.

CoRR, February, 2025

CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Language Models.

CoRR, February, 2025

Understanding the Influence of Digraphs on Decentralized Optimization: Effective Metrics, Lower Bound, and Optimal Algorithm.

SIAM J. Optim., 2025

Subspace Optimization for Large Language Models with Convergence Guarantees.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

S$^\text{3}$Attention: Improving Long Sequence Attention With Smoothed Skeleton Sketching.

IEEE J. Sel. Top. Signal Process., September, 2024

An enhanced gradient-tracking bound for distributed online stochastic convex optimization.

Signal Process., April, 2024

Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results.

Tao Sun

Xinwang Liu

CoRR, 2024

S$^\text{3}$Attention: Improving Long Sequence Attention with Smoothed Skeleton Sketching.

CoRR, 2024

Decentralized Bilevel Optimization over Graphs: Loopless Algorithmic Update and Transient Iteration Complexity.

CoRR, 2024

SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Distributed Bilevel Optimization with Communication Compression.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Momentum Benefits Non-iid Federated Learning Simply and Provably.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Asynchronous Diffusion Learning with Agent Subsampling and Local Updates.

Elsa Rizk

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Removing Data Heterogeneity Influence Enhances Network Topology Dependence of Decentralized SGD.

J. Mach. Learn. Res., 2023

Model-free Test Time Adaptation for Out-Of-Distribution Detection.

CoRR, 2023

RandCom: Random Communication Skipping Method for Decentralized Stochastic Optimization.

Luyao Guo

Laurent Condat

CoRR, 2023

BEVHeight++: Toward Robust Visual Centric 3D Object Detection.

CoRR, 2023

Momentum Benefits Non-IID Federated Learning Simply and Provably.

Ziheng Cheng

CoRR, 2023

Lower Bounds and Accelerated Algorithms in Distributed Stochastic Optimization with Communication Compression.

CoRR, 2023

Unbiased Compression Saves Communication in Distributed Optimization: When and How Much?

Yutong He

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation.

Proceedings of the International Conference on Machine Learning, 2023

DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm.

Proceedings of the International Conference on Machine Learning, 2023

BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Achieving Linear Speedup with Network-Independent Learning Rates in Decentralized Stochastic Optimization.

Hao Yuan

Proceedings of the 62nd IEEE Conference on Decision and Control, 2023

On the Performance of Gradient Tracking with Local Updates.

Edward Duc Hien Nguyen

César A. Uribe

Proceedings of the 62nd IEEE Conference on Decision and Control, 2023

2022

A Unified and Refined Convergence Analysis for Non-Convex Decentralized Learning.

IEEE Trans. Signal Process., 2022

Optimal Complexity in Non-Convex Decentralized Learning over Time-Varying Networks.

CoRR, 2022

Heavy-Tail Phenomenon in Decentralized SGD.

CoRR, 2022

Revisiting Optimal Convergence Rate for Smooth and Non-convex Stochastic Decentralized Optimization.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Communication-Efficient Topologies for Decentralized Learning with $O(1)$ Consensus Rate.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Lower Bounds and Nearly Optimal Algorithms in Distributed Learning with Communication Compression.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Effective Model Sparsification by Scheduled Grow-and-Prune Methods.

Proceedings of the Tenth International Conference on Learning Representations, 2022

A Byzantine-Resilient Dual Subgradient Method for Vertical Federated Learning.

Zhaoxian Wu

Proceedings of the IEEE International Conference on Acoustics, 2022

CHEX: CHannel EXploration for CNN Model Compression.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Multiagent Fully Decentralized Value Function Learning With Linear Convergence Rates.

Lucas Cassano

IEEE Trans. Autom. Control., 2021

Decentralized Proximal Gradient Algorithms With Linear Convergence Rates.

Ernest K. Ryu

IEEE Trans. Autom. Control., 2021

BlueFog: Make Decentralized Algorithms Practical for Optimization and Deep Learning.

CoRR, 2021

Decentralized Composite Optimization with Compression.

CoRR, 2021

Removing Data Heterogeneity Influence Enhances Network Topology Dependence of Decentralized SGD.

CoRR, 2021

On the Comparison between Cyclic Sampling and Random Reshuffling.

CoRR, 2021

Exponential Graph is Provably Efficient for Decentralized Deep Training.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Accelerating Gossip SGD with Periodic Global Averaging.

Proceedings of the 38th International Conference on Machine Learning, 2021

DecentLaM: Decentralized Momentum SGD for Large-batch Deep Training.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

Can Primal Methods Outperform Primal-Dual Methods in Decentralized Dynamic Optimization?

Wei Xu

IEEE Trans. Signal Process., 2020

On the Influence of Bias-Correction on Distributed Stochastic Optimization.

IEEE Trans. Signal Process., 2020

Variance-Reduced Stochastic Learning Under Random Reshuffling.

IEEE Trans. Signal Process., 2020

Walkman: A Communication-Efficient Random-Walk Algorithm for Decentralized Optimization.

IEEE Trans. Signal Process., 2020

A Proximal Diffusion Strategy for Multiagent Optimization With Sparse Affine Constraints.

IEEE Trans. Autom. Control., 2020

2019

Exact Diffusion for Distributed Optimization and Learning - Part II: Convergence Analysis.

IEEE Trans. Signal Process., 2019

Exact Diffusion for Distributed Optimization and Learning - Part I: Algorithm Development.

IEEE Trans. Signal Process., 2019

Variance-Reduced Stochastic Learning by Networked Agents Under Random Reshuffling.

IEEE Trans. Signal Process., 2019

Stochastic Learning Under Random Reshuffling With Constant Step-Sizes.

IEEE Trans. Signal Process., 2019

Supervised Learning Under Distributed Features.

IEEE Trans. Signal Process., 2019

Dynamic Average Diffusion With Randomized Coordinate Updates.

IEEE Trans. Signal Inf. Process. over Networks, 2019

ODE Analysis of Stochastic Gradient Methods with Optimism and Anchoring for Minimax Problems and GANs.

Ernest K. Ryu

Wotao Yin

CoRR, 2019

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

COVER: A Cluster-based Variance Reduced Method for Online Learning.

Proceedings of the IEEE International Conference on Acoustics, 2019

Distributed Value-Function Learning with Linear Convergence Rates.

Lucas Cassano

Proceedings of the 17th European Control Conference, 2019

On the Performance of Exact Diffusion over Adaptive Networks.

Proceedings of the 58th IEEE Conference on Decision and Control, 2019

Decentralized Dynamic ADMM with Quantized and Censored Communications.

Proceedings of the 53rd Asilomar Conference on Signals, Systems, and Computers, 2019

On the Comparison between Primal and Primal-dual Methods in Decentralized Dynamic Optimization.

Proceedings of the 53rd Asilomar Conference on Signals, Systems, and Computers, 2019

2018

Multi-Agent Fully Decentralized Off-Policy Learning with Linear Convergence Rates.

Lucas Cassano

CoRR, 2018

Learning Under Distributed Features.

CoRR, 2018

A Communication-Efficient Random-Walk Algorithm for Decentralized Optimization.

CoRR, 2018

Stochastic Learning under Random Reshuffling.

CoRR, 2018

Convergence of Variance-Reduced Learning Under Random Reshuffling.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Efficient Variance-Reduced Learning Over Multi-Agent Networks.

Proceedings of the 26th European Signal Processing Conference, 2018

An Exponentially Convergent Algorithm for Learning Under Distributed Features.

Proceedings of the 2018 IEEE Data Science Workshop, 2018

Dual Coupled Diffusion for Distributed Optimization with Affine Constraints.

Proceedings of the 57th IEEE Conference on Decision and Control, 2018

2017

Efficient Variance-Reduced Learning for Fully Decentralized On-Device Intelligence.

CoRR, 2017

Convergence of Variance-Reduced Stochastic Learning under Random Reshuffling.

CoRR, 2017

On the performance of random reshuffling in stochastic learning.

Proceedings of the 2017 Information Theory and Applications Workshop, 2017

Exact diffusion strategy for optimization by networked agents.

Proceedings of the 25th European Signal Processing Conference, 2017

Decentralized exact coupled optimization.

Proceedings of the 55th Annual Allerton Conference on Communication, 2017

2016

On the Convergence of Decentralized Gradient Descent.

Wotao Yin

SIAM J. Optim., 2016

Cooperative tracking for nonlinear multi-agent systems with hybrid time-delayed protocol.

Neurocomputing, 2016

Stochastic gradient descent with finite samples sizes.

Proceedings of the 26th IEEE International Workshop on Machine Learning for Signal Processing, 2016

On the influence of momentum acceleration on online learning.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Online dual coordinate ascent learning.

Proceedings of the 24th European Signal Processing Conference, 2016

Decentralized consensus optimization with asynchrony and delays.

Proceedings of the 50th Asilomar Conference on Signals, Systems and Computers, 2016

2015

Communication-Efficient Decentralized Event Monitoring in Wireless Sensor Networks.

Zhi Tian

IEEE Trans. Parallel Distributed Syst., 2015

A decentralised linear programming approach to energy-efficient event detection.

Zhi Tian

Int. J. Sens. Networks, 2015

2014

On the Linear Convergence of the ADMM in Decentralized Consensus Optimization.

IEEE Trans. Signal Process., 2014

Partial synchronization of the distributed parameter system with time delay via fuzzy control.

Shumin Fei

IMA J. Math. Control. Inf., 2014

2013

Linearly convergent decentralized consensus optimization with the alternating direction method of multipliers.

Proceedings of the IEEE International Conference on Acoustics, 2013

A linearized bregman algorithm for decentralized basis pursuit.

Proceedings of the 21st European Signal Processing Conference, 2013

2012

Synchronization of Coupled Networks with Mixed Delays by Intermittent Control.

Shumin Fei

J. Appl. Math., 2012

2008

Robust Stabilization of the Distributed Parameter System With Time Delay via Fuzzy Control.

Han-Xiong Li

IEEE Trans. Fuzzy Syst., 2008

2006

Robust Stability of Switched Cohen-Grossberg Neural Networks With Mixed Time-Varying Delays.

Han-Xiong Li

IEEE Trans. Syst. Man Cybern. Part B, 2006

Global Asymptotical Stability of Recurrent Neural Networks With Multiple Discrete Delays and Distributed Delays.

Han-Xiong Li

IEEE Trans. Neural Networks, 2006

Exponential stability and periodic solutions of fuzzy cellular neural networks with time-varying delays.

Jianming Deng

Neurocomputing, 2006

2005

An analysis of global asymptotic stability of delayed Cohen-Grossberg neural networks via nonsmooth analysis.

IEEE Trans. Circuits Syst. I Regul. Pap., 2005

2004

Global Exponential Stability of Cohen-Grossberg Neural Networks with Multiple Time-Varying Delays.
