Jingzhao Zhang

CoRR, March, 2026

2025

On the Condition Number Dependency in Bilevel Optimization.

CoRR, November, 2025

Finite Sample Analyses for Continuous-time Linear Systems: System Identification and Online Control.

Hongyi Zhou

Jingwei Li

CoRR, September, 2025

PiERN: Token-Level Routing for Integrating High-Precision Computation and Reasoning.

CoRR, September, 2025

Faster Gradient Methods for Highly-smooth Stochastic Bilevel Optimization.

Junru Li

CoRR, September, 2025

Multitask Battery Management with Flexible Pretraining.

CoRR, September, 2025

NeuralDB: Scaling Knowledge Editing in LLMs to 100,000 Facts with Neural KV Database.

CoRR, July, 2025

QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation.

CoRR, July, 2025

Solving Convex-Concave Problems with 𝒪(ε^{-4/7}) Second-Order Oracle Complexity.

CoRR, June, 2025

Task Generalization With AutoRegressive Compositional Structure: Can Learning From D Tasks Generalize to DT Tasks?

CoRR, February, 2025

Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles.

Yaohua Ma

J. Mach. Learn. Res., 2025

Data Mixing Can Induce Phase Transitions in Knowledge Acquisition.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Understanding Nonlinear Implicit Bias via Region Counts in Input Space.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Towards Black-Box Membership Inference Attack for Diffusion Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Task Generalization with Autoregressive Compositional Structure: Can Learning from D Tasks Generalize to DT Tasks?

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Scalable Model Merging with Progressive Layer-wise Distillation.

Jiazheng Li

Proceedings of the Forty-second International Conference on Machine Learning, 2025

From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Second-Order Min-Max Optimization with Lazy Hessians.

Chengchang Liu

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Fast and Multiphase Rates for Nearest Neighbor Classifiers.

Pengkun Yang

Proceedings of the Thirty Eighth Annual Conference on Learning Theory, 2025

Solving Convex-Concave Problems with 𝒪(ε^{-4/7}) Second-Order Oracle Complexity.

Proceedings of the Thirty Eighth Annual Conference on Learning Theory, 2025

Generalization Lower Bounds for GD and SGD in Smooth Stochastic Convex Optimization.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

2024

Time Series Prediction of Gas Emission in Coal Mining Face Based on Optimized Variational Mode Decomposition and SSA-LSTM.

Sensors, October, 2024

Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems.

CoRR, 2024

Online Policy Optimization for Robust Markov Decision Process.

Proceedings of the Uncertainty in Artificial Intelligence, 2024

Functionally Constrained Algorithm Solves Convex Simple Bilevel Problem.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Online Control with Adversarial Disturbance for Continuous-time Linear Systems.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

A Quadratic Synchronization Rule for Distributed Deep Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

On Finding Small Hyper-Gradients in Bilevel Optimization: Hardness Results and Improved Analysis.

Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023

Sion's Minimax Theorem in Geodesic Metric Spaces and a Riemannian Extragradient Algorithm.

SIAM J. Optim., December, 2023

Two Phases of Scaling Laws for Nearest Neighbor Classifiers.

Pengkun Yang

CoRR, 2023

Near-Optimal Fully First-Order Algorithms for Finding Stationary Points in Bilevel Optimization.

Yaohua Ma

CoRR, 2023

Online Control with Adversarial Disturbance for Continuous-time Linear Systems.

CoRR, 2023

Lower Generalization Bounds for GD and SGD in Smooth Stochastic Convex Optimization.

CoRR, 2023

On Bilevel Optimization without Lower-level Strong Convexity.

CoRR, 2023

On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them: A Gradient-Norm Perspective.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Iteratively Learn Diverse Strategies with State Distance Information.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fast Conditional Mixing of MCMC Algorithms for Non-log-concave Distributions.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Benign Overfitting in Classification: Provably Counter Label Noise with Larger Models.

Kaiyue Wen

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Optimization Theory and Machine Learning Practice: Mind the Gap.

PhD thesis, 2022

Online Policy Optimization for Robust MDP.

CoRR, 2022

Realistic Deep Learning May Not Fit Benignly.

Kaiyue Wen

CoRR, 2022

Minimax in Geodesic Metric Spaces: Sion's Theorem and Algorithms.

CoRR, 2022

Detecting Electric Vehicle Battery Failure via Dynamic-VAE.

CoRR, 2022

Efficient Sampling on Riemannian Manifolds via Langevin MCMC.

Xiang Cheng

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective.

Proceedings of the International Conference on Machine Learning, 2022

Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity.

Proceedings of the International Conference on Machine Learning, 2022

Understanding the Unstable Convergence of Gradient Descent.

Kwangjun Ahn

Proceedings of the International Conference on Machine Learning, 2022

2021

Monitoring, Analyzing, and Modeling for Single Subsidence Basin in Coal Mining Areas Based on SAR Interferometry with L-Band Data.

Sci. Program., 2021

On Convergence of Training Loss Without Reaching Stationary Points.

CoRR, 2021

Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max Optimization.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Fast Federated Learning in the Presence of Arbitrary Device Unavailability.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Provably Efficient Algorithms for Multi-Objective Competitive RL.

Proceedings of the 38th International Conference on Machine Learning, 2021

Coping with Label Shift via Distributionally Robust Optimisation.

Proceedings of the 9th International Conference on Learning Representations, 2021

Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation?

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

2020

Stochastic Optimization with Non-stationary Noise.

CoRR, 2020

On Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions.

CoRR, 2020

Why are Adaptive Methods Good for Attention Models?

Sai Praneeth Karimireddy

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions.

Proceedings of the 37th International Conference on Machine Learning, 2020

Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity.

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Why ADAM Beats SGD for Attention Models.

Sai Praneeth Karimireddy

CoRR, 2019

Analysis of Gradient Clipping and Adaptive Scaling with a Relaxed Smoothness Condition.

CoRR, 2019

Quantifying Exposure Bias for Neural Language Generation.

CoRR, 2019

Acceleration in First Order Quasi-strongly Convex Optimization by ODE Discretization.

Ali Jadbabaie

Proceedings of the 58th IEEE Conference on Decision and Control, 2019

Achieving Acceleration in Distributed Optimization via Direct Discretization of the Heavy-Ball ODE.

Proceedings of the 2019 American Control Conference, 2019

2018

A Probe Towards Understanding GAN and VAE Models.

Lu Mi

Macheng Shen

CoRR, 2018

R-SPIDER: A Fast Riemannian Stochastic Optimization Algorithm with Curvature Independent Rate.

Hongyi Zhang