Jingfeng Wu

Orcid: 0009-0008-5766-7873

According to our database, Jingfeng Wu authored at least 49 papers between 2016 and 2025.

Bibliography

2025
Cloud Native System for LLM Inference Serving.
CoRR, July, 2025

Unlock the Potential of Fine-grained LLM Serving via Dynamic Module Scaling.
CoRR, July, 2025

A Simplified Analysis of SGD for Linear Regression with Weight Averaging.
CoRR, June, 2025

Improved Scaling Laws in Linear Regression via Data Reuse.
CoRR, June, 2025

Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression.
CoRR, June, 2025

Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes.
CoRR, April, 2025

Memory-Statistics Tradeoff in Continual Learning with Structural Regularization.
CoRR, April, 2025

Implicit Bias of Gradient Descent for Non-Homogeneous Deep Networks.
CoRR, February, 2025

Benefits of Early Stopping in Gradient Descent for Overparameterized Logistic Regression.
CoRR, February, 2025

Cloudnativesim: A Toolkit for Modeling and Simulation of Cloud-Native Applications.
Softw. Pract. Exp., 2025

How Does Critical Batch Size Scale in Pre-training?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
A collective AI via lifelong learning and sharing at the edge.
Nat. Mac. Intell., 2024

Context-Scaling versus Task-Scaling in In-Context Learning.
CoRR, 2024

UELLM: A Unified and Efficient Approach for LLM Inference Serving.
CoRR, 2024

In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Scaling Laws in Linear Regression: Compute, Parameters, and Data.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

UELLM: A Unified and Efficient Approach for Large Language Model Inference Serving.
Proceedings of the Service-Oriented Computing - 22nd International Conference, 2024

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Risk Bounds of Accelerated SGD for Overparameterized Linear Regression.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023
Benign Overfitting of Constant-Stepsize SGD for Linear Regression.
J. Mach. Learn. Res., 2023

Learning High-Dimensional Single-Neuron ReLU Networks with Finite Samples.
CoRR, 2023

Private Federated Frequency Estimation: Adapting to the Hardness of the Instance.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron.
Proceedings of the International Conference on Machine Learning, 2023

Fixed Design Analysis of Regularization-Based Continual Learning.
Proceedings of the Conference on Lifelong Learning Agents, 2023

2022
Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression.
Proceedings of the International Conference on Machine Learning, 2022

Gap-Dependent Unsupervised Exploration for Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Programmable packet scheduling with a single queue.
Proceedings of the ACM SIGCOMM 2021 Conference, Virtual Event, USA, August 23-27, 2021., 2021

Twenty Years After: Hierarchical Core-Stateless Fair Queueing.
Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, 2021

Ship Compute or Ship Data? Why Not Both?
Proceedings of the 18th USENIX Symposium on Networked Systems Design and Implementation, 2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate.
Proceedings of the 9th International Conference on Learning Representations, 2021

Lifelong Learning with Sketched Structural Regularization.
Proceedings of the Asian Conference on Machine Learning, 2021

2020
Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate.
CoRR, 2020

On the Noisy Gradient Descent that Generalizes as SGD.
Proceedings of the 37th International Conference on Machine Learning, 2020

Obtaining Adjustable Regularization for Free via Iterate Averaging.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
The Multiplicative Noise in Stochastic Gradient Descent: Data-Dependent Regularization, Continuous and Discrete Approximation.
CoRR, 2019

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects.
Proceedings of the 36th International Conference on Machine Learning, 2019

Automatic Cloud Segmentation Based on Fused Fully Convolutional Networks.
Proceedings of the Intelligent Computing Theories and Application, 2019

Tangent-Normal Adversarial Regularization for Semi-Supervised Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Tangent-Normal Adversarial Regularization for Semi-supervised Learning.
CoRR, 2018

The Regularization Effects of Anisotropic Noise in Stochastic Gradient Descent.
CoRR, 2018

2017
Survey on Monitoring Techniques for Data Abnormalities (数据异常的监测技术综述).
计算机科学 (Computer Science), 2017

2016
Research on human body composition prediction model based on Akaike Information Criterion and improved entropy method.
Proceedings of the 9th International Congress on Image and Signal Processing, 2016
