Difan Zou

According to our database, Difan Zou authored at least 91 papers between 2014 and 2025.

Bibliography

2025
Towards Theoretical Understanding of Transformer Test-Time Computing: Investigation on In-Context Linear Regression.
CoRR, August, 2025

STGAN: Spatial-Temporal Graph Autoregression Network for Pavement Distress Deterioration Prediction.
IEEE Trans. Intell. Transp. Syst., July, 2025

Self-Contradiction as Self-Improvement: Mitigating the Generation-Understanding Gap in MLLMs.
CoRR, July, 2025

A Random Matrix Analysis of In-context Memorization for Nonlinear Attention.
CoRR, June, 2025

On the Mechanism of Reasoning Pattern Selection in Reinforcement Learning for Language Models.
CoRR, June, 2025

Model Unlearning via Sparse Autoencoder Subspace Guided Projections.
CoRR, May, 2025

Physics-Informed Distillation of Diffusion Models for PDE-Constrained Generation.
CoRR, May, 2025

Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion.
CoRR, May, 2025

Capturing Conditional Dependence via Auto-regressive Diffusion Models.
CoRR, April, 2025

Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks.
CoRR, April, 2025

Per-example gradient regularization improves learning signals from noisy data.
Mach. Learn., March, 2025

On the Robustness of Transformers against Context Hijacking for Linear Classification.
CoRR, February, 2025

Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis.
CoRR, February, 2025

Hyperspherical Energy Transformer with Recurrent Depth.
CoRR, February, 2025

Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images?
CoRR, February, 2025

Masked Autoencoders Are Effective Tokenizers for Diffusion Models.
CoRR, February, 2025

How Does Critical Batch Size Scale in Pre-training?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

HyPoGen: Optimization-Biased Hypernetworks for Generalizable Policy Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

On the Feature Learning in Diffusion Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Parallelized Autoregressive Visual Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Challenges of COVID-19 Case Forecasting in the US, 2020-2021.
PLoS Comput. Biol., 2024

Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers.
CoRR, 2024

Towards a Theoretical Understanding of Memorization in Diffusion Models.
CoRR, 2024

Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller.
CoRR, 2024

A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models.
CoRR, 2024

The Dog Walking Theory: Rethinking Convergence in Federated Learning.
CoRR, 2024

On the Benefits of Over-parameterization for Out-of-Distribution Generalization.
CoRR, 2024

Improving Implicit Regularization of SGD with Preconditioning for Least Square Problems.
CoRR, 2024

An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling.
CoRR, 2024

The Implicit Bias of Adam on Separable Data.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Slight Corruption in Pre-training Data Makes Better Diffusion Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Faster Sampling via Stochastic Gradient Proximal Sampler.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PRES: Toward Scalable Memory-Based Dynamic Graph Neural Networks.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Benign Oscillation of Stochastic Gradient Descent with Large Learning Rate.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Optimized Transmit Beamformers for Dual-Function RadCom System.
Proceedings of the IEEE Globecom Workshops 2024, 2024

Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

On the Limitation and Experience Replay for GNNs in Continual Learning.
Proceedings of the Conference on Lifelong Learning Agents, 2024

2023
Benign Overfitting of Constant-Stepsize SGD for Linear Regression.
J. Mach. Learn. Res., 2023

Benign Oscillation of Stochastic Gradient Descent with Large Learning Rates.
CoRR, 2023

Less is More: On the Feature Redundancy of Pretrained Models When Transferring to Few-shot Tasks.
CoRR, 2023

Learning High-Dimensional Single-Neuron ReLU Networks with Finite Samples.
CoRR, 2023

The Benefits of Mixup for Feature Learning.
Proceedings of the International Conference on Machine Learning, 2023

Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron.
Proceedings of the International Conference on Machine Learning, 2023

Towards Robust Graph Incremental Learning on Evolving Graphs.
Proceedings of the International Conference on Machine Learning, 2023

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022
Understanding the Role of Optimization Algorithms in Learning Over-parameterized Models.
PhD thesis, 2022

Two-Dimensional Intensity Distribution and Adaptive Power Allocation for Ultraviolet Ad-Hoc Network.
IEEE Trans. Green Commun. Netw., 2022

Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression.
Proceedings of the International Conference on Machine Learning, 2022

Self-training Converts Weak Learners to Strong Learners in Mixture Models.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo.
SIAM J. Sci. Comput., 2021

Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients.
Proceedings of the 38th International Conference on Machine Learning, 2021

Provable Robustness of Adversarial Training for Learning Halfspaces with Noise.
Proceedings of the 38th International Conference on Machine Learning, 2021

Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate.
Proceedings of the 9th International Conference on Learning Representations, 2021

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Gradient descent optimizes over-parameterized deep ReLU networks.
Mach. Learn., 2020

Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate.
CoRR, 2020

On the Global Convergence of Training Deep Linear ResNets.
Proceedings of the 8th International Conference on Learning Representations, 2020

Improving Adversarial Robustness Requires Revisiting Misclassified Examples.
Proceedings of the 8th International Conference on Learning Representations, 2020

Two-dimensional Intensity Distribution and Connectivity in Ultraviolet Ad-Hoc Network.
Proceedings of the 2020 IEEE International Conference on Communications, 2020

2019
Characterization on Practical Photon Counting Receiver in Optical Scattering Communication.
IEEE Trans. Commun., 2019

Signal Characterization and Achievable Transmission Rate of VLC Under Receiver Nonlinearity.
IEEE Access, 2019

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

An Improved Analysis of Training Over-parameterized Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Sampling from Non-Log-Concave Distributions via Variance-Reduced Gradient Langevin Dynamics.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Signal Detection Under Short-Interval Sampling of Continuous Waveforms for Optical Wireless Scattering Communication.
IEEE Trans. Wirel. Commun., 2018

Secrecy Rate of MISO Optical Wireless Scattering Communications.
IEEE Trans. Commun., 2018

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks.
CoRR, 2018

Subsampled Stochastic Variance-Reduced Gradient Langevin Dynamics.
Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Stochastic Variance-Reduced Hamilton Monte Carlo Methods.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently.
CoRR, 2017

Analysis on Practical Photon Counting Receiver in Optical Scattering Communication.
CoRR, 2017

2016
Turbulence channel modeling and non-parametric estimation for optical wireless scattering communication.
Proceedings of the 2016 IEEE International Conference on Communication Systems, 2016

Performance of non-line-of-sight ultraviolet scattering communication under different altitudes.
Proceedings of the 2016 IEEE/CIC International Conference on Communications in China, 2016

Optical wireless scattering communication system with a non-ideal photon-counting receiver.
Proceedings of the 2016 IEEE Global Conference on Signal and Information Processing, 2016

2014
Improving the NLOS optical scattering channel via beam reshaping.
Proceedings of the 48th Asilomar Conference on Signals, Systems and Computers, 2014

