Hadi Daneshmand

According to our database, Hadi Daneshmand authored at least 24 papers between 2014 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion.
CoRR, 2023

On the impact of activation and normalization in obtaining isometric embeddings at initialization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Transformers learn to implement preconditioned gradient descent for in-context learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

On Bridging the Gap between Mean Field and Finite Width Deep Random Multilayer Perceptron with Batch Normalization.
Proceedings of the International Conference on Machine Learning, 2023

Efficient displacement convex optimization with particle gradient descent.
Proceedings of the International Conference on Machine Learning, 2023

2022
Entropy Maximization with Depth: A Variational Principle for Random Neural Networks.
CoRR, 2022

Polynomial-time sparse measure recovery.
CoRR, 2022

2021
Rethinking the Variational Interpretation of Accelerated Optimization Methods.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Batch Normalization Orthogonalizes Representations in Deep Random Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Revisiting the Role of Euler Numerical Integration on Acceleration and Stability in Convex Optimization.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Optimization for Neural Networks: Quest for Theoretical Understandings.
PhD thesis, 2020

Theoretical Understanding of Batch-normalization: A Markov Chain Perspective.
CoRR, 2020

Batch normalization provably avoids ranks collapse for randomly initialised deep networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Mixing of Stochastic Accelerated Gradient Descent.
CoRR, 2019

Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Local Saddle Point Optimization: A Curvature Exploitation Approach.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Towards a Theoretical Understanding of Batch Normalization.
CoRR, 2018

Escaping Saddles with Stochastic Gradients.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Accelerated Dual Learning by Homotopic Initialization.
CoRR, 2017

2016
Estimating Diffusion Networks: Recovery Conditions, Sample Complexity and Soft-thresholding Algorithm.
J. Mach. Learn. Res., 2016

DynaNewton - Accelerating Newton's Method for Machine Learning.
CoRR, 2016

Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Starting Small - Learning with Adaptive Sample Sizes.
Proceedings of the 33rd International Conference on Machine Learning, 2016

2014
Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm.
Proceedings of the 31st International Conference on Machine Learning, 2014
