Samuel S. Schoenholz

Affiliations:
  • OpenAI, San Francisco, CA, USA


According to our database, Samuel S. Schoenholz authored at least 37 papers between 2009 and 2023.


Bibliography

2023
Temperature check: theory and practice for training models with softmax-cross-entropy losses.
Transactions on Machine Learning Research, 2023

Scaling deep learning for materials discovery.
Nature, 2023

End-to-End Differentiable Reactive Molecular Dynamics Simulations Using JAX.
High Performance Computing: 38th International Conference (ISC High Performance), 2023

2022
∂PV: An end-to-end differentiable solar-cell simulator.
Computer Physics Communications, 2022

What does a deep neural network confidently perceive? The effective dimension of high certainty class manifolds and their low confidence boundaries.
CoRR, 2022

Fast Finite Width Neural Tangent Kernel.
Proceedings of the 39th International Conference on Machine Learning, 2022
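
As a flavor of what this paper is about: it speeds up computing the empirical (finite-width) neural tangent kernel, for which the `neural_tangents` library exposes `empirical_ntk_fn`. Below is a minimal sketch of that computation; the architecture, widths, and input shapes are illustrative choices, not details from the paper.

```python
import neural_tangents as nt
from jax import random
from neural_tangents import stax

# A small finite-width network (depth and widths are illustrative).
init_fn, apply_fn, _ = stax.serial(
    stax.Dense(256), stax.Relu(), stax.Dense(1)
)

key = random.PRNGKey(0)
_, params = init_fn(key, (-1, 32))  # inputs have 32 features

x1 = random.normal(random.PRNGKey(1), (8, 32))
x2 = random.normal(random.PRNGKey(2), (4, 32))

# Empirical NTK of the finite-width network at these parameters.
ntk_fn = nt.empirical_ntk_fn(apply_fn)
ntk = ntk_fn(x1, x2, params)  # shape (8, 4)
```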

Deep equilibrium networks are sensitive to initialization statistics.
Proceedings of the 39th International Conference on Machine Learning, 2022

2021
Gradients are Not All You Need.
CoRR, 2021

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping.
CoRR, 2021

Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization.
Proceedings of the 38th International Conference on Machine Learning, 2021

Tilting the playing field: Dynamical loss functions for machine learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learn2Hop: Learned Optimization on Rough Landscapes.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Whitening and second order optimization both destroy information about the dataset, and can make generalization impossible.
CoRR, 2020

On the infinite width limit of neural networks with a standard parameterization.
CoRR, 2020

JAX MD: A Framework for Differentiable Physics.
Advances in Neural Information Processing Systems 33, 2020
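
To give a flavor of the library this paper introduces, here is a minimal sketch with `jax_md`: energies are ordinary JAX functions, so forces come from `jax.grad`. The box size, particle count, and soft-sphere potential are illustrative choices, not specifics from the paper.

```python
import jax
from jax import random
from jax_md import space, energy

# Periodic simulation box with a soft-sphere pair potential.
box_size = 10.0
displacement_fn, shift_fn = space.periodic(box_size)
energy_fn = energy.soft_sphere_pair(displacement_fn)

# 64 randomly placed particles in 2D.
key = random.PRNGKey(0)
positions = random.uniform(key, (64, 2), maxval=box_size)

# Because the energy is a pure JAX function, forces are just a gradient.
total_energy = energy_fn(positions)
forces = -jax.grad(energy_fn)(positions)
```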

Finite Versus Infinite Neural Networks: an Empirical Study.
Advances in Neural Information Processing Systems 33, 2020

Disentangling Trainability and Generalization in Deep Neural Networks.
Proceedings of the 37th International Conference on Machine Learning, 2020

Neural Tangents: Fast and Easy Infinite Neural Networks in Python.
Proceedings of the 8th International Conference on Learning Representations, 2020
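
For context, a minimal sketch of this library's core API: `stax.serial` returns the finite-width init/apply pair together with the corresponding infinite-width kernel function. The architecture and shapes below are illustrative.

```python
from jax import random
from neural_tangents import stax

# The infinite-width kernel comes for free alongside the finite network.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x1 = random.normal(random.PRNGKey(0), (10, 32))
x2 = random.normal(random.PRNGKey(1), (20, 32))

# Closed-form NNGP and NTK kernels between the two input batches.
kernels = kernel_fn(x1, x2, ('nngp', 'ntk'))
nngp, ntk = kernels.nngp, kernels.ntk  # each of shape (10, 20)
```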

2019
Disentangling trainability and generalization in deep learning.
CoRR, 2019

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent.
CoRR, 2019

Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs.
CoRR, 2019

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent.
Advances in Neural Information Processing Systems 32, 2019
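
The central claim of this paper fits in one equation: as width tends to infinity, gradient-descent training of the network coincides with training its first-order Taylor expansion around the initial parameters. A sketch of the statement (notation mine):

```latex
% Linearization of the network around its initialization \theta_0:
f_t^{\mathrm{lin}}(x) = f_{\theta_0}(x)
  + \nabla_\theta f_{\theta_0}(x)^\top (\theta_t - \theta_0)
% In the infinite-width limit, the dynamics of this linear-in-parameters
% model, governed by the (constant) neural tangent kernel, match those of
% the full network under gradient descent.
```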

MetaInit: Initializing learning by learning to initialize.
Advances in Neural Information Processing Systems 32, 2019

A Mean Field Theory of Batch Normalization.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks.
Proceedings of the 35th International Conference on Machine Learning, 2018

Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks.
Proceedings of the 35th International Conference on Machine Learning, 2018

Deep Neural Networks as Gaussian Processes.
Proceedings of the 6th International Conference on Learning Representations, 2018

Adversarial Spheres.
Proceedings of the 6th International Conference on Learning Representations, 2018

Intriguing Properties of Adversarial Examples.
Proceedings of the 6th International Conference on Learning Representations, 2018

The emergence of spectral universality in deep networks.
Proceedings of the 21st International Conference on Artificial Intelligence and Statistics, 2018

2017
A Correspondence Between Random Neural Networks and Statistical Field Theory.
CoRR, 2017

Mean Field Residual Networks: On the Edge of Chaos.
Advances in Neural Information Processing Systems 30, 2017

Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice.
Advances in Neural Information Processing Systems 30, 2017

Neural Message Passing for Quantum Chemistry.
Proceedings of the 34th International Conference on Machine Learning, 2017

Deep Information Propagation.
Proceedings of the 5th International Conference on Learning Representations, 2017

Explaining the Learning Dynamics of Direct Feedback Alignment.
Proceedings of the 5th International Conference on Learning Representations, 2017

2009
Specialization as an optimal strategy under varying external conditions.
Proceedings of the IEEE International Conference on Robotics and Automation, 2009

