Samuel L. Smith

According to our database, Samuel L. Smith authored at least 26 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models.
CoRR, 2024

2023
ConvNets Match Vision Transformers at Scale.
CoRR, 2023

Unlocking Accuracy and Fairness in Differentially Private Image Classification.
CoRR, 2023

On the Universality of Linear Recurrences Followed by Nonlinear Projections.
CoRR, 2023

Differentially Private Diffusion Models Generate Useful Synthetic Images.
CoRR, 2023

Resurrecting Recurrent Neural Networks for Long Sequences.
Proceedings of the International Conference on Machine Learning, 2023

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Unlocking High-Accuracy Differentially Private Image Classification through Scale.
CoRR, 2022

2021
A study on the plasticity of neural networks.
CoRR, 2021

Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error.
CoRR, 2021

High-Performance Large-Scale Image Recognition Without Normalization.
Proceedings of the 38th International Conference on Machine Learning, 2021

On the Origin of Implicit Regularization in Stochastic Gradient Descent.
Proceedings of the 9th International Conference on Learning Representations, 2021

Characterizing signal propagation to close the performance gap in unnormalized ResNets.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
BYOL works even without batch statistics.
CoRR, 2020

Cold Posteriors and Aleatoric Uncertainty.
CoRR, 2020

Batch Normalization Biases Deep Residual Networks Towards Shallow Paths.
CoRR, 2020

Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On the Generalization Benefit of Noise in Stochastic Gradient Descent.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Stochastic natural gradient descent draws posterior samples in function space.
CoRR, 2018

Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks.
Proceedings of the 6th International Conference on Learning Representations, 2018

A Bayesian Perspective on Generalization and Stochastic Gradient Descent.
Proceedings of the 6th International Conference on Learning Representations, 2018

Don't Decay the Learning Rate, Increase the Batch Size.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Don't Decay the Learning Rate, Increase the Batch Size.
CoRR, 2017

Offline bilingual word vectors, orthogonal transformations and the inverted softmax.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
Monte Carlo Sort for unreliable human comparisons.
CoRR, 2016

