Erich Elsen

According to our database, Erich Elsen authored at least 37 papers between 2006 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
Training Compute-Optimal Large Language Models.
CoRR, 2022

Unified Scaling Laws for Routed Language Models.
CoRR, 2022

An empirical analysis of compute-optimal large language model training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The State of Sparse Training in Deep Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Step-unrolled Denoising Autoencoders for Text Generation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Scaling Language Models: Methods, Analysis & Insights from Training Gopher.
CoRR, 2021

Practical Real Time Recurrent Learning with a Sparse Approximation.
Proceedings of the 9th International Conference on Learning Representations, 2021

End-to-end Adversarial Text-to-Speech.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
AlgebraNets.
CoRR, 2020

A Practical Sparse Approximation for Real Time Recurrent Learning.
CoRR, 2020

Sparse GPU kernels for deep learning.
Proceedings of the International Conference for High Performance Computing, 2020

Top-KAST: Top-K Always Sparse Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On the Generalization Benefit of Noise in Stochastic Gradient Descent.
Proceedings of the 37th International Conference on Machine Learning, 2020

Rigging the Lottery: Making All Tickets Winners.
Proceedings of the 37th International Conference on Machine Learning, 2020

High Fidelity Speech Synthesis with Adversarial Networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

Fast Sparse ConvNets.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
The Difficulty of Training Sparse Neural Networks.
CoRR, 2019

Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods.
CoRR, 2019

The State of Sparsity in Deep Neural Networks.
CoRR, 2019

Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
Onsets and Frames: Dual-Objective Piano Transcription.
Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018

Efficient Neural Audio Synthesis.
Proceedings of the 35th International Conference on Machine Learning, 2018

Mixed Precision Training.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Exploring Sparsity in Recurrent Neural Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

DSD: Dense-Sparse-Dense Training for Deep Neural Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
DSD: Regularizing Deep Neural Networks with Dense-Sparse-Dense Training Flow.
CoRR, 2016

Persistent RNNs: Stashing Recurrent Weights On-Chip.
Proceedings of the 33rd International Conference on Machine Learning, 2016

2015
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin.
CoRR, 2015

2014
Deep Speech: Scaling up end-to-end speech recognition.
CoRR, 2014

2011
Liszt: a domain specific language for building portable mesh-based PDE solvers.
Proceedings of the Conference on High Performance Computing Networking, 2011

2008
Large calculation of the flow over a hypersonic vehicle using a GPU.
J. Comput. Phys., 2008

2007
N-Body Simulations on GPUs.
CoRR, 2007

2006
Poster reception - N-Body simulation on GPUs.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

