Noam Shazeer

According to our database, Noam Shazeer authored at least 29 papers between 2010 and 2018.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2018
An Improved Relative Self-Attention Mechanism for Transformer with Application to Music Generation.
CoRR, 2018

Adafactor: Adaptive Learning Rates with Sublinear Memory Cost.
CoRR, 2018

Tensor2Tensor for Neural Machine Translation.
CoRR, 2018

Fast Decoding in Sequence Models using Discrete Latent Variables.
CoRR, 2018

Image Transformer.
CoRR, 2018

Generating Wikipedia by Summarizing Long Sequences.
CoRR, 2018

Adafactor: Adaptive Learning Rates with Sublinear Memory Cost.
Proceedings of the 35th International Conference on Machine Learning, 2018

Image Transformer.
Proceedings of the 35th International Conference on Machine Learning, 2018

Fast Decoding in Sequence Models Using Discrete Latent Variables.
Proceedings of the 35th International Conference on Machine Learning, 2018

Tensor2Tensor for Neural Machine Translation.
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas, 2018

The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Attention Is All You Need.
CoRR, 2017

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer.
CoRR, 2017

One Model To Learn Them All.
CoRR, 2017

Attention is All you Need.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016
Sparse Non-negative Matrix Language Modeling.
TACL, 2016

Swivel: Improving Embeddings by Noticing What's Missing.
CoRR, 2016

Exploring the Limits of Language Modeling.
CoRR, 2016

NN-grams: Unifying neural network and n-gram language models for Speech Recognition.
CoRR, 2016

NN-Grams: Unifying Neural Network and n-Gram Language Models for Speech Recognition.
Proceedings of the Interspeech 2016, 2016

End-to-end text-dependent speaker verification.
Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

2015
End-to-End Text-Dependent Speaker Verification.
CoRR, 2015

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks.
CoRR, 2015

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Sparse non-negative matrix language modeling for skip-grams.
Proceedings of the INTERSPEECH 2015, 2015

Pruning sparse non-negative matrix n-gram language models.
Proceedings of the INTERSPEECH 2015, 2015

Sparse non-negative matrix language modeling for geo-annotated query session data.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation.
CoRR, 2014

2010
Variational Program Inference.
CoRR, 2010

