Sharan Narang

According to our database, Sharan Narang authored at least 30 papers between 2015 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
PaLM: Scaling Language Modeling with Pathways.
J. Mach. Learn. Res., 2023

Effective Long-Context Scaling of Foundation Models.
CoRR, 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models.
CoRR, 2023

A Theory on Adam Instability in Large-Scale Machine Learning.
CoRR, 2023

UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Self-Consistency Improves Chain of Thought Reasoning in Language Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Understanding HTML with Large Language Models.
Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Character-Aware Models Improve Visual Text Rendering.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models.
Trans. Assoc. Comput. Linguistics, 2022

FCM: Forgetful Causal Masking Makes Causal Language Models Better Zero-Shot Learners.
CoRR, 2022

Scaling Instruction-Finetuned Language Models.
CoRR, 2022

Scaling Up Models and Data with t5x and seqio.
CoRR, 2022

Scale Efficiently: Insights from Pretraining and Finetuning Transformers.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers.
CoRR, 2021

Do Transformer Modifications Transfer Across Implementations and Applications?
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.
J. Mach. Learn. Res., 2020

WT5?! Training Text-to-Text Models to Explain their Predictions.
CoRR, 2020

On Task-Level Dialogue Composition of Generative Transformer Model.
Proceedings of the First Workshop on Insights from Negative Results in NLP, 2020

2019
Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning.
CoRR, 2019

2018
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning.
Proceedings of the 6th International Conference on Learning Representations, 2018

Mixed Precision Training.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Deep Learning Scaling is Predictable, Empirically.
CoRR, 2017

Block-Sparse Recurrent Neural Networks.
CoRR, 2017

Deep Voice 3: 2000-Speaker Neural Text-to-Speech.
CoRR, 2017

Exploring Sparsity in Recurrent Neural Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

DSD: Dense-Sparse-Dense Training for Deep Neural Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
DSD: Regularizing Deep Neural Networks with Dense-Sparse-Dense Training Flow.
CoRR, 2016

2015
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin.
CoRR, 2015