Mohammad Shoeybi

According to our database, Mohammad Shoeybi authored at least 40 papers between 2006 and 2024.

Bibliography

2024
Nemotron-4 15B Technical Report.
CoRR, 2024

ODIN: Disentangled Reward Mitigates Hacking in RLHF.
CoRR, 2024

ChatQA: Building GPT-4 Level Conversational QA Models.
CoRR, 2024

2023
VILA: On Pre-training for Visual Language Models.
CoRR, 2023

InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining.
CoRR, 2023

Retrieval meets Long Context Large Language Models.
CoRR, 2023

RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models.
CoRR, 2023

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Context Generation Improves Open Domain Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

Adding Instructions during Pretraining: Effective way of Controlling Toxicity in Language Models.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

2022
FP8 Formats for Deep Learning.
CoRR, 2022

Factuality Enhanced Language Models for Open-Ended Text Generation.
CoRR, 2022

Reducing Activation Recomputation in Large Transformer Models.
CoRR, 2022

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model.
CoRR, 2022

Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Factuality Enhanced Language Models for Open-Ended Text Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Evaluating Parameter Efficient Learning for Generation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Multi-Stage Prompting for Knowledgeable Dialogue Generation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases.
CoRR, 2021

Efficient Large-Scale Language Model Training on GPU Clusters.
CoRR, 2021

Efficient large-scale language model training on GPU clusters using Megatron-LM.
Proceedings of the International Conference for High Performance Computing, 2021

Long-Short Transformer: Efficient Transformers for Language and Vision.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

End-to-End Training of Neural Retrievers for Open-Domain Question Answering.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Local Knowledge Powered Conversational Agents.
CoRR, 2020

Style Example-Guided Text Generation using Generative Adversarial Transformers.
CoRR, 2020

MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

BioMegatron: Larger Biomedical Domain Language Model.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Training Question Answering Models From Synthetic Data.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Large Scale Multi-Actor Generative Dialog Modeling.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Neural ODEs for Image Segmentation with Level Sets.
CoRR, 2019

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism.
CoRR, 2019

Unsupervised Video Interpolation Using Cycle Consistency.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2017
Trace norm regularization and faster inference for embedded speech recognition RNNs.
CoRR, 2017

Deep Voice: Real-time Neural Text-to-Speech.
CoRR, 2017

Deep Voice: Real-time Neural Text-to-Speech.
Proceedings of the 34th International Conference on Machine Learning, 2017

2010
An adaptive implicit-explicit scheme for the DNS and LES of compressible flows on unstructured grids.
J. Comput. Phys., 2010

2008
Stable and accurate schemes for the compressible Navier-Stokes equations.
J. Comput. Phys., 2008

2006
Towards Wall-Normal Filtering for Large-Eddy Simulation.
Multiscale Model. Simul., 2006

