Shubham Toshniwal

According to our database, Shubham Toshniwal authored at least 35 papers between 2015 and 2025.

Bibliography

2025
GenSelect: A Generative Approach to Best-of-N.
CoRR, July, 2025

The Challenge of Teaching Reasoning to LLMs Without RL or Distillation.
CoRR, July, 2025

Llama-Nemotron: Efficient Reasoning Models.
CoRR, May, 2025

AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset.
CoRR, April, 2025

IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark for LLMs.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark.
CoRR, 2024

Major Entity Identification: A Generalizable Alternative to Coreference Resolution.
CoRR, 2024

Nemotron-4 340B Technical Report.
CoRR, 2024

Code Pretraining Improves Entity Tracking Abilities of Language Models.
CoRR, 2024

OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Major Entity Identification: A Generalizable Alternative to Coreference Resolution.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

Learning to Reason and Memorize with Self-Notes.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Adapting Pretrained Text-to-Text Models for Long Text Sequences.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Robustness of Named-Entity Replacements for In-Context Learning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
Efficient and Interpretable Neural Models for Entity Tracking.
CoRR, 2022

Baked-in State Probing.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Chess as a Testbed for Language Model State Tracking.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
On Generalization in Coreference Resolution.
CoRR, 2021

Learning Chess Blindfolded: Evaluating Language Models on State Tracking.
CoRR, 2021

2020
A Cross-Task Analysis of Text Span Representations.
Proceedings of the 5th Workshop on Representation Learning for NLP, 2020

Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

PeTra: A Sparsely Supervised Memory Model for People Tracking.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.
CoRR, 2019

Pre-Trained Text Embeddings for Enhanced Text-to-Speech Synthesis.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018
Hierarchical Multitask Learning for CTC-based Speech Recognition.
CoRR, 2018

A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Parsing Speech: a Neural Approach to Integrating Lexical and Acoustic-Prosodic Information.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Multilingual Speech Recognition with a Single End-to-End Model.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017
Joint Modeling of Text and Acoustic-Prosodic Cues for Neural Parsing.
CoRR, 2017

Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
Jointly learning to align and convert graphemes to phonemes with neural attention models.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

2015
VibRein: An Engaging and Assistive Mobile Learning Companion for Students with Intellectual Disabilities.
Proceedings of the Annual Meeting of the Australian Special Interest Group for Computer Human Interaction, 2015

USHER: An Intelligent Tour Companion.
Proceedings of the 20th International Conference on Intelligent User Interfaces Companion, 2015

