Igor Gitman

According to our database, Igor Gitman authored at least 18 papers between 2017 and 2025.

Bibliography

2025
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model.
CoRR, August, 2025

GenSelect: A Generative Approach to Best-of-N.
CoRR, July, 2025

The Challenge of Teaching Reasoning to LLMs Without RL or Distillation.
CoRR, July, 2025

Llama-Nemotron: Efficient Reasoning Models.
CoRR, May, 2025

NeMo-Inspector: A Visualization Tool for LLM Generation Analysis.
CoRR, May, 2025

AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset.
CoRR, April, 2025

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models.
CoRR, April, 2025

OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Nemotron-4 340B Technical Report.
CoRR, 2024

OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Confidence-based Ensembles of End-to-End Speech Recognition Models.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Powerful and Extensible WFST Framework for RNN-Transducer Losses.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2019
Understanding the Role of Momentum in Stochastic Gradient Methods.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018
OpenSeq2Seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models.
CoRR, 2018

Novel Prediction Techniques Based on Clusterwise Linear Regression.
CoRR, 2018

Convergence Analysis of Gradient Descent Algorithms with Proportional Updates.
CoRR, 2018

2017
Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification.
CoRR, 2017

Scaling SGD Batch Size to 32K for ImageNet Training.
CoRR, 2017
