Ehsan Variani

According to our database¹, Ehsan Variani authored at least 36 papers between 2011 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini.

Madhuri Shanbhogue

Zhe Li

Shanfeng Zhang

Gustavo Hernández Ábrego

Henrique Schechter Vera

Mojtaba Seyedhosseini

CoRR, May, 2026

Benchmarking LLMs on the Massive Sound Embedding Benchmark (MSEB).

CoRR, May, 2026

Massive Sound Embedding Benchmark (MSEB).

CoRR, February, 2026

2023

LAST: Scalable Lattice-Based Speech Modelling in JAX.

Proceedings of the IEEE International Conference on Acoustics, 2023

Alignment Entropy Regularization.

Proceedings of the IEEE International Conference on Acoustics, 2023

JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Modular Hybrid Autoregressive Transducer.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Global Normalization for Streaming Speech Recognition in a Modular Framework.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Improving Rare Word Recognition with LM-aware MWER Training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

On Adaptive Weight Interpolation of the Hybrid Autoregressive Transducer.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

UserLibri: A Dataset for ASR Personalization Using Only Text.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multilingual Second-Pass Rescoring for Automatic Speech Recognition Systems.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Hybrid Seq-2-Seq ASR Design for On-Device and Server Applications.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cascaded Encoders for Unifying Streaming and Non-Streaming ASR.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Hybrid Autoregressive Transducer (HAT).

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Neural Oracle Search on N-BEST Hypotheses.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

WEST: Word Encoded Sequence Transducers.

Ehsan Variani

Ananda Theertha Suresh

Mitchel Weintraub

Proceedings of the IEEE International Conference on Acoustics, 2019

A Density Ratio Approach to Language Model Fusion in End-to-End Automatic Speech Recognition.

Erik McDermott

Hasim Sak

Ehsan Variani

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Sampled Connectionist Temporal Classification.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Acoustic Modeling for Google Home.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Raw Multichannel Processing Using Deep Neural Networks.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

Stream fusion for multi-stream automatic speech recognition.

Int. J. Speech Technol., 2016

Complex Linear Projection (CLP): A Discriminative Approach to Joint Feature Extraction and Acoustic Modeling.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

Non-Adaptative Policies for 20 Questions Target Localization.

CoRR, 2015

Non-adaptive policies for 20 questions target localization.

Proceedings of the IEEE International Symposium on Information Theory, 2015

A Gaussian Mixture Model layer jointly optimized with discriminative features within a Deep Neural Network architecture.

Ehsan Variani

Erik McDermott

Georg Heigold

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Deep neural networks for small footprint text-dependent speaker verification.

Javier Gonzalez-Dominguez

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Multi-stream recognition of noisy speech with performance monitoring.

Ehsan Variani

Feipeng Li

Hynek Hermansky

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Mean temporal distance: Predicting ASR error from temporal properties of speech signal.

Hynek Hermansky

Ehsan Variani

Vijayaditya Peddinti

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Estimating Classifier Performance in Unknown Noise.

Ehsan Variani

Hynek Hermansky

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

VTLN in the MFCC Domain: Band-Limited versus Local Interpolation.

Ehsan Variani

Thomas Schaaf

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
