Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Towards measuring fairness in speech recognition: Fair-Speech dataset.

Irina-Elena Veliche

Zhuangqun Huang

Vineeth Ayyat Kochaniyan

Fuchun Peng

Ozlem Kalinli

Michael L. Seltzer

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Evaluating Speech Recognition Performance Towards Large Language Model Based Voice Assistants.

Zhe Liu

Suyoun Kim

Ozlem Kalinli

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model.

Proceedings of the IEEE International Conference on Acoustics, 2024

Recovering from Privacy-Preserving Masking with Large Language Models.

Proceedings of the IEEE International Conference on Acoustics, 2024

Contextual Biasing of Named-Entities with Large Language Models.

Proceedings of the IEEE International Conference on Acoustics, 2024

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-Device ASR Models.

Raghuraman Krishnamoorthi

Proceedings of the IEEE International Conference on Acoustics, 2024

Correction Focused Language Model Training For Speech Recognition.

Yingyi Ma

Zhe Liu

Ozlem Kalinli

Proceedings of the IEEE International Conference on Acoustics, 2024

End-to-End Speech Recognition Contextualization with Large Language Models.

Proceedings of the IEEE International Conference on Acoustics, 2024

Effective Internal Language Model Training and Fusion for Factorized Transducer Model.

Proceedings of the IEEE International Conference on Acoustics, 2024

Prompting Large Language Models with Speech Recognition Abilities.

Proceedings of the IEEE International Conference on Acoustics, 2024

Forgetting Private Textual Sequences in Language Models Via Leave-One-Out Ensemble.

Zhe Liu

Ozlem Kalinli

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data.

CoRR, 2023

Augmenting text for spoken language understanding with Large Language Models.

CoRR, 2023

Towards Selection of Text-to-speech Data to Augment ASR Training.

CoRR, 2023

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-Head State Space Model for Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Learning ASR Pathways: A Sparse Multilingual ASR Model.

Proceedings of the IEEE International Conference on Acoustics, 2023

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities.

Proceedings of the IEEE International Conference on Acoustics, 2023

Anchored Speech Recognition with Neural Transducers.

Proceedings of the IEEE International Conference on Acoustics, 2023

Improving fast-slow Encoder based Transducer with Streaming Deliberation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers.

Proceedings of the IEEE International Conference on Acoustics, 2023

Joint Federated Learning and Personalization for on-Device ASR.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition.

CoRR, 2022

Learning ASR pathways: A sparse multilingual ASR model.

CoRR, 2022

Learning a Dual-Mode Speech Recognition Model VIA Self-Pruning.

Raghuraman Krishnamoorthi

Ozlem Kalinli

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Scaling ASR Improves Zero and Few Shot Learning.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming parallel transducer beam search with fast slow cascaded encoders.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Deliberation Model for On-Device Spoken Language Understanding.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Federated Domain Adaptation for ASR with Full Self-Supervision.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Omni-Sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR Via Supernet.

Proceedings of the IEEE International Conference on Acoustics, 2022

Streaming Transformer Transducer based Speech Recognition Using Non-Causal Convolution.

Proceedings of the IEEE International Conference on Acoustics, 2022

Neural-FST Class Language Model for End-to-End Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2022

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study.

CoRR, 2021

Noisy Training Improves E2E ASR for the Edge.

CoRR, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.

CoRR, 2021

Transformer-Based Acoustic Modeling for Streaming Speech Synthesis.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Collaborative Training of Acoustic Encoders for Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

SapAugment: Learning A Sample Adaptive Policy for Data Augmentation.

Proceedings of the IEEE International Conference on Acoustics, 2021

2019

Bandwidth Embeddings for Mixed-Bandwidth Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Parametric Cepstral Mean Normalization for Robust Speech Recognition.

Ozlem Kalinli

Gautam Bhattacharya

Chao Weng

Proceedings of the IEEE International Conference on Acoustics, 2019

2016

Analysis of Multi-Lingual Emotion Recognition Using Auditory Attention Features.

Ozlem Kalinli

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

Emotion clustering based on probabilistic linear discriminant analysis.

Mahnoosh Mehrabani

Ozlem Kalinli

Ruxin Chen

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2013

Combination of auditory attention features with phone posteriors for better automatic phoneme segmentation.

Ozlem Kalinli

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Automatic Phoneme Segmentation Using Auditory Attention Features.

Ozlem Kalinli

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

Syllable Segmentation of Continuous Speech Using Auditory Attention Cues.

Ozlem Kalinli

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Tone and pitch accent classification using auditory attention cues.

Ozlem Kalinli

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Noise Adaptive Training for Robust Automatic Speech Recognition.

IEEE Trans. Speech Audio Process., 2010

2009

Prominence Detection Using Auditory Attention Cues and Task-Dependent High Level Information.

Ozlem Kalinli

Shrikanth S. Narayanan

IEEE Trans. Speech Audio Process., 2009

Saliency-driven unstructured acoustic scene classification using latent perceptual indexing.

Ozlem Kalinli

Shiva Sundaram

Shrikanth S. Narayanan

Proceedings of the 2009 IEEE International Workshop on Multimedia Signal Processing, 2009

Continuous speech recognition using attention shift decoding with soft decision.

Ozlem Kalinli

Shrikanth S. Narayanan

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Noise adaptive training using a vector taylor series approach for noise robust automatic speech recognition.

Ozlem Kalinli

Michael L. Seltzer

Alex Acero

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Combining task-dependent information with auditory attention cues for prominence detection in speech.

Ozlem Kalinli

Shrikanth S. Narayanan

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A top-down auditory attention model for learning task dependent influences on prominence detection in speech.

Ozlem Kalinli

Shrikanth S. Narayanan

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech.

Ozlem Kalinli

Shrikanth S. Narayanan

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Early auditory processing inspired features for robust automatic speech recognition.

Ozlem Kalinli

Shrikanth S. Narayanan

Proceedings of the 15th European Signal Processing Conference, 2007

Ozlem Kalinli
