Rama Sanand Doddipatla

CoRR, 2024

2023

Evaluating Large Language Models for Document-grounded Response Generation in Information-Seeking Dialogues.

CoRR, 2023

Adversarial learning of neural user simulators for dialogue policy optimisation.

Caroline Dockes

CoRR, 2023

A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures.

CoRR, 2023

On the Effectiveness of Monoaural Target Source Extraction for Distant end-to-end Automatic Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

Cumulative Attention Based Streaming Transformer ASR with Internal Language Model Joint Training and Rescoring.

Cong-Thanh Do

Proceedings of the IEEE International Conference on Acoustics, 2023

Frame-Wise and Overlap-Robust Speaker Embeddings for Meeting Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2023

Enabling Semi-Structured Knowledge Access via a Question-Answering Module in Task-oriented Dialogue Systems.

Proceedings of the 5th International Conference on Conversational User Interfaces, 2023

Towards a Unified End-to-End Language Understanding System for Speech and Text Inputs.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Robust Recognition of Speaker Emotion With Difference Feature Extraction Using a Few Enrollment Utterances.

Daichi Hayakawa

Takehiko Kagoshima

Kenji Iwata

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Factors in Emotion Recognition With Deep Learning Models Using Speech and Text on Multiple Corpora.

IEEE Signal Process. Lett., 2022

Non-Autoregressive End-to-End Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Combining Structured and Unstructured Knowledge in an Interactive Search Dialogue System.

Suraj Pandey

Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022

Monaural Source Separation: From Anechoic To Reverberant Environments.

Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training.

Proceedings of the Interspeech 2022, 2022

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition.

Proceedings of the Interspeech 2022, 2022

Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer.

Cong-Thanh Do

Proceedings of the Interspeech 2022, 2022

Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2022

Transformer-Based Streaming ASR with Cumulative Attention.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

An Investigation into the Multi-channel Time Domain Speaker Extraction Network.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Transformer-Based Online Speech Recognition with Decoder-end Adaptive Computation Steps.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Towards Handling Unconstrained User Preferences in Dialogue.

Suraj Pandey

Proceedings of the Conversational AI for Natural Human-Centric Interaction, 2021

Teacher-Student MixIT for Unsupervised and Semi-Supervised Speech Separation.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism.

Proceedings of the IEEE International Conference on Acoustics, 2021

Train Your Classifier First: Cascade Neural Networks Training from Upper Layers to Lower Layers.

Proceedings of the IEEE International Conference on Acoustics, 2021

Action State Update Approach to Dialogue Management.

Proceedings of the IEEE International Conference on Acoustics, 2021

Head-Synchronous Decoding for Transformer-Based Streaming ASR.

Proceedings of the IEEE International Conference on Acoustics, 2021

Multiple-Hypothesis CTC-Based Semi-Supervised Adaptation of End-to-End Speech Recognition.

Cong-Thanh Do

Proceedings of the IEEE International Conference on Acoustics, 2021

Improving HS-DACS Based Streaming Transformer ASR with Deep Reinforcement Learning.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Dialogue Strategy Adaptation to New Action Sets Using Multi-Dimensional Modelling.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

A Study on Cross-Corpus Speech Emotion Recognition and Data Augmentation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Learning Noise Invariant Features Through Transfer Learning For Robust End-to-End Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

On Reducing the Effect of Speaker Overlap for CHiME-5.

Proceedings of the IEEE International Conference on Acoustics, 2019

An Unsupervised Learning Approach to Neural-net-supported WPE Dereverberation.

Proceedings of the IEEE International Conference on Acoustics, 2019

An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2017

Speaker Adaptation in DNN-Based Speech Synthesis Using d-Vectors.

Ranniery Maia

Proceedings of the Interspeech 2017, 2017

2016

Speaker adaptive training in deep neural networks using speaker dependent bottleneck features.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

The USFD Spoken Language Translation System for IWSLT 2014.

CoRR, 2015

Noise-matched training of CRF based sentence end detection models.

Madina Hasan

Proceedings of the INTERSPEECH 2015, 2015

2014

The USFD SLT system for IWSLT 2014.

Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2014, 2014

Multi-pass sentence-end detection of lecture speech.

Madina Hasan

Proceedings of the INTERSPEECH 2014, 2014

Speaker dependent bottleneck layer training for speaker adaptation in automatic speech recognition.

Madina Hasan

Proceedings of the INTERSPEECH 2014, 2014

2013

Non-negative durational HMM.

Jarle Bauck Hamar

Torbjørn Svendsen

Thippur Sreenivas

Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2013

Synthetic speaker models using VTLN to improve the performance of children in mismatched speaker conditions for ASR.

Torbjørn Svendsen

Proceedings of the INTERSPEECH 2013, 2013

2012

VTLN Using Analytically Determined Linear-Transformation on Conventional MFCC.

IEEE Trans. Speech Audio Process., 2012

Creating synthetic voices for children by adapting adult average voice using stacked transformations and VTLN.

Reima Karhila

Mikko Kurimo

Peter Smit

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

A Study on Combining VTLN and SAT to Improve the Performance of Automatic Speech Recognition.

Mikko Kurimo

Proceedings of the INTERSPEECH 2011, 2011

2010

Revisiting VTLN using linear transformation on conventional MFCC.

Ralf Schlüter

Hermann Ney

Proceedings of the INTERSPEECH 2010, 2010

2009

A study on the influence of covariance adaptation on Jacobian compensation in vocal tract length normalization.

Shakti Prasad Rath

Proceedings of the INTERSPEECH 2009, 2009

Characterizing speaker variability using spectral envelopes of vowel sounds.

A. N. Harish

Proceedings of the INTERSPEECH 2009, 2009

Improving the performance of VTLN under mismatched speaker conditions and making it approach that of matched speaker conditions.

Shakti Prasad Rath

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Study of Jacobian compensation using linear transformation of conventional MFCC for VTLN.

Proceedings of the INTERSPEECH 2008, 2008

Use of spectral centre of gravity for generating speaker invariant features for automatic speech recognition.

Proceedings of the INTERSPEECH 2008, 2008

A computationally efficient approach to warp factor estimation in VTLN using EM algorithm and sufficient statistics.

Proceedings of the INTERSPEECH 2008, 2008

2007

Linear transformation approach to VTLN using dynamic frequency warping.

D. Dinesh Kumar

Proceedings of the INTERSPEECH 2007, 2007

Speaker-Invariant Features for Automatic Speech Recognition.
