Shi-Xiong Zhang

CoRR, 2023

M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec.

Anton Ratnarajah

CoRR, 2023

3D Neural Beamforming for Multi-channel Speech Separation Against Location Uncertainty.

Rongzhi Gu

CoRR, 2023

MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

Deep Neural Mel-Subband Beamformer for in-Car Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2023

NeuralEcho: Hybrid of Full-Band and Sub-Band Recurrent Neural Network For Acoustic Echo Cancellation and Speech Enhancement.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement.

CoRR, 2022

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Joint Neural AEC and Beamforming with Double-Talk Detection.

Proceedings of the Interspeech 2022, 2022

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization.

Proceedings of the IEEE International Conference on Acoustics, 2022

Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI.

Proceedings of the IEEE International Conference on Acoustics, 2022

Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature.

Yiwen Shao

Proceedings of the IEEE International Conference on Acoustics, 2022

FAST-RIR: Fast Neural Diffuse Room Impulse Response Generator.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Complex Neural Spatial Filter: Enhancing Multi-Channel Target Speech Separation in Complex Domain.

IEEE Signal Process. Lett., 2021

Joint AEC and Beamforming with Double-Talk Detection using RNN-Transformer.

CoRR, 2021

Generalized RNN beamformer for target speech separation.

CoRR, 2021

WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Neural Mask based Multi-channel Convolutional Beamforming for Joint Dereverberation, Echo Cancellation and Denoising.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multi-Channel Speaker Verification for Single and Multi-Talker Speech.

Saurabh Kataria

Aswin Shanmugam Subramanian

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

ADL-MVDR: All Deep Learning MVDR Beamformer for Target Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2021

Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization.

Proceedings of the IEEE International Conference on Acoustics, 2021

3D Spatial Features for Multi-Channel Target Speech Separation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Audio-Visual Speech Separation and Dereverberation With a Two-Stage Multimodal Network.

IEEE J. Sel. Top. Signal Process., 2020

Multi-Modal Multi-Channel Target Speech Separation.

IEEE J. Sel. Top. Signal Process., 2020

Audio-Visual Multi-Channel Recognition of Overlapped Speech.

Proceedings of the Interspeech 2020, 2020

Neural Spatio-Temporal Beamformer for Target Speech Separation.

Proceedings of the Interspeech 2020, 2020

Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.

Proceedings of the Interspeech 2020, 2020

Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives.

Aswin Shanmugam Subramanian

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Self-Supervised Learning for Audio-Visual Speaker Diarization.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

A Unified Framework for Speech Separation.

Fahimeh Bahmaninezhad

CoRR, 2019

End-to-End Multi-Channel Speech Separation.

CoRR, 2019

Improved Speaker-Dependent Separation for CHiME-5 Challenge.

Proceedings of the Interspeech 2019, 2019

Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information.

Proceedings of the Interspeech 2019, 2019

A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation.

Fahimeh Bahmaninezhad

Proceedings of the Interspeech 2019, 2019

Encrypted Speech Recognition Using Deep Polynomial Networks.

Yifan Gong

Proceedings of the IEEE International Conference on Acoustics, 2019

Time Domain Audio Visual Speech Separation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Domain and Speaker Adaptation for Cortana Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Exploring Sequential Characteristics in Speaker Bottleneck Feature for Text-Dependent Speaker Verification.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Challenges in and Solutions to Deep Learning Network Acoustic Modeling in Speech Recognition Products at Microsoft.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

End-to-End attention based text-dependent speaker verification.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Recurrent support vector machines for speech recognition.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Simplifying long short-term memory acoustic models for fast training and decoding.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Deep neural support vector machines for speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Infinite structured support vector machines for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Structured SVMs for Automatic Speech Recognition.

IEEE Trans. Speech Audio Process., 2013

Kernelized log linear models for continuous speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

Investigation of multilingual deep neural networks for spoken term detection.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2011

Optimized Discriminative Kernel for SVM Scoring and Its Application to Speaker Verification.

IEEE Trans. Neural Networks, 2011

Structured Support Vector Machines for Noise Robust Continuous Speech Recognition.

Proceedings of the INTERSPEECH 2011, 2011

Extending noise robust structured support vector machines to larger vocabulary tasks.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Structured Log Linear Models for Noise Robust Speech Recognition.

Anton Ragni

IEEE Signal Process. Lett., 2010

2009

A new adaptation approach to high-level speaker-model creation in speaker verification.

Speech Commun., 2009

Optimization of discriminative kernels in SVM speaker verification.

Proceedings of the INTERSPEECH 2009, 2009

2008

High-level speaker verification via articulatory-feature based sequence kernels and SVM.

Proceedings of the INTERSPEECH 2008, 2008

2007

Speaker Verification via High-Level Feature Based Phonetic-Class Pronunciation Modeling.

Helen M. Meng

IEEE Trans. Computers, 2007

A New Adaptation Method for Speaker-Model Creation in High-Level Speaker Verification.

Proceedings of the Advances in Multimedia Information Processing, 2007

High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling.
