Masakiyo Fujimoto

Hisashi Kawai

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

Automatic Speech Recognition.

Xugang Lu

Sheng Li

Proceedings of the Speech-to-Speech Translation, 2020

2019

One-Pass Single-Channel Noisy Speech Recognition Using a Combination of Noisy and Enhanced Features.

Hisashi Kawai

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018

Comparative Evaluations of Various Factored Deep Convolutional RNN Architectures for Noise Robust Speech Recognition.

Hisashi Kawai

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Integration of Spatial Cue-Based Noise Reduction and Speech Model-Based Source Restoration for Real Time Speech Enhancement.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2017

Factored Deep Convolutional Neural Networks for Noise Robust Speech Recognition.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Multichannel Speech Enhancement Approaches to DNN-Based Far-Field Speech Recognition.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

A generative-discriminative hybrid approach to multi-channel noise reduction for robust automatic speech recognition.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Real-time integration of statistical model-based speech enhancement with unsupervised noise PSD estimation using microphone array.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Multi-pass feature enhancement based on generative-discriminative hybrid approach for noise robust speech recognition.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Acoustic Event Detection in Speech Overlapping Scenarios Based on High-Resolution Spectral Input and Deep Learning.

Miquel Espi

IEICE Trans. Inf. Syst., 2015

Strategies for distant speech recognition in reverberant environments.

EURASIP J. Adv. Signal Process., 2015

Exploiting spectro-temporal locality in deep learning based acoustic event detection.

EURASIP J. Audio Speech Music. Process., 2015

Feature extraction strategies in deep learning based acoustic event detection.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Feature enhancement based on generative-discriminative hybrid approach with GMMs and DNNs for noise robust speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Exploring multi-channel features for denoising-autoencoder-based speech enhancement.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Unsupervised non-parametric Bayesian modeling of non-stationary noise for model-based noise suppression.

Yotaro Kubo

Proceedings of the IEEE International Conference on Acoustics, 2014

Spectrogram patch based acoustic event detection and classification in speech overlapping conditions.

Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Defeating reverberation: Advanced dereverberation and recognition techniques for hands-free speech recognition.

Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014

2013

Dominance Based Integration of Spatial and Spectral Features for Speech Enhancement.

IEEE ACM Trans. Audio Speech Lang. Process., 2013

Prior-shared feature and model space speaker adaptation by consistently employing map estimation.

Speech Commun., 2013

Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds.

Comput. Speech Lang., 2013

Model-based noise suppression using unsupervised estimation of hidden Markov model for non-stationary noise.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Feature space variational Bayesian linear regression and its combination with model space VBLR.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera.

IEEE Trans. Speech Audio Process., 2012

Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection.

Speech Commun., 2012

Speaker Adaptation Using Variational Bayesian Linear Regression in Normalized Feature Space.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

A tandem connectionist model using combination of multi-scale spectro-temporal features for acoustic event detection.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Reduction of Highly Nonstationary Ambient Noise by Integrating Spectral and Locational Characteristics of Speech and Noise for Robust ASR.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Robust Estimation Method of Noise Mixture Model for Noise Suppression.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Joint unsupervised learning of hidden Markov source models and source location models for multichannel source separation.

Proceedings of the IEEE International Conference on Acoustics, 2011

Non-stationary noise estimation method based on bias-residual component decomposition for robust speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Noise robust voice activity detection based on periodic to aperiodic component ratio.

Speech Commun., 2010

Real-time meeting recognition and understanding using distant microphones and omni-directional camera.

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition.

Proceedings of the Auditory-Visual Speech Processing, 2010

2009

A study of mutual front-end processing method based on statistical model for noise robust speech recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Realtime meeting analysis and 3D meeting viewer based on omnidirectional multimodal sensors.

Proceedings of the 11th International Conference on Multimodal Interfaces, 2009

A speaker diarization method based on the probabilistic fusion of audio-visual location information.

Proceedings of the 11th International Conference on Multimodal Interfaces, 2009

2008

Voice activity detection based on adjustable linear prediction and GARCH models.

Hiroko Kato Solvang

Speech Commun., 2008

Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments: newest Part of the CENSREC Series -.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Study of integration of statistical model-based voice activity detection and noise suppression.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization.

Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

A voice activity detection based on the adaptive integration of multiple speech features and a signal decision scheme.

Proceedings of the IEEE International Conference on Acoustics, 2008

Speaker indexing and speech enhancement in real meetings / conversations.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Combination of GMM-based speech estimation method and temporal domain SVD-based speech enhancement for noise robust speech recognition.

Syst. Comput. Jpn., 2007

Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Noise robust voice activity detection based on switching Kalman filter.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Two-Microphone Voice Activity Detection Based on the Homogeneity of the Direction of Arrival Estimates.

Proceedings of the IEEE International Conference on Acoustics, 2007

Noise Robust Voice Activity Detection Based on Statistical Model and Parallel Non-Linear Kalman Filtering.

Hiroko Kato Solvang

Proceedings of the IEEE International Conference on Acoustics, 2007

Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

CENSREC-3: An Evaluation Framework for Japanese Speech Recognition in Real Car-Driving Environments.

Kazuya Takeda

IEICE Trans. Inf. Syst., 2006

A Non-stationary Noise Suppression Method Based on Particle Filtering and Polyak Averaging.

IEICE Trans. Inf. Syst., 2006

CENSREC2: corpus and evaluation environments for in car continuous digit speech recognition.

Kazuya Takeda

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Sequential Non-Stationary Noise Tracking Using Particle Filtering with Switching Dynamical System.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Recognition of speech from live sports coverage using acoustic and language model adaptation.

Syst. Comput. Jpn., 2005

AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition.

IEICE Trans. Inf. Syst., 2005

CENSREC-3: Data Collection for In-Car Speech Recognition and Its Common Evaluation Framework.

Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

Particle Filter Based Non-Stationary Noise Tracking for Robust Speech Recognition.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Speech recognition in a noisy environment using a speech signal estimation method based on the Kalman filter.

Syst. Comput. Jpn., 2004

Robust speech recognition in additive and channel noise environments using GMM and EM algorithm.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

Integration of noise reduction algorithms for Aurora2 task.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Combination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - evaluation on the AURORA2 task -.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Live speech recognition in sports games by adaptation of acoustic model and language model.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Evaluation of noisy speech recognition based on noise reduction and acoustic model adaptation on the Aurora2 tasks.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Noise robust hands-free speech recognition using microphone array and Kalman filter as front-end system of conversational TV.

Proceedings of the IEEE 5th Workshop on Multimedia Signal Processing, 2002

2001

Speech recognition under musical environments using Kalman filter and iterative MLLR adaptation.

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Continuous speech recognition under non-stationary musical environments based on speech state transition model.

Proceedings of the IEEE International Conference on Acoustics, 2001

2000

Noisy speech recognition using noise reduction method based on Kalman filter.
