Kazumasa Yamamoto

Akinori Ishiki

Proceedings of the 10th IEEE Global Conference on Consumer Electronics, 2021

2020

Effectiveness of Fine Linear Frequency Spectral Feature for Acoustic Event Detection.

Ryo Yamamoto

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

2019

Discriminative Learning of Filterbank Layer within Deep Neural Network Based Speech Recognition for Speaker Adaptation.

IEICE Trans. Inf. Syst., 2019

Learning Position Evaluation Functions Used in Monte Carlo Softmax Search.

Harukazu Igarashi

Yuichi Morioka

CoRR, 2019

Evaluation of Real Robot Agent Interface for Spoken Dialogue System.

Akira Tamagawa

Proceedings of the IEEE 8th Global Conference on Consumer Electronics, 2019

2018

Rapid Speaker Adaptation of Neural Network Based Filterbank Layer for Automatic Speech Recognition.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

2017

Automatic Explanation Spot Estimation Method Targeted at Text and Figures in Lecture Slides.

Shoko Tsujimura

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A deep neural network integrated with filterbank learning for speech recognition.

Hiroshi Seki

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Lyric recognition in monophonic singing using pitch-dependent DNN.

Dairoku Kawai

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Detection of overlapping acoustic events based on NMF with shared basis vectors.

Proceedings of the IEEE 6th Global Conference on Consumer Electronics, 2017

2016

Speech analysis of sung-speech and lyric recognition in monophonic singing.

Dairoku Kawai

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Investigation of glottal features and annotation procedures for speech emotion recognition.

Masaaki Takebe

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Domain adaptation of a speech translation system for lectures by utilizing frequently appearing parallel phrases in-domain.

Norioki Goto

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction.

Akihiro Abe

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Combination of syllable based N-gram search and word search for spoken term detection through spoken queries and IV/OOV classification.

Nagisa Sakamoto

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Deep neural network based acoustic model using speaker-class information for short time utterance.

Hiroshi Seki

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Speech recognition for mixed speech and music by NMF using various cost functions and noise adaptive training methods.

Naoaki Hashimoto

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014

Architecture and Evaluation of Low Power Many-Core SoC with Two 32-Core Clusters.

IEICE Trans. Electron., 2014

Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition.

Aditya Arie Nugraha

EURASIP J. Audio Speech Music. Process., 2014

Spoken Term Detection Based on a Syllable N-gram Index at the NTCIR-11 SpokenQuery&Doc Task.

Nagisa Sakamoto

Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, 2014

Speech recognition based on Itakura-Saito divergence and dynamics/sparseness constraints from mixed sound of speech and music by non-negative matrix factorization.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013

A robust/fast spoken term detection method based on a syllable n-gram index with a distance metric.

Speech Commun., 2013

Development and Evaluation of Spoken Dialog Systems with One or Two Agents through Two Domains.

Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Development and evaluation of spoken dialog systems with one or two agents.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Speaker tracking with spherical microphone arrays.

Proceedings of the IEEE International Conference on Acoustics, 2013

Single channel dereverberation method in log-melspectral domain using limited stereo data for distant speaker identification.

Aditya Arie Nugraha

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Fast NMF based approach and VQ based approach using MFCC distance measure for speech recognition from mixed sound.

Shoichi Nakano

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012

Hidden Conditional Neural Fields for Continuous Phoneme Speech Recognition.

IEICE Trans. Inf. Syst., 2012

Improving the Readability of ASR Results for Lectures Using Multiple Hypotheses and Sentence-Level Knowledge.

IEICE Trans. Inf. Syst., 2012

A low power many-core SoC with two 32-core clusters connected by tree based NoC for multimedia applications.

Proceedings of the Symposium on VLSI Circuits, 2012

Development of large vocabulary continuous speech recognition system for Mongolian language.

Proceedings of the Third Workshop on Spoken Language Technologies for Under-resourced Languages, 2012

Fast NMF based approach and improved VQ based approach for speech recognition from mixed sound.

Shoichi Nakano

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Microphone array processing for distant speech recognition: Towards real-world deployment.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Soft-clustering technique for training data in age- and gender-independent speech recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Speech Recognition in Mixed Sound of Speech and Music Based on Vector Quantization and Non-Negative Matrix Factorization.

Shoichi Nakano

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Hidden Boosted MMI and Hierarchical State Posterior Feature for Automatic Speech Recognition Based on Hidden Conditional Neural Fields.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Efficient out-of-vocabulary term detection by n-gram array indices with distance from a syllable lattice.

Proceedings of the IEEE International Conference on Acoustics, 2011

Automatic speech recognition using Hidden Conditional Neural Fields.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions.

IEICE Trans. Inf. Syst., 2010

Distant Speech Recognition Using a Microphone Array Network.

Alberto Yoshihiro Nakano

IEICE Trans. Inf. Syst., 2010

Out-of-vocabulary term detection by n-gram array with distance from continuous syllable recognition results.

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Evaluation of Privacy Protection Techniques for Speech Signals.

Proceedings of the Information Processing and Management of Uncertainty in Knowledge-Based Systems. Applications, 2010

Speech recognition using long-term phase information.

Eiichi Sueyoshi

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Improving the readability of class lecture ASR results using a confusion network.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Speaker identification by combining MFCC and phase information in noisy environments.

Proceedings of the IEEE International Conference on Acoustics, 2010

CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition.

Proceedings of the Auditory-Visual Speech Processing, 2010

2009

Estimating the position and orientation of an acoustic source with a microphone array network.

Alberto Yoshihiro Nakano

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Privacy Protection for Speech Information.

Proceedings of the Fifth International Conference on Information Assurance and Security, 2009

2008

Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments: Newest Part of the CENSREC Series.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speech recognition performance of CJLC: corpus of Japanese lecture contents.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Class lecture summarization taking into account consecutiveness of important sentences.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2007

Mel-Wiener Filter for Mel-LPC Based Speech Recognition.

Md. Babul Islam

IEICE Trans. Inf. Syst., 2007

Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

An improved mel-wiener filter for mel-LPC based speech recognition.

Md. Babul Islam

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition.

IEICE Trans. Inf. Syst., 2005

CENSREC-3: Data Collection for In-Car Speech Recognition and Its Common Evaluation Framework.

Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

2003

Integration of noise reduction algorithms for Aurora2 task.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Speech recognition under noisy environments using segmental unit input HMM.

Syst. Comput. Jpn., 2002

Differences of speech rate, interphoneme distance and likelihood caused by speaking style, their relationship, and recognition performance.

Syst. Comput. Jpn., 2002

2001

Evaluation of a generalized dynamic cepstrum in distant speech recognition.

Akihiko Shimizu

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000

Relationship among speaking style, inter-phoneme's distance and speech recognition performance.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Forward masking on a generalized logarithmic scale for robust speech recognition.

Yoshihiro Ito

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

HMM composition of segmental unit input HMM for noisy speech recognition.

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998

Continuous speech recognition using segmental unit input HMMs with a mixture of probability density functions and context dependency.

Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997

Speech recognition using hidden Markov models based on segmental statistics.

Syst. Comput. Jpn., 1997

1996

Evaluation of segmental unit input HMM.
