Norihide Kitaoka

Proceedings of the 2025 28th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), 2025

A Corpus-Based Investigation of Acoustic Features Influencing Intelligibility of Super-Elderly Japanese Speech.

Meiko Fukuda

Proceedings of the 2025 28th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), 2025

Fine-tuning Parakeet-TDT for Dysarthric Speech Recognition in the Speech Accessibility Project Challenge.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Backchannel prediction for natural spoken dialog systems using general speaker and listener information.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Improving Automatic Speech Recognition Model for Super-Elderly Voice Using Speech Synthesis Model.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Toward Natural System Repair: An Analysis of Human Other-Initiated Self-Repair Patterns in Japanese Casual Conversations.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Improving Listening Head Generation Performance Using Speech Representations from Self-Supervised Learning.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Real-time VAD-less Speech Recognition by Fine-tuning SSL Model with Data Containing Tagged Non-speech Segments.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

Recognition of target domain Japanese speech using language model replacement.

EURASIP J. Audio Speech Music. Process., December, 2024

Analysis of the relationship between user response to dialog breakdown and personality traits.

Adv. Robotics, February, 2024

Text-only Domain Adaptation for CTC-based Speech Recognition through Substitution of Implicit Linguistic Information in the Search Space.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Boosting CTC-based ASR using inter-layer attention-based CTC loss.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Evaluation of Speech Translation Subtitles Generated by ASR with Unnecessary Word Detection.

Proceedings of the 13th IEEE Global Conference on Consumer Electronics, 2024

Domain Adaptation by Alternating Learning of Acoustic and Linguistic Information for Japanese Deaf and Hard-of-Hearing People.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

A new speech corpus of super-elderly Japanese for acoustic modeling.

Comput. Speech Lang., 2023

Relationships Between Gender, Personality Traits and Features of Multi-Modal Data to Responses to Spoken Dialog Systems Breakdown.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Development of a model for predicting timing of back-channel in a real-time spoken dialog system.

Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023

Streaming End-to-End ASR Using CTC Decoder and DRA for Linguistic Information Substitution.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Construction of Automatic Speech Recognition Model that Recognizes Linguistic Information and Verbal/Non-verbal Phenomena.

Nagito Shione

Yukoh Wakabayashi

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Language modeling for spontaneous speech recognition based on disfluency labeling and generation of disfluent text.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Combining multiple end-to-end speech recognition models based on density ratio approach.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

A Corpus-Based Analysis Of Age-Related Changes In The Acoustic Features Of Elderly To Super Elderly Speech.

Proceedings of the 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2022

Elderly Conversational Speech Corpus with Cognitive Impairment Test and Pilot Dementia Detection Experiment Using Acoustic Characteristics of Speech in Japanese Dialects.

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

End-to-End Spontaneous Speech Recognition Using Disfluency Labeling.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

End-to-End Speech to Braille Translation in Japanese.

Proceedings of the IEEE International Conference on Consumer Electronics, 2022

Dialog Breakdown Detection using Multimodal Features for Non-task-oriented Dialog Systems.

Proceedings of the 11th IEEE Global Conference on Consumer Electronics, 2022

2021

Normalization of Transliterated Mongolian Words Using Seq2Seq Model with Limited Data.

ACM Trans. Asian Low Resour. Lang. Inf. Process., 2021

Response type selection for chat-like spoken dialog systems based on LSTM and multi-task learning.

Kengo Ohta

Speech Commun., 2021

Dynamic out-of-vocabulary word registration to language model for speech recognition.

Bohan Chen

Yuya Obashi

EURASIP J. Audio Speech Music. Process., 2021

Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation.

EURASIP J. Audio Speech Music. Process., 2021

Corpus Design and Automatic Speech Recognition for Deaf and Hard-of-Hearing People.

Proceedings of the 10th IEEE Global Conference on Consumer Electronics, 2021

Advanced language model fusion method for encoder-decoder model in Japanese speech recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

End-to-End Spontaneous Speech Recognition Using Hesitation Labeling.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Multi-speaker TTS system for low-resource language using cross-lingual transfer learning and data augmentation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Example Phrase Adaptation Method for Customized, Example-Based Dialog System Using User Data and Distributed Word Representations.

Eichi Seto

IEICE Trans. Inf. Syst., 2020

Improving Speech Recognition for the Elderly: A New Corpus of Elderly Japanese Speech and Investigation of Acoustic Modeling for Speech Recognition.

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Development of a Low-Latency and Real-Time Automatic Speech Recognition System.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

2019

A New Corpus of Elderly Japanese Speech for Acoustic Modeling, and a Preliminary Investigation of Dialect-Dependent Speech Recognition.

Proceedings of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2019

Small-Footprint Magic Word Detection Method Using Convolutional LSTM Neural Network.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Environmental Sounds Recognition with Convolutional-LSTM.

Akihisa Komatsu

Proceedings of the IEEE 8th Global Conference on Consumer Electronics, 2019

Type of Response Selection utilizing User Utterance Word Sequence, LSTM and Multi-task Learning for Chat-like Spoken Dialog Systems.

Kengo Ohta

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Daily Activity Recognition with Large-Scaled Real-Life Recording Datasets Based on Deep Neural Network Using Multi-Modal Signals.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2018

Mapping Acoustic Vector Space and Document Vector Space by RNN-LSTM.

Miho Higaki

Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018

Multi-modal Geometry Tutoring System Using Speech and Touchscreen Figure Tracing.

Kanta Kiyohara

Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018

Construction of a Corpus for Elderly Japanese Speech Recognition.

Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018

Customization of an example-based dialog system with user data and distributed word representations.

Eichi Seto

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

A human machine interface framework for autonomous vehicle control.

Proceedings of the IEEE 6th Global Conference on Consumer Electronics, 2017

Selecting type of response for chat-like spoken dialogue systems based on acoustic features of user utterances.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Investigation of DNN-Based Audio-Visual Speech Recognition.

IEICE Trans. Inf. Syst., 2016

Foreword.

IEICE Trans. Inf. Syst., 2016

Impact of acoustic similarity on efficiency of verbal information transmission via subtle prosodic cues.

Bohan Chen

EURASIP J. Audio Speech Music. Process., 2016

Speech Corpus Spoken by Young-old, Old-old and Oldest-old Japanese.

Shuhei Segawa

Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

2015

Modeling of Physical Characteristics of Speech under Stress.

IEEE Signal Process. Lett., 2015

Development of new speech corpus for elderly Japanese speech recognition.

Shuhei Segawa

Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015

Integration of deep bottleneck features for audio-visual speech recognition.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Daily activity recognition based on DNN using environmental sound and acceleration signals.

Proceedings of the 23rd European Signal Processing Conference, 2015

Audio-visual speech recognition using deep bottleneck features and high-performance lipreading.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Daily activity recognition based on acoustic signals and acceleration signals estimated with Gaussian process.

Masafumi Nishida

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Relationship between speaker/listener similarity and information transmission quality in speech communication.

Bohan Chen

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014

A Graph-Based Spoken Dialog Strategy Utilizing Multiple Understanding Hypotheses.

Inf. Media Technol., 2014

Effective Frame Selection for Blind Source Separation Based on Frequency Domain Independent Component Analysis.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2014

Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech.

EURASIP J. Audio Speech Music. Process., 2014

Effect of acoustic and linguistic contexts on human and machine speech recognition.

Daisuke Enami

Comput. Speech Lang., 2014

STD Method Based on Hash Function for NTCIR11 SpokenQuery&Doc Task.

Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, 2014

Measuring aggressive driving behavior using signals from drive recorders.

Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems, 2014

Development and preliminary analysis of sensor signal database of continuous daily living activity over the long term.

Masafumi Nishida

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Noisy speech recognition using blind spatial subtraction array technique and deep bottleneck features.

Tomoki Hayashi

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Content-Based Driving Scene Retrieval Using Driving Behavior and Environmental Driving Signals.

Proceedings of the Smart Mobile In-Vehicle Systems, Next Generation Advancements, 2014

2013

Classification of speech under stress based on modeling of the vocal folds and vocal tract.

EURASIP J. Audio Speech Music. Process., 2013

Spoken Content Retrieval Using Distance Combination and Spoken Term Detection Using Hash Function for NTCIR10 SpokenDoc2 Task.

Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

Integrated modeling of driver gaze and vehicle operation behavior to estimate risk level during lane changes.

Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems, 2013

Classification of speech under stress by modeling the aerodynamics of the laryngeal ventricle.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Estimation of vocal tract parameters for the classification of speech under stress.

Proceedings of the IEEE International Conference on Acoustics, 2013

Analysis and modeling of entrainment in chorus singing.

Proceedings of the IEEE International Conference on Acoustics, 2013

Modeling subjective evaluation of music similarity using tolerance.

Proceedings of the 21st European Signal Processing Conference, 2013

Spoken document retrieval using both word-based and syllable-based document spaces with latent semantic indexing.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012

Selective Gammatone Envelope Feature for Robust Sound Event Recognition.

IEICE Trans. Inf. Syst., 2012

Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition.

IEICE Trans. Inf. Syst., 2012

Causal analysis of task completion errors in spoken music retrieval interactions.

Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Measuring driver awareness based on correlation between gaze behavior and risks of surrounding vehicles.

Masataka Mori

Chiyomi Miyajima

Pongtep Angkititrakul

Proceedings of the 15th International IEEE Conference on Intelligent Transportation Systems, 2012

Classification of Stressed Speech Using Physical Parameters Derived from Two-Mass Model.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Physical characteristics of vocal folds during speech under stress.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Fast source separation based on selection of effective temporal frames.

Proceedings of the 20th European Signal Processing Conference, 2012

Acoustic model training using feature vectors generated by manipulating speech parameters of real speakers.

Tetsuto Kawai

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Subjective similarity of music: Data collection for individuality analysis.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Analysis of Real-World Driver's Frustration.

IEEE Trans. Intell. Transp. Syst., 2011

Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm.

IEICE Trans. Inf. Syst., 2011

Spoken document retrieval method combining query expansion with continuous syllable recognition for NTCIR-SpokenDoc.

Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

An Analysis of the Speech Under Stress Using the Two-Mass Vocal Fold Model.

Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems, 2011

On-line detection of task incompletion for spoken dialog systems using utterance and behavior tag N-gram vectors.

Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems, 2011

Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Detection of Task-Incomplete Dialogs Based on Utterance-and-Behavior Tag N-Gram for Spoken Dialog Systems.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Music Recommendation System Based on Human-to-human Conversation Recognition.

Workshop Proceedings of the 7th International Conference on Intelligent Environments, 2011

Driver risk evaluation based on acceleration, deceleration, and steering behavior.

Proceedings of the IEEE International Conference on Acoustics, 2011

Robust seed model training for speaker adaptation using pseudo-speaker features generated by inverse CMLLR transformation.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Acoustic Feature Transformation Combining Average and Maximum Classification Error Minimization Criteria.

IEICE Trans. Inf. Syst., 2010

Acoustic Feature Transformation Based on Discriminant Analysis Preserving Local Structure for Speech Recognition.

IEICE Trans. Inf. Syst., 2010

Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training.

IEICE Trans. Inf. Syst., 2010

Estimation Method of User Satisfaction Using N-gram-based Dialog History Model for Spoken Dialog System.

Proceedings of the International Conference on Language Resources and Evaluation, 2010

A browsing and retrieval system for driving data.

Proceedings of the IEEE Intelligent Vehicles Symposium (IV), 2010

Selective gammatone filterbank feature for robust sound event recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act n-gram.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition.

Proceedings of the Auditory-Visual Speech Processing, 2010

2009

A multimedia corpus of driving behaviors.

Proceedings of the 2009 IEEE International Workshop on Multimedia Signal Processing, 2009

Prediction model of driving behavior based on traffic conditions and driver types.

Proceedings of the 12th International IEEE Conference on Intelligent Transportation Systems, 2009

Subjective experiments on influence of response timing in spoken dialogues.

Toshihiko Itoh

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Driver evaluation based on classification of rapid decelerating patterns.

Proceedings of the IEEE International Conference on Vehicular Electronics and Safety, 2009

Feature transformation based on discriminant analysis preserving local structure for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2009

Spoken dialog strategy based on understanding graph search.

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN.

IEICE Trans. Inf. Syst., 2008

Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition.

IEICE Trans. Inf. Syst., 2008

Noisy Speech Recognition Based on Integration/Selection of Multiple Noise Suppression Methods Using Noise GMMs.

Souta Hamaguchi

IEICE Trans. Inf. Syst., 2008

In-car Speech Data Collection along with Various Multimodal Signals.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments: Newest Part of the CENSREC Series.

Proceedings of the International Conference on Language Resources and Evaluation, 2008

Blind dereverberation based on CMN and spectral subtraction by multi-channel LMS algorithm.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Building and combining document and music spaces for music query-by-webpage system.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Analysis of relationship between impression of human-to-human conversations and prosodic change and its modeling.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Class lecture summarization taking into account consecutiveness of important sentences.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Generating lane-change trajectories of individual drivers.

Proceedings of the IEEE International Conference on Vehicular Electronics and Safety, 2008

An integrative recognition method for speech and gestures.

Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

CENSREC-AV: evaluation frameworks for audio-visual speech recognition.

Proceedings of the International Conference on Auditory-Visual Speech Processing 2008, 2008

2007

Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM.

Speech Commun., 2007

A Spoken Dialog System for Chat-Like Conversations Considering Response Timing.

Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Analysis of effect of compensation parameter estimation for CMN on speech/speaker recognition.

Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007

Power linear discriminant analysis.

Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007

One-pass LVCSR algorithm using linear lexicon search and 1-best approximation tree-structured lexicon search.

Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007

Selection of optimal dimensionality reduction method using chernoff bound for segmental unit input HMM.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Prosody change and response timing analysis in spontaneously spoken dialogs and their modeling in a spoken dialog system.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Automatic extraction of cue phrases for important sentences in lecture speech and automatic lecture speech summarization.

Yasuhisa Fujii

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Statistical segmentation and recognition of fingertip trajectories for a gesture interface.

Proceedings of the 9th International Conference on Multimodal Interfaces, 2007

Robust Distant Speech Recognition by Combining Position-Dependent CMN with Conventional CMN.

Proceedings of the IEEE International Conference on Acoustics, 2007

Generalization of Linear Discriminant Analysis used in Segmental Unit Input HMM for Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2007

Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems.

Inf. Media Technol., 2006

Robust Distant Speech Recognition by Combining Multiple Microphone-Array Processing with Position-Dependent CMN.

EURASIP J. Adv. Signal Process., 2006

A spoken Dialog System with Automatic Recovery Mechanism from misrecognition.

Hirotoshi Yano

Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

Noisy speech recognition based on selection of multiple noise suppression methods using noise GMMs.

Souta Hamaguchi

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

Large-vocabulary continuous speech recognition using linear lexicon search and 1-best approximation tree-structured lexicon search.

Nobutoshi Takahashi

Syst. Comput. Jpn., 2005

Detection and recognition of correction utterances on misrecognition of spoken dialog system.

Naoko Kakutani

Syst. Comput. Jpn., 2005

AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition.

IEICE Trans. Inf. Syst., 2005

Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing technique.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Robust distant speaker recognition based on position dependent cepstral mean normalization.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Multimodal interface for organization name input based on combination of isolated word recognition and continuous base-word recognition.

Hironori Oshikawa

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

CENSREC-3: Data Collection for In-Car Speech Recognition and Its Common Evaluation Framework.

Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

2004

Confidence measure and rejection based on correctness probability of recognition candidates.

Ichiro Akahori

Syst. Comput. Jpn., 2004

Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary.

Hironori Oshikawa

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Robust distant speech recognition based on position dependent CMN.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2003

Integration of noise reduction algorithms for Aurora2 task.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Generation of natural response timing using decision tree based on prosodic and linguistic information.

Masashi Takeuchi

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Comparison of effects of acoustic and language knowledge on spontaneous speech perception/recognition between human and automatic speech recognizer.

Masahisa Shingu

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Detection and recognition of correction utterance in spontaneously spoken dialog.

Naoko Kakutani

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Speaker independent speech recognition using features based on glottal sound source.

Daisuke Yamada

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Evaluation of spectral subtraction with smoothing of time direction on the Aurora 2 task.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Detection and recognition of repaired speech on misrecognized utterances for speech input of car navigation system.

Naoko Kakutani

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

1996

Concept-based phrase spotting approach for spontaneous speech understanding.

Tatsuya Kawahara