Takashi Nose

Comput. Speech Lang., 2026

Controllable End-to-End Neural Source-Filter Vocoder With Low-Dimensional Speech Parameters and Data Augmentation.

Sumiharu Kobayashi

Tetsuo Kosaka

IEEE Access, 2026

2025

Adaptive Fine-Grained Pruning via Binary Search for Efficient Environmental Sound Classification.

Changlong Wang

IEEE Access, 2025

Adaptive Depth-Wise Pruning for Efficient Environmental Sound Classification.

Changlong Wang

IEEE Access, 2025

Improving User Impression of Spoken Dialogue Systems by Controlling Para-linguistic Expression Based on Intimacy.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Improving Speech-to-Speech Translation for Low-Resource Languages via Transfer Learning.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Fast and Speaker-Independent Utterance Selection for ASR-Free CALL Systems of Minority Languages.

Takaki Koshikawa

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

CycleSiFiNF-VC: Controllable Non-Parallel Voice Conversion by Neural Formant Manipulation with Improved Cycle-Consistency Loss.

Sumiharu Kobayashi

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

PUNSER: Large-Scale Pre-Trained and Unified Model for Practical Speech Emotion Recognition.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

Simultaneous Adaptation of Acoustic and Language Models for Emotional Speech Recognition Using Tweet Data.

IEICE Trans. Inf. Syst., 2024

Preserving Speaker Information in Direct Speech-to-Speech Translation with Non-Autoregressive Generation and Pretraining.

CoRR, 2024

Multilingual Meta-Transfer Learning for Low-Resource Speech Recognition.

IEEE Access, 2024

A Replaceable Curiosity-Driven Candidate Agent Exploration Approach for Task-Oriented Dialog Policy Learning.

Xuecheng Niu

IEEE Access, 2024

Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning.

Xuecheng Niu

IEEE Access, 2024

Estimation of Offensiveness of Posts on Social Media and Its Application to a Conversation Assistance System.

Tomoki Fujihara

Proceedings of the 2024 8th International Conference on Natural Language Processing and Information Retrieval, 2024

Character Expressions in Meta-Learning for Extremely Low Resource Language Speech Recognition.

Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 2024

Evaluation of Environmental Sound Classification using Vision Transformer.

Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 2024

Toward Photo-Realistic Facial Animation Generation Based on Keypoint Features.

Zikai Shu

Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 2024

End-to-End Neural Formant Synthesis Using Low-Dimensional Acoustic Parameters.

Sumiharu Kobayashi

Tetsuo Kosaka

Proceedings of the 13th IEEE Global Conference on Consumer Electronics, 2024

Improving Speaker Consistency in Speech-to-Speech Translation Using Speaker Retention Unit-to-Mel Techniques.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Multimodal Expressive Embodied Conversational Agent Design.

Simon Jolibois

Proceedings of the HCI International 2023 Posters, 2023

2021

Multimodal Dialogue Response Timing Estimation Using Dialogue Context Encoder.

Proceedings of the Conversational AI for Natural Human-Centric Interaction, 2021

Neural Spoken-Response Generation Using Prosodic and Linguistic Context for Conversational Systems.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improvement of Automatic English Pronunciation Assessment with Small Number of Utterances Using Sentence Speakability.

Satsuki Naijo

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Analysis of Feature Extraction by Convolutional Neural Network for Speech Emotion Recognition.

Daisuke Horii

Proceedings of the 10th IEEE Global Conference on Consumer Electronics, 2021

2020

Automatic assessment of English proficiency for Japanese learners without reference sentences based on deep neural network acoustic models.

Speech Commun., 2020

A Symbol-level Melody Completion Based on a Convolutional Neural Network with Generative Adversarial Learning.

J. Inf. Process., 2020

Construction and Analysis of a Multimodal Chat-talk Corpus for Dialog Systems Considering Interpersonal Closeness.

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Multi-Stream Attention-Based BLSTM with Feature Segmentation for Speech Emotion Recognition.

Yuya Chiba

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Filler Prediction Based on Bidirectional LSTM for Generation of Natural Response of Spoken Dialog.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Incremental Response Generation Using Prefix-to-Prefix Model for Dialogue System.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Successive Japanese Lyrics Generation Based on Encoder-Decoder Model.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Analysis and Estimation of Sentence Speakability for English Pronunciation Evaluation.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Spoken Term Detection Based on Acoustic Models Trained in Multiple Languages for Zero-Resource Language.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

CycleGAN-Based High-Quality Non-Parallel Voice Conversion with Spectrogram and WaveRNN.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Improving Pronunciation Clarity of Dysarthric Speech Using CycleGAN with Multiple Speakers.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

A Study on Minimum Spectral Error Analysis of Speech.

Takuma Hayasaka

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

LJSing: Large-Scale Singing Voice Corpus of Single Japanese Singer.

Takuto Fujimura

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Integration of Accent Sandhi and Prosodic Features Estimation for Japanese Text-to-Speech Synthesis.

Daisuke Fujimaki

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

2019

Improving human scoring of prosody using parametric speech synthesis.

Speech Commun., 2019

Developing a Multi-Platform Speech Recording System Toward Open Service of Building Large-Scale Speech Corpora.

Keita Ishizuka

CoRR, 2019

2018

Improving User Impression in Spoken Dialog System with Gradual Speech Form Control.

Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

An Analysis of the Effect of Emotional Speech Synthesis on Non-Task-Oriented Dialogue System.

Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

Analyzing Effect of Physical Expression on English Proficiency for Multimodal Computer-Assisted Language Learning.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Study on a Spoken Dialogue System with Cooperative Emotional Speech Synthesis Using Acoustic and Linguistic Information.

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Melody Completion Based on Convolutional Neural Networks and Generative Adversarial Learning.

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Two-Stage Sequence-to-Sequence Neural Voice Conversion with Low-to-High Definition Spectrogram Mapping.

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Comparison of Speech Recognition Performance Between Kaldi and Google Cloud Speech API.

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

DNN-Based Talking Movie Generation with Face Direction Consideration.

Toru Ishikawa

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Segmental Pitch Control Using Speech Input Based on Differential Contexts and Features for Customizable Neural Speech Synthesis.

Shinya Hanabusa

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Data Collection and Analysis for Automatically Generating Record of Human Behaviors by Environmental Sound Recognition.

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Evaluation of English Speech Recognition for Japanese Learners Using DNN-Based Acoustic Models.

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Improvement of Accent Sandhi Rules Based on Japanese Accent Dictionaries.

Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Monitoring system for a single aged person on the basis of electricity use - Heatstroke-prevention system.

Proceedings of the IEEE International Conference on Consumer Electronics, 2018

Monitoring System for a Single Aged Person on the Basis of Electricity Use: Performance Improvement by Interpolating Watt Hour Granularity.

Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018

Effect of Mutual Self-Disclosure in Spoken Dialog System on User Impression.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Acoustic Model Adaptation for Emotional Speech Recognition Using Twitter-Based Emotional Speech Corpus.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Sentence Selection Based on Extended Entropy Using Phonetic and Prosodic Contexts for Statistical Parametric Speech Synthesis.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Dimensional paralinguistic information control based on multiple-regression HSMM for spontaneous dialogue speech synthesis with robust parameter estimation.

Tomohiro Nagata

Hiroki Mori

Speech Commun., 2017

Cluster-based approach to discriminate the user's state whether a user is embarrassed or thinking to an answer to a prompt.

Yuya Chiba

J. Multimodal User Interfaces, 2017

Development and Evaluation of Julius-Compatible Interface for Kaldi ASR.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Response Selection of Interview-Based Dialog System Using User Focus and Semantic Orientation.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

A Study on 2D Photo-Realistic Facial Animation Generation Using 3D Facial Feature Points and Deep Neural Networks.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Evaluation of Nonlinear Tempo Modification Methods Based on Sinusoidal Modeling.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Dialog-Based Interactive Movie Recommendation: Comparison of Dialog Strategies.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Voice Conversion from Arbitrary Speakers Based on Deep Neural Networks with Adversarial Learning.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Detection of Singing Mistakes from Singing Voice.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Collection of Example Sentences for Non-task-Oriented Dialog Using a Spoken Dialog System and Comparison with Hand-Crafted DB.

Proceedings of the HCI International 2017 - Posters' Extended Abstracts, 2017

Monitoring system for a single aged person on the basis of electricity use - Prototype by using smart meter.

Proceedings of the IEEE 6th Global Conference on Consumer Electronics, 2017

Analysis of efficient multimodal features for estimating user's willingness to talk: Comparison of human-machine and human-human dialog.

Yuya Chiba

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Efficient Implementation of Global Variance Compensation for Parametric Speech Synthesis.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

2015

HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling.

Comput. Speech Lang., 2015

Entropy-based sentence selection for speech synthesis using phonetic and prosodic contexts.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Conversion of Speaker's Face Image Using PCA and Animation Unit for Video Chatting.

Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

Tempo Modification of Mixed Music Signal by Nonlinear Time Scaling and Sinusoidal Modeling.

Tsukasa Nishino

Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

On Appropriateness and Estimation of the Emotion of Synthesized Response Speech in a Spoken Dialogue System.

Taketo Kase

Proceedings of the HCI International 2015 - Posters' Extended Abstracts, 2015

2014

Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis.

Speech Commun., 2014

A Parameter Generation Algorithm Using Local Variance for HMM-Based Speech Synthesis.

IEEE J. Sel. Top. Signal Process., 2014

Statistical Parametric Speech Synthesis Based on Gaussian Process Regression.

IEEE J. Sel. Top. Signal Process., 2014

User Modeling by Using Bag-of-Behaviors for Building a Dialog System Sensitive to the Interlocutor's Internal State.

Proceedings of the SIGDIAL 2014 Conference, 2014

Parametric speech synthesis using local and global sparse Gaussian processes.

Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2014

Analysis of spectral enhancement using global variance in HMM-based speech synthesis.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Transform mapping using shared decision tree context clustering for HMM-based cross-lingual speech synthesis.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Accent type and phrase boundary estimation using acoustic and language models for automatic prosodic labeling.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Analysis of English Pronunciation of Singing Voices Sung by Japanese Speakers.

Kazumichi Yoshida

Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Quantized F0 Context and Its Applications to Speech Synthesis, Speech Coding, and Voice Conversion.

Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Parametric speech synthesis based on Gaussian process regression using global variance and hyperparameter optimization.

Proceedings of the IEEE International Conference on Acoustics, 2014

Robot: Have I done something wrong? - Analysis of prosodic features of speech commands under the robot's unintended behavior.

Proceedings of the International Conference on Audio, 2014

A study on the effect of speech rate on perception of spoken easy Japanese using speech synthesis.

Proceedings of the International Conference on Audio, 2014

Subjective evaluation of packet loss recovery techniques for voice over IP.

Proceedings of the International Conference on Audio, 2014

Controlling Switching Pause Using an AR Agent for Interactive CALL System.

Proceedings of the HCI International 2014 - Posters' Extended Abstracts, 2014

Speech recognition in a home environment using parallel decoding with GMM-based noise modeling.

Kohei Machida

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

An intuitive style control technique in HMM-based expressive speech synthesis using subjective style intensity and multiple-regression global variance model.

Speech Commun., 2013

A style control technique for singing voice synthesis based on multiple-regression HSMM.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Robust estimation of multiple-regression HMM parameters for dimension-based expressive dialogue speech synthesis.

Tomohiro Nagata

Hiroki Mori

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Statistical nonparametric speech synthesis using sparse Gaussian processes.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

HMM-based expressive speech synthesis based on phrase-level F0 context labeling.

Proceedings of the IEEE International Conference on Acoustics, 2013

Frame-level acoustic modeling based on Gaussian process regression for statistical nonparametric speech synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2013

Speaker-independent style conversion for HMM-based expressive speech synthesis.

Hiroki Kanagawa

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Very low bit-rate F0 coding for phonetic vocoders using MSD-HMM with quantized F0 symbols.

Speech Commun., 2012

A tone-modeling technique using a quantized F0 context to improve tone correctness in average-voice-based speech synthesis.

Speech Commun., 2012

Discontinuous Observation HMM for Prosodic-Event-Based F0 Generation.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A speech parameter generation algorithm using local variance for HMM-based speech synthesis.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

An F0 modeling technique based on prosodic events for spontaneous speech synthesis.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency.

Speech Commun., 2011

Performance Prediction of Speech Recognition Using Average-Voice-Based Speech Synthesis.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

HMM-Based Emphatic Speech Synthesis Using Unsupervised Context Labeling.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

On the Use of Extended Context for HMM-Based Spontaneous Conversational Speech Synthesis.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Very low bit-rate F0 coding for phonetic vocoder using MSD-HMM with quantized F0 context.

Proceedings of the IEEE International Conference on Acoustics, 2011

Tonal context labeling using quantized F0 symbols for improving tone correctness in average-voice-based speech synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

HMM-Based Voice Conversion Using Quantized F0 Context.

Yuhei Ota

IEICE Trans. Inf. Syst., 2010

A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM.

IEICE Trans. Inf. Syst., 2010

A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM.

IEICE Trans. Inf. Syst., 2010

HMM-based robust voice conversion using adaptive F0 quantization.

Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Evaluation of prosodic contextual factors for HMM-based speech synthesis.

Shuji Yokomizo

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Speaker-independent HMM-based voice conversion using quantized fundamental frequency.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Conversational spontaneous speech synthesis using average voice model.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

HMM-based speech synthesis with unsupervised labeling of accentual context based on F0 quantization and average voice model.

Koujirou Ooki

Proceedings of the IEEE International Conference on Acoustics, 2010

Grounding New Words on the Physical World in Multi-Domain Human-Robot Dialogues.

Proceedings of the Dialog with Robots, 2010

2009

Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis.

IEEE Trans. Speech Audio Process., 2009

HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation.

Makoto Tachibana

IEICE Trans. Inf. Syst., 2009

Learning lexicons from spoken utterances based on statistical model selection.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

HMM-based speaker characteristics emphasis using average voice model.

Junichi Adada

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM.

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

An estimation technique of style expressiveness for emotional speech using model adaptation based on multiple-regression HSMM.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

An on-line adaptation technique for emotional speech recognition using style estimation with multiple-regression HMM.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Performance evaluation of the speaker-independent HMM-based speech synthesis system "HTS 2007" for the Blizzard Challenge 2007.

Proceedings of the IEEE International Conference on Acoustics, 2008

Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

A Style Control Technique for HMM-Based Expressive Speech Synthesis.

IEICE Trans. Inf. Syst., 2007

The HMM-based speech synthesis system (HTS) version 2.0.

Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Style estimation of speech based on multiple regression hidden semi-Markov model.

Yoichi Kato

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

A Speaker Adaptation Technique for MRHSMM-Based Style Control of Synthetic Speech.

Yoichi Kato

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

A technique for controlling voice quality of synthetic speech using multiple regression HSMM.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A style control technique for speech synthesis using multiple regression HSMM.

Junichi Yamagishi