Yusuke Ijima

According to our database, Yusuke Ijima authored at least 48 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis.
IEICE Trans. Inf. Syst., January, 2024

What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis.
CoRR, 2024

Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters.
CoRR, 2024

2023
StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models.
CoRR, 2023

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
CoRR, 2023

Enhancement of Text-Predicting Style Token With Generative Adversarial Network for Expressive Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2023

Zero-Shot Text-to-Speech Synthesis Conditioned Using Self-Supervised Speech Representation Model.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
SIMD-Size Aware Weight Regularization for Fast Neural Vocoding on CPU.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Automated Recognition of Off Phenomenon in Parkinson's Disease During Walking: Measurement in Daily Life with Wearable Device.
Proceedings of the 4th IEEE Global Conference on Life Sciences and Technologies, 2022

Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.
Proceedings of the Interspeech 2022, 2022

Joint Modeling of Multi-Sample and Subband Signals for Fast Neural Vocoding on CPU.
Proceedings of the Interspeech 2022, 2022

Multi-Sample Subband WaveRNN Via Multivariate Gaussian.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Model architectures to extrapolate emotional expressions in DNN-based text-to-speech.
Speech Commun., 2021

Impact of Emotional State on Estimation of Willingness to Buy from Advertising Speech.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Phonetic and Prosodic Information Estimation from Texts for Genuine Japanese End-to-End Text-to-Speech.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

SimpleFlat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Speech Emotion Recognition Based on Listener Adaptive Models.
Proceedings of the IEEE International Conference on Acoustics, 2021

Robust Speech-Age Estimation Using Local Maximum Mean Discrepancy Under Mismatched Recording Conditions.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis.
Proceedings of the Interspeech 2020, 2020

Lightweight LPCNet-Based Neural Vocoder with Tensor Decomposition.
Proceedings of the Interspeech 2020, 2020

2019
V2S attack: building DNN-based voice conversion from automatic speaker verification.
CoRR, 2019

End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders.
Proceedings of the Interspeech 2019, 2019

Can We Simulate Generative Process of Acoustic Modeling Data? Towards Data Restoration for Acoustic Modeling.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
DNN-Based Speech Synthesis Using Speaker Codes.
IEICE Trans. Inf. Syst., 2018

Non-Parallel Voice Conversion Using Variational Autoencoders Conditioned by Phonetic Posteriorgrams and D-Vectors.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Neural Confnet Classification: Fully Neural Network Based Spoken Utterance Classification Using Word Confusion Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Soft-Target Training with Ambiguous Emotional Utterances for DNN-Based Speech Emotion Classification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Prosody Aware Word-Level Encoder Based on BLSTM-RNNs for DNN-Based Speech Synthesis.
Proceedings of the Interspeech 2017, 2017

DNN-SPACE: DNN-HMM-Based Generative Model of Voice F0 Contours for Statistical Phrase/Accent Command Estimation.
Proceedings of the Interspeech 2017, 2017

Generative adversarial network-based postfilter for statistical parametric speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

An investigation to transplant emotional expressions in DNN-based TTS synthesis.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Objective Evaluation Using Association Between Dimensions Within Spectral Features for Statistical Parametric Speech Synthesis.
Proceedings of the Interspeech 2016, 2016

An Investigation of DNN-Based Speech Synthesis Using Speaker Codes.
Proceedings of the Interspeech 2016, 2016

2015
Statistical model training technique based on speaker clustering approach for HMM-based speech synthesis.
Speech Commun., 2015

Similar Speaker Selection Technique Based on Distance Metric Learning Using Highly Correlated Acoustic Features with Perceptual Voice Quality Similarity.
IEICE Trans. Inf. Syst., 2015

Sub-band text-to-speech combining sample-based spectrum with statistically generated spectrum.
Proceedings of the INTERSPEECH 2015, 2015

2014
Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis.
Speech Commun., 2014

2013
Statistical model training technique for speech synthesis based on speaker class.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

HMM-based expressive speech synthesis based on phrase-level F0 context labeling.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Similar Speaker Selection Technique Based on Distance Metric Learning with Perceptual Voice Quality Similarity.
Proceedings of the INTERSPEECH 2012, 2012

2011
HMM-Based Emphatic Speech Synthesis Using Unsupervised Context Labeling.
Proceedings of the INTERSPEECH 2011, 2011

Correlation Analysis of Acoustic Features with Perceptual Voice Quality Similarity for Similar Speaker Selection.
Proceedings of the INTERSPEECH 2011, 2011

2010
A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM.
IEICE Trans. Inf. Syst., 2010

2009
Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM.
Proceedings of the INTERSPEECH 2009, 2009

Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
An on-line adaptation technique for emotional speech recognition using style estimation with multiple-regression HMM.
Proceedings of the INTERSPEECH 2008, 2008

