Yusuke Ijima
  According to our database, Yusuke Ijima authored at least 59 papers between 2008 and 2025.
  
  
Bibliography
  2025
  2024
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis.
    
  
    IEICE Trans. Inf. Syst., January, 2024
    
  
Unveiling the Linguistic Capabilities of a Self-Supervised Speech Model Through Cross-Lingual Benchmark and Layer-Wise Similarity Analysis.
    
  
    IEEE Access, 2024
    
  
Pre-training Neural Transducer-based Streaming Voice Conversion for Faster Convergence and Alignment-free Training.
    
  
    Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
    
  
Knowledge Distillation from Self-Supervised Representation Learning Model with Discrete Speech Units for Any-to-Any Streaming Voice Conversion.
    
  
    Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
    
  
    Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
    
  
STYLECAP: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-Supervised Learning Models.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2024
    
  
Noise-Robust Zero-Shot Text-to-Speech Synthesis Conditioned on Self-Supervised Speech-Representation Model with Adapters.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2024
    
  
What Do Self-Supervised Speech and Speaker Models Learn? New Findings from a Cross Model Layer-Wise Analysis.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2024
    
  
  2023
    Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
    
  
A stimulus-organism-response model of willingness to buy from advertising speech using voice quality.
    
  
    Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
    
  
    Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
    
  
    Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
    
  
Enhancement of Text-Predicting Style Token With Generative Adversarial Network for Expressive Speech Synthesis.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2023
    
  
Zero-Shot Text-to-Speech Synthesis Conditioned Using Self-Supervised Speech Representation Model.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2023
    
  
  2022
    Proceedings of the IEEE Spoken Language Technology Workshop, 2022
    
  
Automated Recognition of Off Phenomenon in Parkinson's Disease During Walking: Measurement in Daily Life with Wearable Device.
    
  
    Proceedings of the 4th IEEE Global Conference on Life Sciences and Technologies, 2022
    
  
Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.
    
  
    Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
    
  
    Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2022
    
  
  2021
Model architectures to extrapolate emotional expressions in DNN-based text-to-speech.
    
  
    Speech Commun., 2021
    
  
Audiobook Speech Synthesis Conditioned by Cross-Sentence Context-Aware Word Embeddings.
    
  
    Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021
    
  
Impact of Emotional State on Estimation of Willingness to Buy from Advertising Speech.
    
  
    Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
    
  
Phonetic and Prosodic Information Estimation from Texts for Genuine Japanese End-to-End Text-to-Speech.
    
  
    Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
    
  
Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis.
    
  
    Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
    
  
Simpleflat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2021
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2021
    
  
Robust Speech-Age Estimation Using Local Maximum Mean Discrepancy Under Mismatched Recording Conditions.
    
  
    Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
    
  
  2020
    Proceedings of The 12th Language Resources and Evaluation Conference, 2020
    
  
Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis.
    
  
    Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
    
  
    Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
    
  
  2019
    Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019
    
  
Multi-Speaker Modeling for DNN-based Speech Synthesis Incorporating Generative Adversarial Networks.
    
  
    Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019
    
  
End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders.
    
  
    Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
    
  
Can We Simulate Generative Process of Acoustic Modeling Data? Towards Data Restoration for Acoustic Modeling.
    
  
    Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
    
  
  2018
Non-Parallel Voice Conversion Using Variational Autoencoders Conditioned by Phonetic Posteriorgrams and D-Vectors.
    
  
    Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
    
  
Neural Confnet Classification: Fully Neural Network Based Spoken Utterance Classification Using Word Confusion Networks.
    
  
    Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
    
  
Soft-Target Training with Ambiguous Emotional Utterances for DNN-Based Speech Emotion Classification.
    
  
    Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
    
  
  2017
    Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
    
  
DNN-SPACE: DNN-HMM-Based Generative Model of Voice F0 Contours for Statistical Phrase/Accent Command Estimation.
    
  
    Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
    
  
Generative adversarial network-based postfilter for statistical parametric speech synthesis.
    
  
    Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
    
  
    Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
    
  
  2016
Objective Evaluation Using Association Between Dimensions Within Spectral Features for Statistical Parametric Speech Synthesis.
    
  
    Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
    
  
    Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
    
  
  2015
Statistical model training technique based on speaker clustering approach for HMM-based speech synthesis.
    
  
    Speech Commun., 2015
    
  
Similar Speaker Selection Technique Based on Distance Metric Learning Using Highly Correlated Acoustic Features with Perceptual Voice Quality Similarity.
    
  
    IEICE Trans. Inf. Syst., 2015
    
  
Sub-band text-to-speech combining sample-based spectrum with statistically generated spectrum.
    
  
    Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
    
  
  2014
Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis.
    
  
    Speech Commun., 2014
    
  
  2013
    Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2013
    
  
  2012
Similar Speaker Selection Technique Based on Distance Metric Learning with Perceptual Voice Quality Similarity.
    
  
    Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
    
  
  2011
    Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
    
  
Correlation Analysis of Acoustic Features with Perceptual Voice Quality Similarity for Similar Speaker Selection.
    
  
    Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
    
  
  2010
A Rapid Model Adaptation Technique for Emotional Speech Recognition with Style Estimation Based on Multiple-Regression HMM.
    
  
    IEICE Trans. Inf. Syst., 2010
    
  
  2009
Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM.
    
  
    Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
    
  
Emotional speech recognition based on style estimation and adaptation with multiple-regression HMM.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2009
    
  
  2008
An on-line adaptation technique for emotional speech recognition using style estimation with multiple-regression HMM.
    
  
    Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008