Detai Xin

Orcid: 0009-0007-1908-1137

According to our database1, Detai Xin authored at least 16 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number2 of five.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2024
JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions.
Speech Commun., January, 2024

Building speech corpus with diverse voice characteristics for its prompt-based representation.
CoRR, 2024

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models.
CoRR, 2024

JVNV: A Corpus of Japanese Emotional Speech With Verbal Content and Nonverbal Expressions.
IEEE Access, 2024

2023
How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics.
CoRR, 2023

Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus.
CoRR, 2023

Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Speech Prosody of Audiobook Text-To-Speech Synthesis with Acoustic and Textual Contexts.
Proceedings of the IEEE International Conference on Acoustics, 2023

MID-Attribute Speaker Generation Using Optimal-Transport-Based Interpolation of Gaussian Mixture Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

COCO-NUT: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations.
CoRR, 2022

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation.
CoRR, 2022

UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022.
Proceedings of the Interspeech 2022, 2022

2021
Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Disentangled Speaker and Language Representations Using Mutual Information Minimization and Domain Adaptation for Cross-Lingual TTS.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space.
Proceedings of the Interspeech 2020, 2020


  Loading...