Wataru Nakata

According to our database, Wataru Nakata authored at least 18 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

DialogueSidon: Recovering Full-Duplex Dialogue Tracks from In-the-Wild Dialogue Audio.

[…], […], Kazuki Yamauchi, […], Hiroshi Saruwatari

CoRR, April, 2026

Geneses: Unified Generative Speech Enhancement and Separation.

[…], […], […], Hiroshi Saruwatari

CoRR, January, 2026

DistilMOS: Layer-Wise Self-Distillation For Self-Supervised Learning Model-Based MOS Prediction.

[…], […], […], Hiroshi Saruwatari

CoRR, January, 2026

Speaker-conditioned phrase break prediction for text-to-speech with phoneme-level pre-trained language model.

[…], […], […], Tomoki Koriyama, […], […], Hiroshi Saruwatari

Speech Commun., 2026

2025

Sidon: Fast and Robust Open-Source Multilingual Speech Restoration for Large-scale Dataset Cleansing.

[…], […], […], Hiroshi Saruwatari

CoRR, September, 2025

ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability.

[…], […], […], Robin Scheibler, Haruko Ishikawa, Adriana Guevara-Rukoz, […], Michiel Bacchiani

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2025

Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features.

[…], […], […], Hiroshi Saruwatari

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Multi-Sampling-Frequency Naturalness MOS Prediction Using Self-Supervised Learning Model with Sampling-Frequency-Independent Layer.

[…], […], […], […], Hiroshi Saruwatari, Tomohiko Nakamura

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling.

[…], […], […], […], Shinnosuke Takamichi, Hiroshi Saruwatari

CoRR, 2024

UTDUSS: UTokyo-SaruLab System for Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge.

[…], Kazuki Yamauchi, […], […], […]

CoRR, 2024

Building speech corpus with diverse voice characteristics for its prompt-based representation.

[…], Shinnosuke Takamichi, […], […], […], Hiroshi Saruwatari

CoRR, 2024

The T05 System for the VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech.

[…], […], […], Hiroshi Saruwatari

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

NecoBERT: Self-Supervised Learning Model Trained by Masked Language Modeling on Rich Acoustic Features Derived from Neural Audio Codec.

[…], […], […], Shinnosuke Takamichi, Hiroshi Saruwatari

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

COCO-NUT: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control.

[…], Shinnosuke Takamichi, […], […], […], Hiroshi Saruwatari

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis.

Shinnosuke Takamichi, […], […], Hiroshi Saruwatari

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022.

[…], […], […], Tomoki Koriyama, Shinnosuke Takamichi, Hiroshi Saruwatari

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.

[…], Tomoki Koriyama, Shinnosuke Takamichi, […], […], […], Hiroshi Saruwatari

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Audiobook Speech Synthesis Conditioned by Cross-Sentence Context-Aware Word Embeddings.

[…], Tomoki Koriyama, Shinnosuke Takamichi, […], […], […], Hiroshi Saruwatari

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021
