Wataru Nakata

According to our database, Wataru Nakata authored at least 14 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Sidon: Fast and Robust Open-Source Multilingual Speech Restoration for Large-scale Dataset Cleansing.

,

,

,

Hiroshi Saruwatari

CoRR, September, 2025

Multi-Sampling-Frequency Naturalness MOS Prediction Using Self-Supervised Learning Model with Sampling-Frequency-Independent Layer.

,

,

,

,

Hiroshi Saruwatari

,

Tomohiko Nakamura

CoRR, July, 2025

ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability.

,

,

,

Robin Scheibler

,

Haruko Ishikawa

,

Adriana Guevara-Rukoz

,

,

Michiel Bacchiani

CoRR, May, 2025

Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features.

,

,

,

Hiroshi Saruwatari

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024

J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling.

,

,

,

,

Shinnosuke Takamichi

,

Hiroshi Saruwatari

CoRR, 2024

UTDUSS: UTokyo-SaruLab System for Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge.

,

Kazuki Yamauchi

,

,

,

CoRR, 2024

Building speech corpus with diverse voice characteristics for its prompt-based representation.

,

Shinnosuke Takamichi

,

,

,

,

Hiroshi Saruwatari

CoRR, 2024

The T05 System for the VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech.

,

,

,

Hiroshi Saruwatari

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

NecoBERT: Self-Supervised Learning Model Trained by Masked Language Modeling on Rich Acoustic Features Derived from Neural Audio Codec.

,

,

,

Shinnosuke Takamichi

,

Hiroshi Saruwatari

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

COCO-NUT: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control.

,

Shinnosuke Takamichi

,

,

,

,

Hiroshi Saruwatari

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis.

Shinnosuke Takamichi

,

,

,

Hiroshi Saruwatari

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022.

,

,

,

Tomoki Koriyama

,

Shinnosuke Takamichi

,

Hiroshi Saruwatari

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.

,

Tomoki Koriyama

,

Shinnosuke Takamichi

,

,

,

,

Hiroshi Saruwatari

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Audiobook Speech Synthesis Conditioned by Cross-Sentence Context-Aware Word Embeddings.

,

Tomoki Koriyama

,

Shinnosuke Takamichi

,

,

,

,

Hiroshi Saruwatari

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021
