Nobukatsu Hojo

Orcid: 0009-0005-4017-6304

According to our database, Nobukatsu Hojo authored at least 48 papers between 2013 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of five.

Bibliography

2026

Let's Put Ourselves in Sally's Shoes: Shoes-of-Others Prefixing Improves Theory of Mind in Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026

2025

Let's Put Ourselves in Sally's Shoes: Shoes-of-Others Prefixing Improves Theory of Mind in Large Language Models.

CoRR, June, 2025

Data stream-pairwise bottleneck transformer for engagement estimation from video conversation.

Frontiers Artif. Intell., 2025

GenerativeGUI: Dynamic GUI Generation Leveraging LLMs for Enhanced User Interaction on Chat Interfaces.

Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2025

ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Multimodal Fine-Grained Apparent Personality Trait Recognition: Joint Modeling of Big Five and Questionnaire Item-level Scores.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

VoiceGrad: Non-Parallel Any-to-Many Voice Conversion With Annealed Langevin Dynamics.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Participant-Pair-Wise Bottleneck Transformer for Engagement Estimation from Video Conversation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Learning from Multiple Annotator Biased Labels in Multimodal Conversation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Unified Multi-Talker ASR with and without Target-speaker Enrollment.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Talking Face Generation for Impression Conversion Considering Speech Semantics.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

End-to-End Joint Target and Non-Target Speakers ASR.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Transcribing Speech as Spoken and Written Dual Text Using an Autoregressive Model.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Audio-Visual Praise Estimation for Conversational Video based on Synchronization-Guided Multimodal Transformer.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Next-Speaker Prediction Based on Non-Verbal Information in Multi-Party Video Conversation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Modeling Lead-Lag Structure in Facial Expression Synchrony for Social-Psychological Outcome Prediction from Negotiation Interaction.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Multimodal Negotiation Corpus with Various Subjective Assessments for Social-Psychological Outcome Prediction from Non-Verbal Cues.

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Many-to-Many Voice Transformer Network.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Model architectures to extrapolate emotional expressions in DNN-based text-to-speech.

Speech Commun., 2021

MaskCycleGAN-VC: Learning Non-Parallel Voice Conversion with Filling in Frames.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

ConvS2S-VC: Fully Convolutional Sequence-to-Sequence Voice Conversion.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Nonparallel Voice Conversion With Augmented Classifier Star Generative Adversarial Networks.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-Spectrogram Conversion.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation.

CoRR, 2019

StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Evaluating Intention Communication by TTS Using Explicit Definitions of Illocutionary Act Performance.

Nobukatsu Hojo

Noboru Miyazaki

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

ATTS2S-VC: Sequence-to-sequence Voice Conversion with Attention and Context Preservation Mechanisms.

Proceedings of the IEEE International Conference on Acoustics, 2019

CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

DNN-Based Speech Synthesis Using Speaker Codes.

Nobukatsu Hojo

Yusuke Ijima

Hideyuki Mizuno

IEICE Trans. Inf. Syst., 2018

ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion.

CoRR, 2018

WaveCycleGAN: Synthetic-to-natural speech waveform conversion using cycle-consistent adversarial networks.

CoRR, 2018

ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder.

CoRR, 2018

StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks.

CoRR, 2018

Generative adversarial network-based approach to signal reconstruction from magnitude spectrograms.

CoRR, 2018

Synthetic-to-Natural Speech Waveform Conversion Using Cycle-Consistent Adversarial Networks.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

StarGAN-VC: Non-Parallel Many-to-Many Voice Conversion Using Star Generative Adversarial Networks.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Generative adversarial network-based approach to signal reconstruction from magnitude spectrogram.

Proceedings of the 26th European Signal Processing Conference, 2018

Automatic Speech Pronunciation Correction with Dynamic Frequency Warping-Based Spectral Conversion.

Proceedings of the 26th European Signal Processing Conference, 2018

2017

Prosody Aware Word-Level Encoder Based on BLSTM-RNNs for DNN-Based Speech Synthesis.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

DNN-SPACE: DNN-HMM-Based Generative Model of Voice F0 Contours for Statistical Phrase/Accent Command Estimation.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Generative adversarial network-based postfilter for statistical parametric speech synthesis.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

An investigation to transplant emotional expressions in DNN-based TTS synthesis.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

An Investigation of DNN-Based Speech Synthesis Using Speaker Codes.

Nobukatsu Hojo

Yusuke Ijima

Hideyuki Mizuno

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2014

Speech prosody generation for text-to-speech synthesis based on generative model of F0 contours.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013

Text-to-speech synthesizer based on combination of composite wavelet and hidden Markov models.

Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013
