Takuma Okamoto

Orcid: 0000-0001-9913-4647

According to our database¹, Takuma Okamoto authored at least 55 papers between 2010 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2025

Sequence-to-Sequence Voice Conversion With Weighted Guided Attention.

IEEE Access, 2025

Phoneme-Level Duration Controllable Neural Text-to-Speech With Phoneme Embedding Skip Connection and Modified Gaussian Duration Modeling.

IEEE Access, 2025

SFC-L1: Sound Field Control With Least Absolute Deviation Regression.

Takuma Okamoto

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2025

Simultaneous Speech Translation Integrated Compact Multiple Sound Spot Synthesis System On A Laptop Carried Out With A Backpack.

Takuma Okamoto

Michiyo Kono

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

GST-BERT-TTS: Prosody Prediction Without Accentual Labels For Multi-Speaker TTS Using BERT With Global Style Tokens.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Mora-Level Prosody Prediction for Text-to-Speech Using Japanese BERT Without Accentual Labels.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Speech Masking System Based on Spatially Separated Multiple TTS Maskers With A Compact Circular Loudspeaker Array.

Takuma Okamoto

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Voice Factor Control Using FIR-Based Fast Neural Vocoder for Speech Generation Applications.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Layer-wise Analysis for Quality of Multilingual Synthesized Speech.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Fast Neural Speech Waveform Generative Models With Fully-Connected Layer-Based Upsampling.

IEEE Access, 2024

Challenge of Singing Voice Synthesis Using Only Text-To-Speech Corpus With FIRNet Source-Filter Neural Vocoder.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Mobile PresenTra: NICT fast neural text-to-speech system on smartphones with incremental inference of MS-FC-HiFi-GAN for low-latency synthesis.

Takuma Okamoto

Yamato Ohtani

Hisashi Kawai

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

ConvNeXt-TTS And ConvNeXt-VC: ConvNeXt-Based Fast End-To-End Sequence-To-Sequence Text-To-Speech And Voice Conversion.

Proceedings of the IEEE International Conference on Acoustics, 2024

FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder With Trainable Finite Impulse Response Filter.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Harmonic-Net: Fundamental Frequency and Speech Rate Controllable Fast Neural Vocoder.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

E2E-S2S-VC: End-To-End Sequence-To-Sequence Voice Conversion.

Takuma Okamoto

Tomoki Toda

Hisashi Kawai

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Continuous Action Space-Based Spoken Language Acquisition Agent Using Residual Sentence Embedding and Transformer Decoder.

Proceedings of the IEEE International Conference on Acoustics, 2023

WaveNeXt: ConvNeXt-Based Fast Neural Vocoder Without ISTFT layer.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Neural speech-rate conversion with multispeaker WaveNet vocoder.

Speech Commun., 2022

Automatic Spoken Language Acquisition Based on Observation and Dialogue.

IEEE J. Sel. Top. Signal Process., 2022

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation.

CoRR, 2022

2021

Quasi-Periodic Parallel WaveGAN: A Non-Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Full-Band LPCNet: A Real-Time Neural Vocoder for 48 kHz Audio With a CPU.

IEEE Access, 2021

2D Multizone Sound Field Synthesis with Interior-Exterior Ambisonics.

Takuma Okamoto

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Noise Level Limited Sub-Modeling for Diffusion Probabilistic Vocoders.

Proceedings of the IEEE International Conference on Acoustics, 2021

Close-Talking Recording with Planarly Distributed Microphones.

Takuma Okamoto

Proceedings of the IEEE International Conference on Acoustics, 2021

High-Intelligibility Speech Synthesis for Dysarthric Speakers with LPCNet-Based TTS and CycleVAE-Based VC.

Proceedings of the IEEE International Conference on Acoustics, 2021

Multi-Stream HiFi-GAN with Data-Driven Waveform Decomposition.

Takuma Okamoto

Tomoki Toda

Hisashi Kawai

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Text-to-Speech Synthesis.

Proceedings of the Speech-to-Speech Translation, 2020

Quasi-Periodic Parallel WaveGAN Vocoder: A Non-Autoregressive Pitch-Dependent Dilated Convolution Model for Parametric Speech Generation.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transformer-Based Text-to-Speech with Weighted Forced Attention.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

3D Localized Sound Zone Generation with a Planar Omni-Directional Loudspeaker Array.

Takuma Okamoto

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Investigations of Real-time Gaussian FFTNet and Parallel WaveNet Neural Vocoders with Simple Acoustic Features.

Proceedings of the IEEE International Conference on Acoustics, 2019

Horizontal 3D Sound Field Recording and 2.5D Synthesis with Omni-directional Circular Arrays.

Takuma Okamoto

Proceedings of the IEEE International Conference on Acoustics, 2019

Tacotron-Based Acoustic Model Using Phoneme Alignment for Practical Neural Text-to-Speech Systems.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Improving FFTNet Vocoder with Noise Shaping and Subband Approaches.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

2.5D Localized Sound Zone Generation with a Circular Array of Fixed-Directivity Loudspeakers.

Takuma Okamoto

Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

An Investigation of Subband WaveNet Vocoder Covering Entire Audible Frequency Range with Limited Acoustic Features.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Development of Wearable Sheet-Type Shear Force Sensor and Measurement System that is Insusceptible to Temperature and Pressure.

Sensors, 2017

Localized Sound Zone Generation Based on External Radiation Canceller.

Takuma Okamoto

J. Inf. Hiding Multim. Signal Process., 2017

Horizontal Local Sound Field Propagation Based on Sound Source Dimension Mismatch.

Takuma Okamoto

J. Inf. Hiding Multim. Signal Process., 2017

Angular spectrum decomposition-based 2.5D higher-order spherical harmonic sound field synthesis with a linear loudspeaker array.

Takuma Okamoto

Proceedings of the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2017

Analytical approach to 2.5D sound field control using a circular double-layer array of fixed-directivity loudspeakers.

Takuma Okamoto

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Subband WaveNet with overlapped single-sideband filterbanks.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

2.5D higher order ambisonics for a sound field described by angular spectrum coefficients.

Takuma Okamoto

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

A Spatial Extrapolation Method to Derive High-Order Ambisonics Data from Stereo Sources.

J. Inf. Hiding Multim. Signal Process., 2015

Analytical methods of generating multiple sound zones for open and baffled circular loudspeaker arrays.

Takuma Okamoto

Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Near-field sound propagation based on a circular and linear array combination.

Takuma Okamoto

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Sound Field Reproduction Using Ambisonics and Irregular Loudspeaker Arrays.

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2014

Generation of multiple sound zones by spatial filtering in wavenumber domain using a linear array of loudspeakers.

Takuma Okamoto

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Extrapolation of Horizontal Ambisonics Data from Mainstream Stereo Sources.

Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

Morpion Solitaire 5D: a new upper bound 121 on the maximum score.

Proceedings of the 25th Canadian Conference on Computational Geometry, 2013

2010

Multiple Description Coding Using Time Domain Division for MP3 coded Sound Signal.

J. Inf. Hiding Multim. Signal Process., 2010

Comparative performance evaluation of near 3D sound field reproduction system with directional loudspeakers and wave field synthesis.

Proceedings of the 4th International Universal Communication Symposium, 2010
