Toru Nakashika

CoRR, March, 2026

2025

Fast and Lightweight Non-Parallel Voice Conversion Based on Free-Energy Minimization of Speaker-Conditional Restricted Boltzmann Machine.

Takuya Kishida

IEICE Trans. Inf. Syst., 2025

Chord Recognition Considering All Interval Relations Using Weight-Sharing Classification Semi-Restricted Boltzmann Machines.

Shunya Ishikawa

IEICE Trans. Inf. Syst., 2025

Continuous Speech Prediction by Segmentation of Auditory EEG.

Tomoaki Mizuno

Natsue Yoshimura

Proceedings of the 33rd European Signal Processing Conference, 2025

VICNet: FaderNet-Based Voice Impression Conversion with Affective Dimensional Representation.

Saki Kugimoto

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Transferability of Adversarial Examples Across Speaker Embedding Models for Voice Privacy Protection.

Kotaro Nakamura

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Voice Privacy Protection with Adversarial Examples Using Anchor Speaker Embedding.

Shunya Ishikawa

Yuki Katsumata

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Gamma-VAE-VC: Voice Conversion based on VAE Assuming Gamma Distribution for Both Latent Variables and Observation.

Nanako Imaichi

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Few-Step Diffusion-Based Voice Conversion Using Consistency Trajectory Models.

Ryuichi Hatakeyama

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

An Investigation on the Speech Recovery from EEG Signals Using Transformer.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

Gamma-VAE: Speech representation based on VAE assuming gamma distribution for both latent variables and observation.

Nanako Imaichi

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

DDPMVC: Non-parallel any-to-many voice conversion using diffusion encoder.

Ryuichi Hatakeyama

Kohei Okuda

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2021

Gamma Boltzmann Machine for Audio Modeling.

Kohei Yatabe

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Acoustic Scenery Recognition Using CWT and Deep Neural Network.

Proceedings of the New Trends in Intelligent Software Methodologies, Tools and Techniques, 2021

2020

Speech Chain VC: Linking Linguistic and Acoustic Levels via Latent Distinctive Features for RBM-Based Voice Conversion.

Takuya Kishida

Héctor Manuel Pérez Meana

IEICE Trans. Inf. Syst., 2020

Many-to-Many Symbolic Multi-Track Music Genre Transfer.

Michel Pezzat

Mariko Nakano

Proceedings of the Knowledge Innovation Through Intelligent Software Methodologies, Tools and Techniques, 2020

Complex-Valued Variational Autoencoder: A Novel Deep Generative Model for Direct Representation of Complex Spectra.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Simultaneous Conversion of Speaker Identity and Emotion Based on Multiple-Domain Adaptive RBM.

Takuya Kishida

Shin Tsukamoto

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Gamma Boltzmann Machine for Simultaneously Modeling Linear- and Log-amplitude Spectra.

Kohei Yatabe

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra.

Junichi Yamagishi

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Pre-Training of DNN-Based Speech Synthesis Based on Bidirectional Conversion between Text and Speech.

Kentaro Sone

IEICE Trans. Inf. Syst., 2019

Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition.

EURASIP J. Audio Speech Music. Process., 2019

STFT Spectral Loss for Training a Neural Speech Waveform Model.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Deep Relational Model: A Joint Probabilistic Model with a Hierarchical Structure for Bidirectional Estimation of Image and Labels.

IEICE Trans. Inf. Syst., 2018

Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra.

Junichi Yamagishi

CoRR, 2018

Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model.

Kentaro Sone

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

DNN-based Speech Synthesis for Small Data Sets Considering Bidirectional Speech-Text Conversion.

Kentaro Sone

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

LSTBM: A Novel Sequence Representation of Speech Spectra Using Restricted Boltzmann Machine with Long Short-Term Memory.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Parallel-Data-Free Dictionary Learning for Voice Conversion Using Non-Negative Tucker Decomposition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Speaker-adaptive-trainable Boltzmann machine and its application to non-parallel voice conversion.

EURASIP J. Audio Speech Music. Process., 2017

Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra.

Junichi Yamagishi

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

CAB: An Energy-Based Speaker Clustering Model for Rapid Adaptation in Non-Parallel Voice Conversion.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Modeling deep bidirectional relationships for image classification and generation.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Speaker adaptive model based on Boltzmann machine for non-parallel training in voice conversion.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

3WRBM-based speech factor modeling for arbitrary-source and non-parallel voice conversion.

Proceedings of the 24th European Signal Processing Conference, 2016

Selection of an optimum random matrix using a genetic algorithm for acoustic feature extraction.

Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

2015

Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Voice conversion using speaker-dependent conditional restricted Boltzmann machine.

EURASIP J. Audio Speech Music. Process., 2015

Small-parallel exemplar-based voice conversion in noisy environments using affine non-negative matrix factorization.

EURASIP J. Audio Speech Music. Process., 2015

Content-based Image Retrieval Using Rotation-invariant Histograms of Oriented Gradients.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Sparse nonlinear representation for voice conversion.

Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Feature extraction using pre-trained convolutive bottleneck nets for dysarthric speech recognition.

Proceedings of the 23rd European Signal Processing Conference, 2015

Noise-robust voice conversion using a small parallel data based on non-negative matrix factorization.

Proceedings of the 23rd European Signal Processing Conference, 2015

2014

Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines.

IEICE Trans. Inf. Syst., 2014

High-order sequence modeling using speaker-dependent recurrent temporal restricted Boltzmann machines for voice conversion.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Error correction of automatic speech recognition based on normalized web distance.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

3D-Object Recognition Based on LLC Using Depth Spatial Pyramid.

Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Voice conversion in time-invariant speaker-independent space.

Proceedings of the IEEE International Conference on Acoustics, 2014

Voice conversion based on non-negative matrix factorization using phoneme-categorized dictionary.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

High-Frequency Restoration Using Deep Belief Nets for Super-resolution.

Proceedings of the Ninth International Conference on Signal-Image Technology & Internet-Based Systems, 2013

Voice conversion in high-order eigen space using deep belief nets.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Sparse representation for outliers suppression in semi-supervised image annotation.

Proceedings of the IEEE International Conference on Acoustics, 2013

A Combination of Hand-Crafted and Hierarchical High-Level Learnt Feature Extraction for Music Genre Classification.

Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2013, 2013

2012

Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification.

Christophe Garcia

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

Constrained Spectrum Generation Using A Probabilistic Spectrum Envelope for Mixed Music Analysis.

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Probabilistic Spectrum Envelope: Categorized Audio-Features Representation for NMF-Based Sound Decomposition.
