Shogo Seki

Orcid: 0009-0007-3990-3740

According to our database, Shogo Seki authored at least 37 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Confidence-based Filtering for Speech Dataset Curation with Generative Speech Enhancement Using Discrete Tokens.

[DOI]

Kazuki Yamauchi

,

,

CoRR, January, 2026

2025

Improving DF-Conformer Using Hydra For High-Fidelity Generative Speech Enhancement on Discrete Codec Token.

[DOI]

,

,

CoRR, November, 2025

First Analyze Then Enhance: A Task-Aware System for Speech Separation, Denoising, and Dereverberation.

[DOI]

,

,

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

The T12 System for AudioMOS Challenge 2025: Audio Aesthetics Score Prediction System Using KAN- and VERSA-based Models.

[DOI]

Katsuhiko Yamamoto

,

Koichi Miyazaki

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Exploring Dual-Mode Training for Real-Time Target Speaker Extraction.

[DOI]

,

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

VoiceGrad: Non-Parallel Any-to-Many Voice Conversion With Annealed Langevin Dynamics.

[DOI]

Hirokazu Kameoka

,

Takuhiro Kaneko

,

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Improved Remixing Process for Domain Adaptation-Based Speech Enhancement by Mitigating Data Imbalance in Signal-to-Noise Ratio.

[DOI]

,

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Remixed2Remixed: Domain Adaptation for Speech Enhancement by Noise2Noise Learning with Remixing.

[DOI]

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Audio Spotforming Using Nonnegative Tensor Factorization with Attractor-Based Regularization.

[DOI]

,

,

,

Daichi Kitamura

Proceedings of the 32nd European Signal Processing Conference, 2024

Inference Efficient Source Separation Using Input-dependent Convolutions.

[DOI]

,

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Non-Parallel Whisper-to-Normal Speaking Style Conversion Using Auxiliary Classifier Variational Autoencoder.

[DOI]

,

Hirokazu Kameoka

,

Takuhiro Kaneko

,

IEEE Access, 2023

CFVC: Conditional Filtering for Controllable Voice Conversion.

[DOI]

,

Takuhiro Kaneko

,

Hirokazu Kameoka

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN.

[DOI]

Takuhiro Kaneko

,

Hirokazu Kameoka

,

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

JSV-VC: Jointly Trained Speaker Verification and Voice Conversion Models.

[DOI]

,

Hirokazu Kameoka

,

,

Takuhiro Kaneko

Proceedings of the IEEE International Conference on Acoustics, 2023

Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis.

[DOI]

Takuhiro Kaneko

,

Hirokazu Kameoka

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

W2N-AVSC: Audiovisual Extension For Whisper-To-Normal Speech Conversion.

[DOI]

,

,

Hirokazu Kameoka

,

Takuhiro Kaneko

,

,

Proceedings of the 31st European Signal Processing Conference, 2023

2022

Distilling Sequence-to-Sequence Voice Conversion Models for Streaming Conversion Applications.

[DOI]

,

Hirokazu Kameoka

,

Takuhiro Kaneko

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

MISRNet: Lightweight Neural Vocoder Using Multi-Input Single Shared Residual Blocks.

[DOI]

Takuhiro Kaneko

,

Hirokazu Kameoka

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

CAUSE: Crossmodal Action Unit Sequence Estimation from Speech.

[DOI]

Hirokazu Kameoka

,

Takuhiro Kaneko

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Investigation And Comparison of Optimization Methods for Variational Autoencoder-Based Underdetermined Multichannel Source Separation.

[DOI]

,

Hirokazu Kameoka

,

Proceedings of the IEEE International Conference on Acoustics, 2022

HBP: An Efficient Block Permutation Solver Using Hungarian Algorithm and Spectrogram Inpainting for Multichannel Audio Source Separation.

[DOI]

,

Hirokazu Kameoka

,

Proceedings of the IEEE International Conference on Acoustics, 2022

iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform.

[DOI]

Takuhiro Kaneko

,

,

Hirokazu Kameoka

,

Proceedings of the IEEE International Conference on Acoustics, 2022

AttentionPIT: Soft Permutation Invariant Training for Audio Source Separation with Attention Mechanism.

[DOI]

Hirokazu Kameoka

,

,

,

Chihiro Watanabe

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Singing Fundamental Frequency Contour Generation Using Generalized Command-Response Model and Score-Conditional Variational Autoencoder.

[DOI]

,

,

Proceedings of the 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 2021

2020

A Study on Utilization of Prior Knowledge in Underdetermined Source Separation and Its Application.

[DOI]

PhD thesis, 2020

Semi-Supervised Self-Produced Speech Enhancement and Suppression Based on Joint Source Modeling of Air- and Body-Conducted Signals Using Variational Autoencoder.

[DOI]

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Intelligibility Enhancement Based on Speech Waveform Modification Using Hearing Impairment.

[DOI]

,

,

,

Kazuhiro Kobayashi

,

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Semi-Supervised Enhancement and Suppression of Self-Produced Speech Using Correspondence between Air- and Body-Conducted Signals.

[DOI]

,

,

Patrick Lumban Tobing

,

Proceedings of the 28th European Signal Processing Conference, 2020

2019

Underdetermined Source Separation Based on Generalized Multichannel Variational Autoencoder.

[DOI]

,

Hirokazu Kameoka

,

,

,

IEEE Access, 2019

Joint Separation and Dereverberation of Reverberant Mixtures with Multichannel Variational Autoencoder.

[DOI]

,

Hirokazu Kameoka

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2019

Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation.

[DOI]

,

Hirokazu Kameoka

,

,

,

Proceedings of the 27th European Signal Processing Conference, 2019

2018

Stereophonic Music Separation Based on Non-Negative Tensor Factorization with Cepstral Distance Regularization.

[DOI]

,

,

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2018

Self-Produced Speech Enhancement and Suppression Method using Air- and Body-Conductive Microphones.

[DOI]

,

,

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Sketch-based 3D hair posing by contour drawings.

[DOI]

,

Proceedings of the ACM SIGGRAPH / Eurographics Symposium on Computer Animation, 2017

Missing component restoration for masked speech signals based on time-domain spectrogram factorization.

[DOI]

,

Hirokazu Kameoka

,

,

Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Stereophonic music separation based on non-negative tensor factorization with cepstrum regularization.

[DOI]

,

,

Proceedings of the 25th European Signal Processing Conference, 2017

2016

Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement.

[DOI]

,

,

Keisuke Kinoshita

,

,

Takuya Yoshioka

,

Tomohiro Nakatani

,

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
