Yi-Chiao Wu

Orcid: 0000-0001-7754-459X

According to our database¹, Yi-Chiao Wu authored at least 74 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number² of three.
Erdős number³ of four.

Bibliography

2026

Conditional Flow Matching for Visually-Guided Acoustic Highlighting.

CoRR, February, 2026

SAM Audio Judge: A Unified Multimodal Framework for Perceptual Evaluation of Audio Separation.

CoRR, January, 2026

2025

SAM Audio: Segment Anything in Audio.

Christoph Feichtenhofer

CoRR, December, 2025

Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound.

CoRR, February, 2025

FlowDec: A flow-based full-band general audio codec with high perceptual quality.

Ricky T. Q. Chen, Alexander Richard

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling.

Israel D. Gebru, Alexander Richard

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Meta Audiobox Aesthetics: Unified Automatic Assessment for Speech, Music and Sound.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

The AudioMOS Challenge 2025.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Contact-Free Atrial Fibrillation Screening With Attention Network.

Meng-Liang Chung

IEEE J. Biomed. Health Informatics, September, 2024

Contactless Blood Pressure Measurement Via Remote Photoplethysmography With Synthetic Data Generation Using Generative Adversarial Networks.

IEEE J. Biomed. Health Informatics, February, 2024

Video-Based Contactless Detection of Sleep Apnea With Deep-Learning Model.

Meng-Liang Chung

IEEE Trans. Instrum. Meas., 2024

Multi-Speaker Text-to-Speech Training With Speaker Anonymized Data.

IEEE Signal Process. Lett., 2024

EMO-Codec: An In-Depth Look at Emotion Preservation capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations.

Huang-Cheng Chou

CoRR, 2024

Codec-Superb @ SLT 2024: A Lightweight Benchmark For Neural Audio Codec Models.

Alexander H. Liu, Shinji Watanabe

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation.

Shinji Watanabe, Alexander Richard

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

ScoreDec: A Phase-Preserving High-Fidelity Audio Codec with a Generalized Score-Based Diffusion Post-Filter.

Israel D. Gebru, Alexander Richard

Proceedings of the IEEE International Conference on Acoustics, 2024

EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models with Subjective and Objective Evaluations.

Huang-Cheng Chou

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Motion Robust Remote Photoplethysmography Measurement During Exercise for Contactless Physical Activity Intensity Detection.

Linda Li-Chuan Lin, Meng-Liang Chung

IEEE Trans. Instrum. Meas., 2023

Deep-Learning-Based Remote Photoplethysmography Measurement in Driving Scenarios With Color and Near-Infrared Images.

IEEE Trans. Instrum. Meas., 2023

High-Fidelity and Pitch-Controllable Neural Vocoder Based on Unified Source-Filter Networks.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Recognizing, Fast and Slow: Complex Emotion Recognition With Facial Expression Detection and Remote Physiological Measurement.

Sunny S. J. Lin

IEEE Trans. Affect. Comput., 2023

Audiobox: Unified Audio Generation with Natural Language Prompts.

CoRR, 2023

Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder.

Proceedings of the IEEE International Conference on Acoustics, 2023

Audiodec: An Open-Source Streaming High-Fidelity Neural Audio Codec.

Israel D. Gebru, Alexander Richard

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

A Compensation Network With Error Mapping for Robust Remote Photoplethysmography in Noise-Heavy Conditions.

IEEE Trans. Instrum. Meas., 2022

A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System.

Patrick Lumban Tobing, Kazuki Yasuhara, Noriyuki Matsunaga

CoRR, 2022

Soft Label With Channel Encoding for Dependent Facial Image Classification.

IEEE Access, 2022

Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Direct Noisy Speech Modeling for Noisy-To-Noisy Voice Conversion.

Patrick Lumban Tobing

Proceedings of the IEEE International Conference on Acoustics, 2022

Contactless Blood Pressure Measurement via Remote Photoplethysmography with Synthetic Data Generation Using Generative Adversarial Network.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021

Incorporating Prior Knowledge on Speech Production Mechanism into Neural Speech Waveform Generation.

PhD thesis, 2021

Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network.

Patrick Lumban Tobing, Kazuhiro Kobayashi

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Quasi-Periodic Parallel WaveGAN: A Non-Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Pretraining Techniques for Sequence-to-Sequence Voice Conversion.

Hirokazu Kameoka

IEEE ACM Trans. Audio Speech Lang. Process., 2021

The AS-NU System for the M2VoC Challenge.

CoRR, 2021

Unified Source-Filter GAN: Unified Source-Filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder.

Kazuhiro Kobayashi, Patrick Lumban Tobing

Proceedings of the IEEE International Conference on Acoustics, 2021

Any-to-One Sequence-to-Sequence Voice Conversion Using Self-Supervised Discrete Speech Representations.

Proceedings of the IEEE International Conference on Acoustics, 2021

HASA-Net: A Non-Intrusive Hearing-Aid Speech Assessment Network.

Hsin-Tien Chiang

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Noisy-to-Noisy Voice Conversion Framework with Denoising Model.

Patrick Lumban Tobing

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations.

CoRR, 2020

Non-Parallel Voice Conversion System With WaveNet Vocoder and Collapsed Speech Suppression.

Patrick Lumban Tobing, Kazuhiro Kobayashi

IEEE Access, 2020

Masked Neural Sparse Encoder for Face Occlusion Detection.

Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics, 2020

A Cyclical Post-Filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-Speech Systems.

Patrick Lumban Tobing, Kazuki Yasuhara, Noriyuki Matsunaga

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Quasi-Periodic Parallel WaveGAN Vocoder: A Non-Autoregressive Pitch-Dependent Dilated Convolution Model for Parametric Speech Generation.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Cyclic Spectral Modeling for Unsupervised Unit Discovery into Voice Conversion with Excitation and Waveform Modeling.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining.

Hirokazu Kameoka

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Efficient Shallow Wavenet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN.

Patrick Lumban Tobing

Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020

2019

The ASVspoof 2019 database.

CoRR, 2019

Voice Conversion With CycleRNN-Based Spectral Mapping and Finely Tuned WaveNet Vocoder.

Patrick Lumban Tobing, Kazuhiro Kobayashi

IEEE Access, 2019

Statistical Voice Conversion with Quasi-periodic WaveNet Vocoder.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion.

Kazuhiro Kobayashi, Patrick Lumban Tobing

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Non-Parallel Voice Conversion with Cyclic Variational Autoencoder.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Voice Conversion with Cyclic Recurrent Neural Network and Fine-tuned Wavenet Vocoder.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the IEEE International Conference on Acoustics, 2019

Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 27th European Signal Processing Conference, 2019

2018

Locally Linear Embedding Based Post-Filtering for Speech Enhancement.

J. Inf. Sci. Eng., 2018

Voice Conversion Based on Locally Linear Embedding.

J. Inf. Sci. Eng., 2018

An Evaluation of Deep Spectral Mappings and WaveNet Vocoder for Voice Conversion.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

The NU Non-Parallel Voice Conversion System for the Voice Conversion Challenge 2018.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

NU Voice Conversion System for the Voice Conversion Challenge 2018.

Patrick Lumban Tobing, Kazuhiro Kobayashi

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Collapsed Speech Segment Detection and Suppression for WaveNet Vocoder.

Kazuhiro Kobayashi, Patrick Lumban Tobing

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Exemplar-Based Spectral Detail Compensation for Voice Conversion.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

A Post-Filtering Approach Based on Locally Linear Embedding Difference Compensation for Speech Enhancement.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A locally linear embedding based postfiltering approach for speech enhancement.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Fast locally linear embedding algorithm for exemplar-based voice conversion.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Dictionary update for NMF-based voice conversion using an encoder-decoder network.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Locally Linear Embedding for Exemplar-Based Spectral Conversion.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Voice conversion from non-parallel corpora using variational auto-encoder.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
