Xin Wang

ORCID: 0000-0001-8246-0606

Affiliations:
  • Graduate University for Advanced Studies (SOKENDAI), National Institute of Informatics, Department of Informatics, Tokyo, Japan


According to our database, Xin Wang authored at least 166 papers between 2012 and 2026.

Bibliography

2026
Deepfake Word Detection by Next-token Prediction using Fine-tuned Whisper.
CoRR, February, 2026

Self Voice Conversion as an Attack against Neural Audio Watermarking.
CoRR, January, 2026

WeDefense: A Toolkit to Defend Against Fake Audio.
CoRR, January, 2026

ASVspoof 5: Evaluation of Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech.
CoRR, January, 2026

Robust Identity-Based Signcryption Scheme for Vehicular Ad Hoc Networks.
IEEE Trans. Inf. Forensics Secur., 2026

Search Me in the Dark: Access Pattern-Hidden Range Query Over Encrypted Spatial Data.
IEEE Trans. Inf. Forensics Secur., 2026

ASVspoof 5: Design, collection and validation of resources for spoofing, deepfake, and adversarial attack detection using crowdsourced speech.
Comput. Speech Lang., 2026

The third VoicePrivacy challenge: Preserving emotional expressiveness and linguistic content in voice anonymization.
Comput. Speech Lang., 2026

Context-aware prompting for collaborative problem solving skill identification.
Comput. Educ. Artif. Intell., 2026

2025
Target speaker anonymization in multi-speaker recordings.
CoRR, October, 2025

Frustratingly Easy Zero-Day Audio DeepFake Detection via Retrieval Augmentation and Profile Matching.
CoRR, September, 2025

WildSpoof Challenge Evaluation Plan.
CoRR, August, 2025

SEF-MK: Speaker-Embedding-Free Voice Anonymization through Multi-k-means Quantization.
CoRR, August, 2025

LENS-DF Sample Data: A Dataset for Long-Form, Multi-Speaker, and Noisy Audio.
Dataset, July, 2025

Efficient Encrypted Trajectory Similarity Query Over Mobile E-Health Cloud.
IEEE Internet Things J., May, 2025

Dyn-D^2P: Dynamic Differentially Private Decentralized Learning with Provable Utility Guarantee.
CoRR, May, 2025

Trace Your Footprint: Efficient Spatial Keyword Query Over Encrypted Trajectory Data.
IEEE Trans. Inf. Forensics Secur., 2025

A Benchmark for Multi-Speaker Anonymization.
IEEE Trans. Inf. Forensics Secur., 2025

Adapting general disentanglement-based speaker anonymization for enhanced emotion preservation.
Comput. Speech Lang., 2025

Zero Trust-Based Dynamic and Continuous Access Control for Mobile Devices.
Proceedings of the 24th IEEE International Conference on Trust, 2025

DFed-LaMA: Differentially Private Federated Learning via Adaptive Layer-Wise Model Aggregation.
Proceedings of the 22nd IEEE International Conference on Mobile Ad-Hoc and Smart Systems, 2025

MIDI-VALLE: Improving Expressive Piano Performance Synthesis Through Neural Codec Language Modelling.
Proceedings of the 26th International Society for Music Information Retrieval Conference, 2025

Mitigating Language Mismatch in SSL-Based Speaker Anonymization.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

A Comparative Study on Proactive and Passive Detection of Deepfake Speech.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

The Text-to-speech in the Wild (TITW) Database.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

From Sharpness to Better Generalization for Speech Deepfake Detection.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Bridging Privacy Preservation and Optimization in Heterogeneous Decentralized Learning: Regularization Tuning and Model Pruning.
Proceedings of the International Joint Conference on Neural Networks, 2025

Dyn-D^2P: Dynamic Differentially Private Decentralized Learning with Provable Utility Guarantee.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

FedSaaS: Class-Consistency Federated Semantic Segmentation via Global Prototype Supervision and Local Adversarial Harmonization.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

LENS-DF: Deepfake Detection and Temporal Localization for Long-Form Noisy Speech.
Proceedings of the IEEE International Joint Conference on Biometrics, 2025

SecureSpeech: Prompt-based Speaker and Content Protection.
Proceedings of the IEEE International Joint Conference on Biometrics, 2025

Towards An Integrated Approach for Expressive Piano Performance Synthesis from Music Scores.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Audio Codec Augmentation for Robust Collaborative Watermarking of Speech Synthesis.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

DNPS: A Robust Aggregation Method for Heterogeneous Distributed Learning Based on Gradient Direction and Norm Probability Screening.
Proceedings of the 28th International Conference on Computer Supported Cooperative Work in Design, 2025

Post-training for Deepfake Speech Detection.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Speaker Privacy and Security in the Big Data Era: Protection and Defense Against Deepfake.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024
The VoicePrivacy 2022 Challenge: Progress and Perspectives in Voice Anonymisation.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

ZMM-TTS: Zero-Shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-Supervised Discrete Speech Representations.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Application of Prompt Learning Models in Identifying the Collaborative Problem Solving Skills in an Online Task.
Proc. ACM Hum. Comput. Interact., 2024

Joint speaker encoder and neural back-end model for fully end-to-end automatic speaker verification with multiple enrollment utterances.
Comput. Speech Lang., 2024

SpoofCeleb: Speech Deepfake Detection and SASV In The Wild.
CoRR, 2024

Text-To-Speech Synthesis In The Wild.
CoRR, 2024

Malacopula: adversarial automatic speaker verification attacks using a neural-based generalised Hammerstein model.
CoRR, 2024

ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale.
CoRR, 2024

To what extent can ASV systems naturally defend against spoofing attacks?
CoRR, 2024

The VoicePrivacy 2024 Challenge Evaluation Plan.
CoRR, 2024

FedDue: Optimizing Personalized Federated Learning Through Dynamic Update Classifier.
Proceedings of the Wireless Artificial Intelligent Computing Systems and Applications, 2024

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Poster Abstract: Intrusion Detection for In-vehicle Networks Based on Parc-net Architecture.
Proceedings of the 20th International Conference on Mobility, Sensing and Networking, 2024

Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Speaker Detection by the Individual Listener and the Crowd: Parametric Models Applicable to Bonafide and Deepfake Speech.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

To what extent can ASV systems naturally defend against spoofing attacks?
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

PrivSGP-VR: Differentially Private Variance-Reduced Stochastic Gradient Push with Tight Utility Bounds.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Synvox2: Towards A Privacy-Friendly Voxceleb2 Dataset.
Proceedings of the IEEE International Conference on Acoustics, 2024

Collaborative Watermarking for Adversarial Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2024

Spoofing Attack Augmentation: Can Differently-Trained Attack Models Improve Generalisation?
Proceedings of the IEEE International Conference on Acoustics, 2024

Can Large-Scale Vocoded Spoofed Data Improve Speech Spoofing Countermeasure with a Self-Supervised Front End?
Proceedings of the IEEE International Conference on Acoustics, 2024

A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection.
Proceedings of the 23rd International Conference of the Biometrics Special Interest Group, 2024

Exploring Active Data Selection Strategies for Continuous Training in Deepfake Detection.
Proceedings of the 23rd International Conference of the Biometrics Special Interest Group, 2024

2023
Data-driven control for dynamic quantized nonlinear systems with state constraints based on barrier functions.
Inf. Sci., October, 2023

Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions.
Speech Commun., July, 2023

Enabling Anonymous Authorized Auditing Over Keyword-Based Searchable Ciphertexts in Cloud Storage Systems.
IEEE Trans. Serv. Comput., 2023

The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Speaker Anonymization Using Orthogonal Householder Neural Network.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Speaker-Text Retrieval via Contrastive Learning.
CoRR, 2023

DDSP-based Neural Waveform Synthesis of Polyphonic Guitar Performance from String-wise MIDI Input.
CoRR, 2023

Language-independent speaker anonymization using orthogonal Householder neural network.
CoRR, 2023

Range-Based Equal Error Rate for Spoof Localization.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Towards Single Integrated Spoofing-aware Speaker Verification Embeddings.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Neural Network-Based Safety Optimization Control for Constrained Discrete-Time Systems.
Proceedings of the 49th Annual Conference of the IEEE Industrial Electronics Society, 2023

Spoofed Training Data for Speech Spoofing Countermeasure Can Be Efficiently Created Using Neural Vocoders.
Proceedings of the IEEE International Conference on Acoustics, 2023

Can Knowledge of End-to-End Text-to-Speech Models Improve Neural Midi-to-Audio Synthesis Systems?
Proceedings of the IEEE International Conference on Acoustics, 2023

Hiding Speaker's Sex in Speech Using Zero-Evidence Speaker Representation in an Analysis/Synthesis Pipeline.
Proceedings of the IEEE International Conference on Acoustics, 2023

Modelling Attention Levels with Ocular Responses in a Speech-in-Noise Recall Task.
Proceedings of the 2023 Symposium on Eye Tracking Research and Applications, 2023

2022
Privacy and Utility of X-Vector Based Speaker Anonymization.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

The VoicePrivacy 2020 Challenge: Results and findings.
Comput. Speech Lang., 2022

The VoicePrivacy 2020 Challenge Evaluation Plan.
CoRR, 2022

The PartialSpoof Database and Countermeasures for the Detection of Short Generated Audio Segments Embedded in a Speech Utterance.
CoRR, 2022

The VoicePrivacy 2022 Challenge Evaluation Plan.
CoRR, 2022

A Practical Guide to Logical Access Voice Presentation Attack Detection.
CoRR, 2022

Investigating Active-Learning-Based Training Data Selection for Speech Spoofing Countermeasure.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Automatic Speaker Verification Spoofing and Deepfake Detection Using Wav2vec 2.0 and Data Augmentation.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Language-Independent Speaker Anonymization Approach Using Self-Supervised Pre-Trained Models.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Investigating Self-Supervised Front Ends for Speech Spoofing Countermeasures.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Attention Back-End for Automatic Speaker Verification with Multiple Enrollment Utterances.
Proceedings of the IEEE International Conference on Acoustics, 2022

Estimating the Confidence of Speech Spoofing Countermeasure.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech.
IEEE Trans. Biom. Behav. Identity Sci., 2021

Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis.
Comput. Speech Lang., 2021

ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection.
CoRR, 2021

ASVspoof 2021: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan.
CoRR, 2021

Benchmarking and challenges in security and privacy for voice biometrics.
CoRR, 2021

Multi-Task Learning in Utterance-Level and Segmental-Level Spoof Detection.
CoRR, 2021

Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances.
CoRR, 2021

Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis.
Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

An Initial Investigation for Detecting Partially Spoofed Audio.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-to-End Text-to-Speech Using Latent Duration Based on VQ-VAE.
Proceedings of the IEEE International Conference on Acoustics, 2021

How Similar or Different is Rakugo Speech Synthesizer to Professional Performers?
Proceedings of the IEEE International Conference on Acoustics, 2021

Combining Oculo-motor Indices to Measure Cognitive Load of Synthetic Speech in Noisy Listening Conditions.
Proceedings of the 2021 Symposium on Eye Tracking Research and Applications, 2021

Evaluating Synthetic Speech Workload with Oculo-motor Indices: Preliminary Observations for Japanese Speech.
Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies, 2021

A Multi-Level Attention Model for Evidence-Based Fact Checking.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech.
Comput. Speech Lang., 2020

Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis.
CoRR, 2020

Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences.
IEEE Access, 2020

Fine-Grained Similarity Measurement between Educational Videos and Exercises.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Using Cyclic Noise as the Source Signal for Neural Source-Filter-Based Speech Waveform Model.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Introducing the VoicePrivacy Initiative.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Design Choices for X-Vector Based Speaker Anonymization.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Reverberation Modeling for Source-Filter-Based Neural Vocoder.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transferring Neural Speech Waveform Synthesizers to Musical Instrument Sounds Generation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Effect of Choice of Probability Distribution, Randomness, and Search Methods for Alignment Modeling in Sequence-to-Sequence Text-to-Speech Synthesis Using Hard Alignment.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Study of Child Speech Extraction Using Joint Speech Enhancement and Separation in Realistic Conditions.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Zero-Shot Multi-Speaker Text-To-Speech with State-Of-The-Art Neural Speaker Embeddings.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.
CoRR, 2019

The ASVspoof 2019 database.
CoRR, 2019

Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments.
CoRR, 2019

Initial investigation of encoder-decoder end-to-end TTS using marginalization of monotonic hard alignments.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Rakugo speech synthesis using segment-to-segment neural transduction and style tokens - toward speech synthesis for entertaining audiences.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Speaker Anonymization Using X-vector and Neural Waveform Models.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Training Multi-Speaker Neural Text-to-Speech Systems Using Speaker-Imbalanced Speech Corpora.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

MOSNet: Deep Learning-Based Objective Assessment for Voice Conversion.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Joint Training Framework for Text-to-Speech and Voice Conversion Using Multi-Source Tacotron and WaveNet.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language.
Proceedings of the IEEE International Conference on Acoustics, 2019

Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

STFT Spectral Loss for Training a Neural Speech Waveform Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

Audiovisual Speaker Conversion: Jointly and Simultaneously Transforming Facial Expression and Acoustic Characteristics.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Fundamental Frequency Modeling for Neural-Network-Based Statistical Parametric Speech Synthesis.
PhD thesis, 2018

Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis.
CoRR, 2018

Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Progressive Deep Learning Approach to Child Speech Separation.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Investigating Accuracy of Pitch-accent Annotations in Neural Network-based Speech Synthesis and Denoising Effects.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speech Waveform Synthesis from MFCC Sequences with Generative Adversarial Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
An RNN-Based Quantized F0 Model with Multi-Tier Feedback Links for Text-to-Speech Synthesis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Principles for Learning Controllable TTS from Annotated and Latent Variation.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An autoregressive recurrent mixture density network for parametric speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A maximum likelihood approach to deep neural network based speech dereverberation.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis.
IEICE Trans. Inf. Syst., 2016

Concept-to-Speech generation with knowledge sharing for acoustic modelling and utterance filtering.
Comput. Speech Lang., 2016

Investigating Very Deep Highway Networks for Parametric Speech Synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A full training framework of cross-stream dependence modelling for HMM-based singing voice synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

The NII speech synthesis entry for Blizzard Challenge 2016.
Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016

2014
Concept-to-speech generation by integrating syntagmatic features into HMM-based speech synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013
An anisotropic diffusion filter based on multidirectional separability.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012
Cross-stream dependency modeling using continuous F0 model for HMM-based speech synthesis.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

