Yuepeng Jiang

ORCID: 0000-0002-1444-7183

According to our database, Yuepeng Jiang authored at least 23 papers between 2021 and 2026.

Bibliography

2026
MINT-Bench: A Comprehensive Multilingual Benchmark for Instruction-Following Text-to-Speech.
CoRR, April, 2026

YingMusic-Singer: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance.
CoRR, March, 2026

A study on hand gesture recognition algorithm realized with the aid of efficient feature extraction method and convolution neural networks: design and its application to VR environment.
Soft Comput., February, 2026

Mitigating scale imbalance and conflicting gradients in deep multi-task learning.
Frontiers Comput. Sci., February, 2026

2025
SoulX-Podcast: Towards Realistic Long-form Podcasts with Dialectal and Paralinguistic Diversity.
CoRR, October, 2025

MeanVC: Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows.
CoRR, October, 2025

A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions.
Briefings Bioinform., January, 2025

REF-VC: Robust, Expressive and Fast Zero-Shot Voice Conversion with Diffusion Transformers.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

DiffRhythm+: Controllable and Flexible Full-Length Song Generation with Preference Optimization.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Drop the Beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Unlocking T-cell receptor-epitope insights with structural analysis.
Nat. Comput. Sci., July, 2024

WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Validation of MODIS Temperature and Emissivity Products Based on Ground-Based Mid-Wave Hyperspectral Imaging Measurement in the Northwestern Plateau Region of Qinghai, China.
Remote. Sens., August, 2023

Deep autoregressive generative models capture the intrinsics embedded in T-cell receptor repertoires.
Briefings Bioinform., March, 2023

TEINet: a deep learning framework for prediction of TCR-epitope binding specificity.
Briefings Bioinform., March, 2023

DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

The Xiaomi-ASLP Text-to-speech System for Blizzard Challenge 2023.
Proceedings of the 18th Blizzard Challenge Workshop, Grenoble, France, August 29, 2023

VITS-Based Singing Voice Conversion Leveraging Whisper and Multi-Scale F0 Modeling.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

HIGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Adversarial VAE with Normalizing Flows for Multi-Dimensional Classification.
Proceedings of the Pattern Recognition and Computer Vision - 5th Chinese Conference, 2022

2021
BEV-Net: Assessing Social Distancing Compliance by Joint People Localization and Geometric Reasoning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
