Xiang Yin

Orcid: 0000-0003-1324-4277

Affiliations:
  • ByteDance AI Lab, China


According to our database, Xiang Yin authored at least 32 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
MSGCN-ISTL: A multi-scaled self-attention-enhanced graph convolutional network with improved STL decomposition for probabilistic load forecasting.
Expert Syst. Appl., March, 2024

2023
Static-dynamic collaborative graph convolutional network with meta-learning for node-level traffic flow prediction.
Expert Syst. Appl., October, 2023

Spatiotemporal dynamic graph convolutional network for traffic speed forecasting.
Inf. Sci., September, 2023

Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts.
CoRR, 2023

GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech.
CoRR, 2023

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias.
CoRR, 2023

Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis.
CoRR, 2023

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation.
CoRR, 2023

StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation.
CoRR, 2023

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation.
CoRR, 2023

Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions and Prospects.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Emotionally Situated Text-to-Speech Synthesis in User-Agent Conversation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

AudioQR: Deep Neural Audio Watermarks For QR Code.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Virtual Try-On with Pose-Garment Keypoints Guided Inpainting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LiteG2P: A Fast, Light and High Accuracy Model for Grapheme-to-Phoneme Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Direct Speech-to-speech Translation without Textual Annotation using Bottleneck Features.
CoRR, 2022

Unsupervised Video Domain Adaptation: A Disentanglement Perspective.
CoRR, 2022

A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation.
CoRR, 2022

Towards high-fidelity singing voice conversion with acoustic reference and contrastive predictive coding.
Proceedings of the Interspeech 2022, 2022

An Automatic Soundtracking System for Text-to-Speech Audiobooks.
Proceedings of the Interspeech 2022, 2022

Towards Using Clothes Style Transfer for Scenario-Aware Person Video Generation.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech.
CoRR, 2021

Towards Realistic Visual Dubbing with Heterogeneous Sources.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Fine-Grained Prosody Modeling in Neural Speech Synthesis Using ToBI Representation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Chapter-Wise Understanding System for Text-To-Speech in Chinese Novels.
Proceedings of the IEEE International Conference on Acoustics, 2021

PPG-Based Singing Voice Conversion with Adversarial Representation Learning.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech.
CoRR, 2020

A Hybrid Text Normalization System Using Multi-Head Self-Attention For Mandarin.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Unified Sequence-to-Sequence Front-End Model for Mandarin Text-to-Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Xiaomingbot: A Multilingual Robot News Reporter.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

