Dongchao Yang

Orcid: 0000-0002-8905-224X

According to our database, Dongchao Yang authored at least 34 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models.
CoRR, 2024

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Diffsound: Discrete Diffusion Model for Text-to-Sound Generation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Consistent and Relevant: Rethink the Query Embedding in General Sound Separation.
CoRR, 2023

DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction.
CoRR, 2023

UniAudio: An Audio Foundation Model Toward Universal Audio Generation.
CoRR, 2023

PromptTTS 2: Describing and Generating Voices with Text Prompt.
CoRR, 2023

Make-A-Voice: Unified Voice Synthesis With Discrete Representation.
CoRR, 2023

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation.
CoRR, 2023

HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec.
CoRR, 2023

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head.
CoRR, 2023

InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt.
CoRR, 2023

Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models.
Proceedings of the International Conference on Machine Learning, 2023

Improving Text-Audio Retrieval by Text-Aware Attention Pooling and Prior Matrix Revised Loss.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Improving Weakly Supervised Sound Event Detection with Causal Intervention.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS.
CoRR, 2022

A Two-student Learning Framework for Mixed Supervised Target Sound Detection.
CoRR, 2022

A Mobile Robot Design for Efficient and Large-Scale Solar Panel Cleaning.
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2022

Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches.
Proceedings of the Interspeech 2022, 2022

Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction.
Proceedings of the Interspeech 2022, 2022

RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection.
Proceedings of the Interspeech 2022, 2022

Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification.
Proceedings of the Interspeech 2022, 2022

Improving Target Sound Extraction with Timestamp Information.
Proceedings of the Interspeech 2022, 2022

A Mutual Learning Framework for Few-Shot Sound Event Detection.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

A Mixed Supervised Learning Framework For Target Sound Detection.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

Detect What You Want: Target Sound Detection.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

Omnidirectional Motion Control Method of Quadruped Robot Based on 3D-CPG Oscillator Group.
Proceedings of the Robotics in Natural Settings, 2022

2021
Detect what you want: Target Sound Detection.
CoRR, 2021

Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

YOLOv3 with Asymmetric Intersection over Union Based Loss Function for Human Detection.
Proceedings of the ICMLSC '21: 2021 The 5th International Conference on Machine Learning and Soft Computing, 2021

Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

2020
Towards Data Distillation for End-to-end Spoken Conversational Question Answering.
CoRR, 2020

A petal-array capacitive tactile sensor with micro-pin for robotic fingertip sensing.
Proceedings of the 3rd IEEE International Conference on Soft Robotics, 2020

