Xilin Jiang

Orcid: 0009-0000-9373-0851

According to our database, Xilin Jiang authored at least 19 papers between 2013 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Listen, Chat, and Remix: Text-Guided Soundscape Remixing for Enhanced Auditory Experience.
IEEE J. Sel. Top. Signal Process., May, 2025

Bridging Ears and Eyes: Analyzing Audio and Visual Large Language Models to Humans in Visible Sound Recognition and Reducing Their Sensory Gap via Cross-Modal Distillation.
CoRR, May, 2025

ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior.
CoRR, May, 2025

Exploring Finetuned Audio-LLM on Heart Murmur Features.
CoRR, January, 2025

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

AAD-LLM: Neural Attention-Driven Auditory Scene Understanding.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation.
CoRR, 2024

Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience.
CoRR, 2024

Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify And Understand Speaker in Spoken Dialogue.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

SSAMBA: Self-Supervised Audio Representation Learning With Mamba State Space Model.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform.
CoRR, 2023

DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Phoneme-Level Bert for Enhanced Prosody of Text-To-Speech with Grapheme Predictions.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022
Compute and Memory Efficient Universal Sound Source Separation.
J. Signal Process. Syst., 2022

Learning Representations for New Sound Classes With Continual Self-Supervised Learning.
IEEE Signal Process. Lett., 2022

2013
Demonstration of broadband inter-modal four-wave mixing in graded-index few-mode fibers.
Proceedings of the 2013 Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2013

