Xuanjun Chen

ORCID: 0009-0002-5930-3797

According to our database¹, Xuanjun Chen authored at least 25 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

How Does Instrumental Music Help SingFake Detection?

Sung-Feng Huang

Jyh-Shing Roger Jang

CoRR, September, 2025

Localizing Audio-Visual Deepfakes via Hierarchical Boundary Modeling.

Shih-Peng Cheng

Jyh-Shing Roger Jang

CoRR, August, 2025

Exploring State-Space-Model based Language Model in Music Generation.

Fang-Chih Hsieh

CoRR, July, 2025

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment.

CoRR, July, 2025

Towards Generalized Source Tracing for Codec-Based Deepfake Speech.

Jyh-Shing Roger Jang

CoRR, June, 2025

A Preliminary Exploration with GPT-4o Voice Mode.

CoRR, February, 2025

CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset.

Jyh-Shing Roger Jang

CoRR, January, 2025

Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy.

Jyh-Shing Roger Jang

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks.

Wei-Cheng Tseng

Fabian Alejandro Ritter Gutierrez

et al.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

Building a Taiwanese Mandarin Spoken Language Model: A First Attempt.

CoRR, 2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks.

Wei-Cheng Tseng

Fabian Ritter Gutierrez

Chung-Ming Chien

Cheng-Hsiu Hsieh

Heitor R. Guimarães

Kuan-Yu Fang Chiang

Shao-Syuan Huang

Shih-Yun Shan Kuan

Chao-Han Huck Yang

Shao-Xiang Yuan

Shinji Watanabe

CoRR, 2024

Singing Voice Graph Modeling for SingFake Detection.

Jyh-Shing Roger Jang

CoRR, 2024

Towards audio language modeling - an overview.

Alexander H. Liu

CoRR, 2024

Codec-Superb @ SLT 2024: A Lightweight Benchmark For Neural Audio Codec Models.

Alexander H. Liu

Shinji Watanabe

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset.

Jyh-Shing Roger Jang

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Singer Separation for Karaoke Content Generation.

Jyh-Shing Roger Jang

Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024

Singing Voice Graph Modeling for SingFake Detection.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Neural Codec-based Adversarial Sample Detection for Speaker Verification.

Jyh-Shing Roger Jang

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Multimodal Transformer Distillation for Audio-Visual Synchronization.

Jyh-Shing Roger Jang

Proceedings of the IEEE International Conference on Acoustics, 2024

Codec-SUPERB: An In-Depth Analysis of Sound Codec Models.

Hsiu-Hsuan Wang

Alexander H. Liu

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Intelligent Directing System for Music Concert Scene Based on Visual and Auditory Information.

Proceedings of the 2023 ACM International Conference on Interactive Media Experiences Workshops, 2023

2022

Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection.

Jyh-Shing Roger Jang

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

2021

Singer separation for karaoke content generation.

Jyh-Shing Roger Jang

CoRR, 2021

2020

Enterprise financial management information system based on cloud computing in big data environment.

J. Intell. Fuzzy Syst., 2020
