Daxin Tan

According to our database, Daxin Tan authored at least 20 papers between 2020 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM.

CoRR, May, 2026

PROST-LLM: Progressively Enhancing the Speech-to-Speech Translation Capability in LLMs.

CoRR, January, 2026

AEQ-Bench: Measuring Empathy of Omni-Modal Large Models.

CoRR, January, 2026

DSA-Tokenizer: Disentangled Semantic-Acoustic Tokenization via Flow Matching-based Hierarchical Fusion.

CoRR, January, 2026

2025

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Enhancing Code-switched Text-to-Speech Synthesis Capability in Large Language Models with only Monolingual Corpora.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions.

CoRR, 2024

Enhancing Multilingual Speech Generation and Recognition Abilities in LLMs with Constructed Code-switched Data.

CoRR, 2024

Exploring SSL Discrete Tokens for Multilingual ASR.

CoRR, 2024

ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis.

CoRR, 2024

2022

Analysis and Utilization of Entrainment on Acoustic and Emotion Features in User-agent Dialogue.

Daxin Tan

Nikos Kargas

David McHardy

Constantinos Papayiannis

Antonio Bonafonte

Marek Strelec

Jonas Rohnke

Agis Oikonomou-Filandras

Trevor Wood

CoRR, 2022

CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Environment Aware Text-to-Speech Synthesis.

Daxin Tan

Guangyan Zhang

Tan Lee

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Study on the Efficacy of Model Pre-Training In Developing Neural Text-to-Speech System.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

CUHK-EE voice cloning system for ICASSP 2021 M2VoC challenge.

CoRR, 2021

Applying the Information Bottleneck Principle to Prosodic Representation Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Fine-Grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement.

Daxin Tan

Tan Lee

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Fine-grained style modelling and transfer in text-to-speech synthesis via content-style disentanglement.

Daxin Tan

Tan Lee

CoRR, 2020
