Songjun Cao

According to our database¹, Songjun Cao authored at least 20 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data.

CoRR, June, 2025

Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception.

CoRR, April, 2025

Neural Codec Source Tracing: Toward Comprehensive Attribution in Open-Set Condition.

CoRR, January, 2025

SonarGuard2: Ultrasonic Face Liveness Detection Based on Adaptive Doppler Effect Feature Extraction.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Monotonic Attention for Robust Text-to-Speech Synthesis in Large Language Model Frameworks.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

MPE-TTS: Customized Emotion Zero-Shot Text-To-Speech Using Multi-Modal Prompt.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

M-MoE: Mixture of Mixture-of-Expert Model for CTC-based Streaming Multilingual ASR.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2023

DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model.

CoRR, 2023

2022

A practical framework for multi-domain speech recognition and an instance sampling method to neural language modeling.

CoRR, 2022

Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving CTC-Based Speech Recognition Via Knowledge Transferring from Pre-Trained Language Models.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model.

CoRR, 2021

Improving Speech Recognition Accuracy of Local POI Using Geographical Models.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Explore wav2vec 2.0 for Mispronunciation Detection.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-Supervised Learning.

Keqi Deng, Songjun Cao, Long Ma

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Streaming Transformer Based ASR Under a Framework of Self-Supervised Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Hybrid CTC/Attention End-to-End Speech Recognition with Pretrained Acoustic and Language Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Multi-head Monotonic Chunkwise Attention For Online Speech Recognition.

CoRR, 2020
