Songjun Cao

According to our database, Songjun Cao authored at least 19 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data.
CoRR, June, 2025

MPE-TTS: Customized Emotion Zero-Shot Text-To-Speech Using Multi-Modal Prompt.
CoRR, May, 2025

Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception.
CoRR, April, 2025

Neural Codec Source Tracing: Toward Comprehensive Attribution in Open-Set Condition.
CoRR, January, 2025

DiffCSS: Diverse and Expressive Conversational Speech Synthesis with Diffusion Models.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

M-MoE: Mixture of Mixture-of-Expert Model for CTC-based Streaming Multilingual ASR.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition.
CoRR, 2024

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2023
DistillW2V2: A Small and Streaming Wav2vec 2.0 Based ASR Model.
CoRR, 2023

2022
A practical framework for multi-domain speech recognition and an instance sampling method to neural language modeling.
CoRR, 2022

Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving CTC-Based Speech Recognition Via Knowledge Transferring from Pre-Trained Language Models.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model.
CoRR, 2021

Improving Speech Recognition Accuracy of Local POI Using Geographical Models.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Explore wav2vec 2.0 for Mispronunciation Detection.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-Supervised Learning.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Streaming Transformer Based ASR Under a Framework of Self-Supervised Learning.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improving Hybrid CTC/Attention End-to-End Speech Recognition with Pretrained Acoustic and Language Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Multi-head Monotonic Chunkwise Attention For Online Speech Recognition.
CoRR, 2020

