Zhifu Gao
Orcid: 0009-0008-5691-7324
According to our database, Zhifu Gao authored at least 33 papers between 2018 and 2026.
Bibliography
2026
Towards Human-Like Interactive Speech Recognition With Agentic Correction and Semantic Evaluation.
CoRR, May, 2026
Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition.
CoRR, April, 2026
SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing.
IEEE J. Sel. Top. Signal Process., January, 2026
2025
CoRR, September, 2025
CoRR, May, 2025
CoRR, April, 2025
InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation.
CoRR, March, 2025
CoRR, January, 2025
Proceedings of the 33rd ACM International Conference on Multimedia, 2025
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
CoRR, 2024
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition.
CoRR, 2024
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens.
CoRR, 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs.
CoRR, 2024
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
2020
Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model.
CoRR, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Improving Aggregation and Loss Function for Better Embedding Learning in End-to-End Speaker Verification System.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018