Zhifu Gao

According to our database, Zhifu Gao authored at least 18 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens.
CoRR, 2024

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs.
CoRR, 2024

MaLa-ASR: Multimedia-Assisted LLM-Based ASR.
CoRR, 2024

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity.
CoRR, 2024

SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT.
CoRR, 2023

FunASR: A Fundamental End-to-End Speech Recognition Toolkit.
CoRR, 2023

Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

FunASR: A Fundamental End-to-End Speech Recognition Toolkit.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Extremely Low Footprint End-to-End ASR System for Smart Device.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020
Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model.
CoRR, 2020

Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

SAN-M: Memory Equipped Self-Attention for End-to-End Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019
An Effective Deep Embedding Learning Architecture for Speaker Verification.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Improving Aggregation and Loss Function for Better Embedding Learning in End-to-End Speaker Verification System.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018
An Improved Deep Embedding Learning Method for Short Duration Speaker Verification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

