Zhisheng Zheng

ORCID: 0000-0001-7761-9790

According to our database, Zhisheng Zheng authored at least 17 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.
CoRR, May, 2025

Scaling Rich Style-Prompted Text-to-Speech Datasets.
CoRR, March, 2025

DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024
Improving Emotion Recognition with Pre-Trained Models, Multimodality, and Contextual Information.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

EAT: Self-Supervised Pre-Training with Efficient Audio Transformer.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

BAT: Learning to Reason about Spatial Sounds with Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Fast-Hubert: an Efficient Training Framework for Self-Supervised Speech Representation Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Noniterative f-x-y Streaming Prediction Filtering for Random Noise Attenuation on Seismic Data.
IEEE Trans. Geosci. Remote. Sens., 2022

