Zhisheng Zheng

Orcid: 0000-0001-7761-9790

According to our database, Zhisheng Zheng authored at least 18 papers between 2022 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence.

CoRR, October, 2025

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.

CoRR, May, 2025

Scaling Rich Style-Prompted Text-to-Speech Datasets.

CoRR, March, 2025

DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024

Improving Emotion Recognition with Pre-Trained Models, Multimodality, and Contextual Information.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

EAT: Self-Supervised Pre-Training with Efficient Audio Transformer.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

BAT: Learning to Reason about Spatial Sounds with Large Language Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Noniterative f-x-y Streaming Prediction Filtering for Random Noise Attenuation on Seismic Data.

Yang Liu

Zhisheng Zheng

IEEE Trans. Geosci. Remote. Sens., 2022
