Xinhao Mei

Orcid: 0000-0001-6079-5130

According to our database, Xinhao Mei authored at least 31 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Exploring Audio Hallucination in Egocentric Video Understanding.

CoRR, April, 2026

EgoAVU: Egocentric Audio-Visual Understanding.

CoRR, February, 2026

SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training.

CoRR, January, 2026

2025

Lightweight and Generalizable Acoustic Scene Representations via Contrastive Fine-Tuning and Distillation.

Saeed Bagheri Sereshki

CoRR, October, 2025

Enhanced audio-based fish feeding intensity recognition via decomposed visually-guided cross-modality distillation.

Comput. Electron. Agric., 2025

MASV: Speaker Verification with Global and Local Context Mamba.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

2024

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Towards Generating Diverse Audio Captions via Adversarial Training.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

AudioLDM 2: Learning Holistic Audio Generation With Self-Supervised Pretraining.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text.

CoRR, 2024

Data Efficient Reflow for Few Step Audio Generation.

Raghuraman Krishnamoorthi, Wei-Ning Hsu, Yangyang Shi, Vikas Chandra

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Foleygen: Visually-Guided Audio Generation.

Proceedings of the 34th IEEE International Workshop on Machine Learning for Signal Processing, 2024

First-Shot Unsupervised Anomalous Sound Detection with Unknown Anomalies Estimated by Metadata-Assisted Audio Generation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Enhance audio generation controllability through representation similarity regularization.

CoRR, 2023

Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Ontology-aware Learning and Evaluation for Audio Tagging.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models.

Proceedings of the International Conference on Machine Learning, 2023

Simple Pooling Front-Ends for Efficient Audio Classification.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Automated audio captioning: an overview of recent progress and new challenges.

EURASIP J. Audio Speech Music. Process., 2022

Automated Audio Captioning via Fusion of Low- and High- Dimensional Features.

CoRR, 2022

Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning.

CoRR, 2022

On Metric Learning for Audio-Text Cross-Modal Retrieval.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Separate What You Describe: Language-Queried Audio Source Separation.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Diverse Audio Captioning Via Adversarial Training.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Deep Neural Decision Forest for Acoustic Scene Classification.

Proceedings of the 30th European Signal Processing Conference, 2022

Leveraging Pre-trained BERT for Audio Captioning.

Proceedings of the 30th European Signal Processing Conference, 2022

Segment-Level Metric Learning for Few-Shot Bioacoustic Event Detection.

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021

Audio Captioning Transformer.

Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

An Encoder-Decoder Based Audio Captioning System with Transfer and Reinforcement Learning.

Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

CL4AC: A Contrastive Loss for Audio Captioning.

Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021
