Xin Jing

Jiadong Wang

Iosif Tsangko

CoRR, May, 2025

STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition.

IEEE Trans. Affect. Comput., 2025

Audio-Based Kinship Verification Using Age Domain Conversion.

IEEE Signal Process. Lett., 2025

Vishing: Detecting social engineering in spoken communication - A first survey & urgent roadmap to address an emerging societal challenge.

Comput. Speech Lang., 2025

MADUV: The 1st INTERSPEECH Mice Autism Detection via Ultrasound Vocalization Challenge.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models.

Kun Zhou

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024

DB3V: A Dialect Dominated Dataset of Bird Vocalisation for Cross-corpus Bird Species Recognition.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

ParaCLAP - Towards a general language-audio model for computational paralinguistic tasks.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Temporal Oriented ResNet for Gaming Dimensional Emotion Prediction.

Emilia Parada-Cabaleiro

Yoshiharu Yamamoto

Proceedings of the 32nd European Signal Processing Conference, 2024

2023

HEAR4Health: a blueprint for making computer audition a staple of modern healthcare.

Frontiers Digit. Health, May, 2023

U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech.

CoRR, 2023

Daily Mental Health Monitoring from Speech: A Real-World Japanese Dataset and Multitask Learning Analysis.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Audio self-supervised learning: A survey.

Shuo Liu

Adria Mallol-Ragolta

Emilia Parada-Cabaleiro

Patterns, 2022

Dynamic Restrained Uncertainty Weighting Loss for Multitask Learning of Vocal Expression.

CoRR, 2022

Redundancy Reduction Twins Network: A Training framework for Multi-output Emotion Regression.

CoRR, 2022

Exploring speaker enrolment for few-shot personalisation in emotional vocalisation prediction.

CoRR, 2022

An Overview & Analysis of Sequence-to-Sequence Emotional Voice Conversion.

Ilhan Aslan

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Temporal-oriented Broadcast ResNet for COVID-19 Detection.

Shuo Liu

Emilia Parada-Cabaleiro