Björn W. Schuller

ORCID: 0000-0002-6478-8699

Affiliations:
  • Imperial College London, GLAM, UK
  • University of Augsburg, Department of Computer Science, Germany
  • University of Passau, Faculty of Computer Science and Mathematics, Germany (former)


According to our database, Björn W. Schuller authored at least 994 papers between 2001 and 2024.

Awards

ACM Fellow

ACM Fellow 2023, "For empirical and theoretical contributions to the development of computer audition, affective computing, and health informatics".

Bibliography

2024
Multi-view domain-adaptive representation learning for EEG-based emotion recognition.
Inf. Fusion, April, 2024

Meet the authors: Georgios Rizos, Jenna L. Lawson, and Björn W. Schuller.
Patterns, March, 2024

Propagating variational model uncertainty for bioacoustic call label smoothing.
Patterns, March, 2024

COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

Introducing the COVID-19 YouTube (COVYT) speech dataset featuring the same speakers with and without infection.
Biomed. Signal Process. Control., February, 2024

LEPCNet: A Lightweight End-to-End PCG Classification Neural Network Model for Wearable Devices.
IEEE Trans. Instrum. Meas., 2024

emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition.
CoRR, 2024

On Prompt Sensitivity of ChatGPT in Affective Computing.
CoRR, 2024

STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition.
CoRR, 2024

Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition.
CoRR, 2024

Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition.
CoRR, 2024

Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation.
CoRR, 2024

Coupling Sentiment and Arousal Analysis Towards an Affective Dialogue Manager.
IEEE Access, 2024

2023
Human-aligned trading by imitative multi-loss reinforcement learning.
Expert Syst. Appl., December, 2023

Battling with the low-resource condition for snore sound recognition: introducing a meta-learning strategy.
EURASIP J. Audio Speech Music. Process., December, 2023

Guest Editorial Trustworthy and Collaborative AI for Personalised Healthcare Through Edge-of-Things.
IEEE J. Biomed. Health Informatics, November, 2023

Zero-shot personalization of speech foundation models for depressed mood monitoring.
Patterns, November, 2023

An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era.
Proc. IEEE, October, 2023

Affective Computing [Scanning the Issue].
Proc. IEEE, October, 2023

A weakly supervised spatial group attention network for fine-grained visual recognition.
Appl. Intell., October, 2023

Dawn of the Transformer Era in Speech Emotion Recognition: Closing the Valence Gap.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

End-to-End Video-to-Speech Synthesis Using Generative Adversarial Networks.
IEEE Trans. Cybern., June, 2023

Classification of stuttering - The ComParE challenge and beyond.
Comput. Speech Lang., June, 2023

Ethical Awareness in Paralinguistics: A Taxonomy of Applications.
Int. J. Hum. Comput. Interact., May, 2023

HEAR4Health: a blueprint for making computer audition a staple of modern healthcare.
Frontiers Digit. Health, May, 2023

Can a Holistic View Facilitate the Development of Intelligent Traditional Chinese Medicine? A Survey.
IEEE Trans. Comput. Soc. Syst., April, 2023

Exploring interpretable representations for heart sound abnormality detection.
Biomed. Signal Process. Control., April, 2023

Editorial: Human-centred computer audition: sound, music, and healthcare.
Frontiers Digit. Health, March, 2023

A summary of the ComParE COVID-19 challenges.
Frontiers Digit. Health, March, 2023

Intelligent Music Intervention for Mental Disorders: Insights and Perspectives.
IEEE Trans. Comput. Soc. Syst., February, 2023


Speech Synthesis With Mixed Emotions.
IEEE Trans. Affect. Comput., 2023

Emotion Intensity and its Control for Emotional Voice Conversion.
IEEE Trans. Affect. Comput., 2023

Guest Editorial Neurosymbolic AI for Sentiment Analysis.
IEEE Trans. Affect. Comput., 2023

The Multimodal Sentiment Analysis in Car Reviews (MuSe-CaR) Dataset: Collection, Insights and Improvements.
IEEE Trans. Affect. Comput., 2023

Dual Attention and Element Recalibration Networks for Automatic Depression Level Prediction.
IEEE Trans. Affect. Comput., 2023

Multitask Learning From Augmented Auxiliary Data for Improving Speech Emotion Recognition.
IEEE Trans. Affect. Comput., 2023

Self Supervised Adversarial Domain Adaptation for Cross-Corpus and Cross-Language Speech Emotion Recognition.
IEEE Trans. Affect. Comput., 2023

Survey of Deep Representation Learning for Speech Emotion Recognition.
IEEE Trans. Affect. Comput., 2023

EmoNet: A Transfer Learning Framework for Multi-Corpus Speech Emotion Recognition.
IEEE Trans. Affect. Comput., 2023

Audio-Visual Gated-Sequenced Neural Networks for Affect Recognition.
IEEE Trans. Affect. Comput., 2023

Guest Editorial: Special Issue on Affective Speech and Language Synthesis, Generation, and Conversion.
IEEE Trans. Affect. Comput., 2023

Speech Denoising and Compensation for Hearing Aids Using an FTCRN-Based Metric GAN.
IEEE Signal Process. Lett., 2023

Automated composition of Galician Xota - tuning RNN-based composers for specific musical styles using deep Q-learning.
PeerJ Comput. Sci., 2023

Multistage linguistic conditioning of convolutional layers for speech emotion recognition.
Frontiers Comput. Sci., 2023

Computational charisma - A brick by brick blueprint for building charismatic artificial intelligence.
Frontiers Comput. Sci., 2023

Can ChatGPT's Responses Boost Traditional Natural Language Processing?
IEEE Intell. Syst., 2023

Will Affective Computing Emerge From Foundation Models and General Artificial Intelligence? A First Evaluation of ChatGPT.
IEEE Intell. Syst., 2023

Testing Speech Emotion Recognition Machine Learning Models.
CoRR, 2023

Customising General Large Language Models for Specialised Emotion Recognition Tasks.
CoRR, 2023

Bringing the Discussion of Minima Sharpness to the Audio Domain: a Filter-Normalised Evaluation for Acoustic Scene Classification.
CoRR, 2023

Synthia's Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio.
CoRR, 2023

Task Selection and Assignment for Multi-modal Multi-task Dialogue Act Classification with Non-stationary Multi-armed Bandits.
CoRR, 2023

Exploring Meta Information for Audio-based Zero-shot Bird Classification.
CoRR, 2023

A Wide Evaluation of ChatGPT on Affective Computing Tasks.
CoRR, 2023

Sparks of Large Audio Models: A Survey and Outlook.
CoRR, 2023

Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model.
CoRR, 2023

Refashioning Emotion Recognition Modelling: The Advent of Generalised Large Models.
CoRR, 2023

Can Large Language Models Aid in Annotating Speech Emotional Data? Uncovering New Frontiers.
CoRR, 2023

Going Retro: Astonishingly Simple Yet Effective Rule-based Prosody Modelling for Speech Synthesis Simulating Emotion Dimensions.
CoRR, 2023

Speech-based Age and Gender Prediction with Transformers.
CoRR, 2023

Improving Speech Emotion Recognition Performance using Differentiable Architecture Search.
CoRR, 2023

Happy or Evil Laughter? Analysing a Database of Natural Audio Samples.
CoRR, 2023

U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech.
CoRR, 2023

A Preliminary Study on Augmenting Speech Emotion Recognition using a Diffusion Model.
CoRR, 2023

The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation.
CoRR, 2023

MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning.
CoRR, 2023

Will Affective Computing Emerge from Foundation Models and General AI? A First Evaluation on ChatGPT.
CoRR, 2023

audb - Sharing and Versioning of Audio and Annotation Data in Python.
CoRR, 2023

A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era.
CoRR, 2023

Toward Detecting and Addressing Corner Cases in Deep Learning Based Medical Image Segmentation.
IEEE Access, 2023

Analysing Breathing Patterns in Reading and Spontaneous Speech.
Proceedings of the Speech and Computer - 25th International Conference, 2023

The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation.
Proceedings of the 4th on Multimodal Sentiment Analysis Challenge and Workshop: Mimicked Emotions, 2023

Personalised Speech-Based Heart Rate Categorisation Using Weighted-Instance Learning.
Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports, 2023

An Overview of the ICASSP Special Session on AI Security and Privacy in Speech and Audio Processing.
Proceedings of the ACM Multimedia Asia Workshops, 2023

The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

MRAC'23: 1st International Workshop on Multimodal and Responsible Affective Computing.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

"Do touch!" - 3D Scanning and Printing Technologies for the Haptic Representation of Cultural Assets: A Study with Blind Target Users.
Proceedings of the 5th Workshop on analySis, 2023

MuSe 2023 Challenge: Multimodal Prediction of Mimicked Emotions, Cross-Cultural Humour, and Personalised Recognition of Affects.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Integrating Generative Artificial Intelligence in Intelligent Vehicle Systems.
Proceedings of the 25th IEEE International Conference on Intelligent Transportation Systems, 2023

Explainable Stuttering Recognition Using Axial Attention.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2023

SMILENets: Audio Representation Learning via Neural Knowledge Distillation of Traditional Audio-Feature Extractors.
Proceedings of the 8th International Conference on Frontiers of Signal Processing, 2023

An End-to-End Model for Mental Disorders Detection by Spontaneous Physical Activity Data.
Proceedings of the IEEE International Conference on Data Mining, 2023

Crossmodal Transformer on Multi-Physical Signals for Personalised Daily Mental Health Prediction.
Proceedings of the IEEE International Conference on Data Mining, 2023

An Investigation on Data Augmentation and Multiple Instance Learning for Diagnosis of COVID-19 from Speech and Cough Sound.
Proceedings of the International Conference on Consumer Electronics - Taiwan, 2023

Hierarchical Network with Decoupled Knowledge Distillation for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Federated Intelligent Terminals Facilitate Stuttering Monitoring.
Proceedings of the IEEE International Conference on Acoustics, 2023

Zero-Shot Speech Emotion Recognition Using Generative Learning with Reconstructed Prototypes.
Proceedings of the IEEE International Conference on Acoustics, 2023

Large-Scale Nonverbal Vocalization Detection Using Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Daily Mental Health Monitoring from Speech: A Real-World Japanese Dataset and Multitask Learning Analysis.
Proceedings of the IEEE International Conference on Acoustics, 2023

Positive-Pair Redundancy Reduction Regularisation for Speech-Based Asthma Diagnosis Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2023

Fast Yet Effective Speech Emotion Recognition with Self-Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2023

COVID-19 Detection from Speech in Noisy Conditions.
Proceedings of the IEEE International Conference on Acoustics, 2023

Hearttoheart: The Arts of Infant Versus Adult-Directed Speech Classification.
Proceedings of the IEEE International Conference on Acoustics, 2023

Knowledge Transfer for on-Device Speech Emotion Recognition With Neural Structured Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

Masking Speech Contents by Random Splicing: is Emotional Expression Preserved?
Proceedings of the IEEE International Conference on Acoustics, 2023

Audio Barlow Twins: Self-Supervised Audio Representation Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

AMNet: Introducing an Adaptive Mel-Spectrogram End-to-End Neural Network for Heart Sound Classification.
Proceedings of the IEEE International Conference on E-health Networking, 2023

An End-to-End Model for Speech-based Somatisation Disorder Detection.
Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023

Snore Sound Recognition via an Explainable Capsule Network.
Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023

Multi-Track Music Generation with WGAN-GP and Attention Mechanisms.
Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023

Applying Speech Derived Breathing Patterns to Automatically Classify Human Confidence.
Proceedings of the 31st European Signal Processing Conference, 2023

Less is More: A Novel Feature Extraction Method for Heart Sound Classification via Fractal Transformation.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Cutting Weights of Deep Learning Models for Heart Sound Classification: Introducing a Knowledge Distillation Approach.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

A novel and simple approach to regularise attention frameworks and its efficacy in segmentation.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

NeuroCellCentreDB: Exploring a Novel Dataset for Neuron-like Cell Centre Detection with Deep Neural Networks.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

How does Music Affect Your Brain? A Pilot Study on EEG and Music Features for Automatic Analysis.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Deep Modelling Strategies for Human Confidence Classification using Audio-visual Data.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Noise Robust Recognition of Depression Status and Treatment Response from Speech via Unsupervised Feature Aggregation.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Automatic Breathing Pattern Analysis from Reading-Speech Signals.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Somatisation Disorder Detection via Speech: Introducing a Self-Supervised Learning Model.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Universal Lesion Detection Utilising Cascading R-CNNs and a Novel Video Pretraining Method.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

2022
Introducing the COVID-19 YouTube (COVYT) speech dataset featuring the same speakers with and without infection.
Dataset, September, 2022

Guest Editorial: Introduction to the Special Section on Efficient Network Design for Convergence of Deep Learning and Edge Computing.
IEEE Trans. Netw. Sci. Eng., 2022

Exploring Zero-Shot Emotion Recognition in Speech Using Semantic-Embedding Prototypes.
IEEE Trans. Multim., 2022

Learning Multimodal Representations for Drowsiness Detection.
IEEE Trans. Intell. Transp. Syst., 2022

Capturing Time Dynamics From Speech Using Neural Networks for Surgical Mask Detection.
IEEE J. Biomed. Health Informatics, 2022

Selective Element and Two Orders Vectorization Networks for Automatic Depression Severity Diagnosis via Facial Changes.
IEEE Trans. Circuits Syst. Video Technol., 2022

Rethinking Auditory Affective Descriptors Through Zero-Shot Emotion Recognition in Speech.
IEEE Trans. Comput. Soc. Syst., 2022

Digital Mental Health - Breaking a Lance for Prevention.
IEEE Trans. Comput. Soc. Syst., 2022

COVID-19's Impact on Mental Health - The Hour of Computational Aid?
IEEE Trans. Comput. Soc. Syst., 2022

Psychological Field Versus Physiological Field: From Qualitative Analysis to Quantitative Modeling of the Mental Status.
IEEE Trans. Comput. Soc. Syst., 2022

Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Holistic Affect Recognition Using PaNDA: Paralinguistic Non-Metric Dimensional Analysis.
IEEE Trans. Affect. Comput., 2022

Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition.
IEEE Trans. Affect. Comput., 2022

Ethics and Good Practice in Computational Paralinguistics.
IEEE Trans. Affect. Comput., 2022

Face mask recognition from audio: The MASC database and an overview on the mask challenge.
Pattern Recognit., 2022

Fitbeat: COVID-19 estimation based on wristband heart rate using a contrastive convolutional auto-encoder.
Pattern Recognit., 2022

AI-Based human audio processing for COVID-19: A comprehensive overview.
Pattern Recognit., 2022

Audio self-supervised learning: A survey.
Patterns, 2022

Affective Image Content Analysis: Two Decades Review and New Perspectives.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

MEDAS: an open-source platform as a service to help break the walls between medicine and informatics.
Neural Comput. Appl., 2022

Editorial: Intelligent Signal Analysis for Contagious Virus Diseases.
IEEE J. Sel. Top. Signal Process., 2022

Correction to: The perception of emotional cues by children in artificial background noise.
Int. J. Speech Technol., 2022

DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing From Decentralized Data.
Frontiers Artif. Intell., 2022

Future-generation personality prediction from digital footprints.
Future Gener. Comput. Syst., 2022

Is Speech the New Blood? Recent Progress in AI-Based Disease Detection From Audio in a Nutshell.
Frontiers Digit. Health, 2022

Personalised depression forecasting using mobile sensor data and ecological momentary assessment.
Frontiers Digit. Health, 2022

Voice Analysis for Neurological Disorder Recognition-A Systematic Review and Perspective on Emerging Trends.
Frontiers Digit. Health, 2022

Evaluating the COVID-19 Identification ResNet (CIdeR) on the INTERSPEECH COVID-19 From Audio Challenges.
Frontiers Digit. Health, 2022

An Estimation of Online Video User Engagement From Features of Time- and Value-Continuous, Dimensional Emotions.
Frontiers Comput. Sci., 2022

Evaluating the Impact of Voice Activity Detection on Speech Emotion Recognition for Autistic Children.
Frontiers Comput. Sci., 2022

Outer Product-Based Fusion of Smartwatch Sensor Data for Human Activity Recognition.
Frontiers Comput. Sci., 2022

A Cross-Corpus Speech-Based Analysis of Escalating Negative Interactions.
Frontiers Comput. Sci., 2022

Child and Youth Affective Computing - Challenge Accepted.
IEEE Intell. Syst., 2022

A Survey on Client Throughput Prediction Algorithms in Wired and Wireless Networks.
ACM Comput. Surv., 2022

Automatic Emotion Modelling in Written Stories.
CoRR, 2022

Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19.
CoRR, 2022

Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers.
CoRR, 2022

A large-scale and PCR-referenced vocal audio dataset for COVID-19.
CoRR, 2022

AI-Based Emotion Recognition: Promise, Peril, and Prescriptions for Prosocial Path.
CoRR, 2022

Proceedings of the ACII Affective Vocal Bursts Workshop and Competition 2022 (A-VB): Understanding a critically understudied modality of emotional expression.
CoRR, 2022

Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results.
CoRR, 2022

Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts.
CoRR, 2022

Proceedings of the ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts.
CoRR, 2022

The ACII 2022 Affective Vocal Bursts Workshop & Competition: Understanding a critically understudied modality of emotional expression.
CoRR, 2022

Are 3D Face Shapes Expressive Enough for Recognising Continuous Emotions and Action Unit Intensities?
CoRR, 2022

Dynamic Restrained Uncertainty Weighting Loss for Multitask Learning of Vocal Expression.
CoRR, 2022

COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection.
CoRR, 2022

Redundancy Reduction Twins Network: A Training framework for Multi-output Emotion Regression.
CoRR, 2022

Exploring speaker enrolment for few-shot personalisation in emotional vocalisation prediction.
CoRR, 2022

The ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts.
CoRR, 2022

Continuous-Time Audiovisual Fusion with Recurrence vs. Attention for In-The-Wild Affect Recognition.
CoRR, 2022

Audiovisual Affect Assessment and Autonomous Automobiles: Applications.
CoRR, 2022

Climate Change & Computer Audition: A Call to Action and Overview on Audio Intelligence to Help Save the Planet.
CoRR, 2022

Robust Federated Learning Against Adversarial Attacks for Speech Emotion Recognition.
CoRR, 2022

HEAR 2021: Holistic Evaluation of Audio Representations.
CoRR, 2022

Predicting Sex and Stroke Success - Computer-aided Player Grunt Analysis in Tennis Matches.
CoRR, 2022

Normalise for Fairness: A Simple Normalisation Technique for Fairness in Regression Machine Learning Problems.
CoRR, 2022

The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

A Personalised Approach to Audiovisual Humour Recognition and its Individual-level Fairness.
Proceedings of the MuSe@MM 2022: Proceedings of the 3rd International on Multimodal Sentiment Analysis Workshop and Challenge, 2022

Improving Exertion and Wellbeing Prediction in Outdoor Running Conditions using Audio-based Surface Recognition.
Proceedings of the MMSports@MM 2022: Proceedings of the 5th International ACM Workshop on Multimedia Content Analysis in Sports, 2022

The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress.
Proceedings of the MuSe@MM 2022: Proceedings of the 3rd International on Multimodal Sentiment Analysis Workshop and Challenge, 2022

MuSe 2022 Challenge: Multimodal Humour, Emotional Reactions, and Stress.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

A Comparative Cross Language View On Acted Databases Portraying Basic Emotions Utilising Machine Learning.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Nkululeko: A Tool For Rapid Speaker Characteristics Detection.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Heart Sound Classification based on Fractional Fourier Transformation Entropy.
Proceedings of the 4th IEEE Global Conference on Life Sciences and Technologies, 2022

Online Personalisation of Deep Mobile Activity Recognisers.
Proceedings of the 7th International Workshop on Sensor-based Activity Recognition and Artificial Intelligence, 2022

Data Augmentation for Dementia Detection in Spoken Language.
Proceedings of the Interspeech 2022, 2022

Distinguishing between pre- and post-treatment in the speech of patients with chronic obstructive pulmonary disease.
Proceedings of the Interspeech 2022, 2022

Probing speech emotion recognition transformers for linguistic knowledge.
Proceedings of the Interspeech 2022, 2022

SVTS: Scalable Video-to-Speech Synthesis.
Proceedings of the Interspeech 2022, 2022

Multi-Type Outer Product-Based Fusion of Respiratory Sounds for Detecting COVID-19.
Proceedings of the Interspeech 2022, 2022

Cross-Layer Similarity Knowledge Distillation for Speech Enhancement.
Proceedings of the Interspeech 2022, 2022

Example-based Explanations with Adversarial Attacks for Respiratory Sound Analysis.
Proceedings of the Interspeech 2022, 2022

Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning.
Proceedings of the Interspeech 2022, 2022

An Overview & Analysis of Sequence-to-Sequence Emotional Voice Conversion.
Proceedings of the Interspeech 2022, 2022

Quantifying Cognitive Load from Voice using Transformer-Based Models and a Cross-Dataset Evaluation.
Proceedings of the 21st IEEE International Conference on Machine Learning and Applications, 2022

A Glance-and-Gaze Network for Respiratory Sound Classification.
Proceedings of the IEEE International Conference on Acoustics, 2022

Convoluational Transformer With Adaptive Position Embedding For Covid-19 Detection From Cough Sounds.
Proceedings of the IEEE International Conference on Acoustics, 2022

An Overview of the FIRST ICASSP Special Session on Computer Audition for Healthcare.
Proceedings of the IEEE International Conference on Acoustics, 2022

Heart Sound Classification based on Residual Shrinkage Networks.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

CoughLIME: Sonified Explanations for the Predictions of COVID-19 Cough Classifiers.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Fatigue Prediction in Outdoor Running Conditions using Audio Data.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Insights on Modelling Physiological, Appraisal, and Affective Indicators of Stress using Audio Features.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Novel no-reference multi-dimensional perceptual similarity metric.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Novel Insights on Induced Sparsity in Multi-Time Attention Networks.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

A Federated Learning Paradigm for Heart Sound Classification.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Triplet Loss-Based Models for COVID-19 Detection from Vocal Sounds.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

EEG Emotion Recognition Based on Self-attention Dynamic Graph Neural Networks.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

CNN-Based Heart Sound Classification with an Imbalance-Compensating Weighted Loss Function.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Journaling Data for Daily PHQ-2 Depression Prediction and Forecasting.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Depression Diagnosis and Forecast based on Mobile Phone Sensor Data.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Time-Continuous Audiovisual Fusion with Recurrence vs Attention for In-The-Wild Affect Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

COVID-19 Detection Exploiting Self-Supervised Learning Representations of Respiratory Sounds.
Proceedings of the IEEE-EMBS International Conference on Biomedical and Health Informatics, 2022

A Temporal-oriented Broadcast ResNet for COVID-19 Detection.
Proceedings of the IEEE-EMBS International Conference on Biomedical and Health Informatics, 2022

A Novel Policy for Pre-trained Deep Reinforcement Learning for Speech Emotion Recognition.
Proceedings of the ACSW 2022: Australasian Computer Science Week 2022, Brisbane, Australia, February 14, 2022

The ACII 2022 Affective Vocal Bursts Workshop & Competition.
Proceedings of the 10th International Conference on Affective Computing and Intelligent Interaction, ACII 2022, 2022

Ist Stimme das neue Blut? KI und Stimmbiomarker zu früheren Diagnose - für jedermann, überall und jederzeit.
Proceedings of the Künstliche Intelligenz im Gesundheitswesen: Entwicklungen, 2022

2021
Self-attention transfer networks for speech emotion recognition.
Virtual Real. Intell. Hardw., 2021

Frustration recognition from speech during game interaction using wide residual networks.
Virtual Real. Intell. Hardw., 2021

Predictable Robots for Autistic Children - Variance in Robot Behaviour, Idiosyncrasies in Autistic Children's Characteristics, and Child-Robot Engagement.
ACM Trans. Comput. Hum. Interact., 2021

CAA-Net: Conditional Atrous CNNs With Attention for Explainable Device-Robust Acoustic Scene Classification.
IEEE Trans. Multim., 2021

Can Machine Learning Assist Locating the Excitation of Snore Sound? A Review.
IEEE J. Biomed. Health Informatics, 2021

The Detection of Parkinson's Disease From Speech Using Voice Source Information.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Guided Generative Adversarial Neural Network for Representation Learning and Audio Generation Using Fewer Labelled Audio Data.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

A Deep Adaptation Network for Speech Enhancement: Combining a Relativistic Discriminator With Multi-Kernel Maximum Mean Discrepancy.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

An Online Robot Collision Detection and Identification Scheme by Supervised Learning and Bayesian Decision Theory.
IEEE Trans Autom. Sci. Eng., 2021

EmoBed: Strengthening Monomodal Emotion Recognition via Training with Crossmodal Emotion Embeddings.
IEEE Trans. Affect. Comput., 2021

Intelligent Signal Processing for Affective Computing [From the Guest Editors].
IEEE Signal Process. Mag., 2021

Artificial Intelligence Internet of Things for the Elderly: From Assisted Living to Health-Care Monitoring.
IEEE Signal Process. Mag., 2021

Deep Learning for Mobile Mental Health: Challenges and recent advances.
IEEE Signal Process. Mag., 2021

SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Combining a parallel 2D CNN with a self-attention Dilated Residual Network for CTC-based discrete speech emotion recognition.
Neural Networks, 2021

N-HANS: A neural network-based toolkit for in-the-wild audio enhancement.
Multim. Tools Appl., 2021

Computer Audition for Fighting the SARS-CoV-2 Corona Crisis - Introducing the Multitask Speech Corpus for COVID-19.
IEEE Internet Things J., 2021

Can Appliances Understand the Behavior of Elderly Via Machine Learning? A Feasibility Study.
IEEE Internet Things J., 2021

End-to-end multimodal affect recognition in real-world environments.
Inf. Fusion, 2021

Introduction to the Special Issue on MMAC: Multimodal Affective Computing of Large-Scale Multimedia Data.
IEEE Multim., 2021

Internet of emotional people: Towards continual affective computing cross cultures via audiovisual signals.
Future Gener. Comput. Syst., 2021

COVID-19 and Computer Audition: An Overview on What Speech & Sound Analysis Could Contribute in the SARS-CoV-2 Corona Crisis.
Frontiers Digit. Health, 2021

CovNet: A Transfer Learning Framework for Automatic COVID-19 Detection From Crowd-Sourced Cough Sounds.
Frontiers Digit. Health, 2021

Editorial: Ethical Machine Learning and Artificial Intelligence.
Frontiers Big Data, 2021

A Deep Audiovisual Approach for Human Confidence Classification.
Frontiers Comput. Sci., 2021

An Evaluation of Speech-Based Recognition of Emotional and Physiological Markers of Stress.
Frontiers Comput. Sci., 2021

Sentiment Analysis and Topic Recognition in Video Transcriptions.
IEEE Intell. Syst., 2021

Learning audio sequence representations for acoustic event classification.
Expert Syst. Appl., 2021

Capturing dynamics of post-earnings-announcement drift using a genetic algorithm-optimized XGBoost.
Expert Syst. Appl., 2021

Conversational Agent as Trustworthy Autonomous System (Trust-CA) (Dagstuhl Seminar 21381).
Dagstuhl Reports, 2021

Representation transfer learning from deep end-to-end speech recognition networks for the classification of health states from speech.
Comput. Speech Lang., 2021

Facial Emotion Recognition using Deep Residual Networks in Real-World Environments.
CoRR, 2021

EIHW-MTG: Second DiCOVA Challenge System Report.
CoRR, 2021

EIHW-MTG DiCOVA 2021 Challenge System Report.
CoRR, 2021

A Machine Learning Framework for Automatic Prediction of Human Semen Motility.
CoRR, 2021

The EIHW-GLAM Deep Attentive Multi-model Fusion System for Cough-based COVID-19 Recognition in the DiCOVA 2021 Challenge.
CoRR, 2021

An Estimation of Online Video User Engagement from Features of Continuous Emotions.
CoRR, 2021

Unsupervised Graph-based Topic Modeling from Video Transcriptions.
CoRR, 2021

DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing from Decentralised Data.
CoRR, 2021

On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era.
CoRR, 2021

Fitbeat: COVID-19 Estimation based on Wristband Heart Rate.
CoRR, 2021

Remote smartphone-based speech collection: acceptance and barriers in individuals with major depressive disorder.
CoRR, 2021

Computational Emotion Analysis From Images: Recent Advances and Future Directions.
CoRR, 2021

End-2-End COVID-19 Detection from Breath & Cough Audio.
CoRR, 2021

Deep Attention-based Representation Learning for Heart Sound Classification.
CoRR, 2021

Personalized Federated Deep Learning for Pain Estimation From Face Images.
CoRR, 2021

Exploring Perception Uncertainty for Emotion Recognition in Dyadic Conversation and Music Listening.
Cogn. Comput., 2021

Robot-Based Intervention for Children With Autism Spectrum Disorder: A Systematic Literature Review.
IEEE Access, 2021

An Enhanced Adversarial Network with Combined Latent Features for Spatio-temporal Facial Affect Estimation in the Wild.
Proceedings of the 16th International Joint Conference on Computer Vision, 2021

Emotion Recognition in Public Speaking Scenarios Utilising An LSTM-RNN Approach with Attention.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Identifying surgical-mask speech using deep neural networks on low-level aggregation.
Proceedings of the SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing, 2021

Towards an Efficient Deep Learning Model for Emotion and Theme Recognition in Music.
Proceedings of the 23rd International Workshop on Multimedia Signal Processing, 2021

Evaluating Deep Music Generation Methods Using Data Augmentation.
Proceedings of the 23rd International Workshop on Multimedia Signal Processing, 2021

MuSe-Toolbox: The Multimodal Sentiment Analysis Continuous Annotation Fusion and Discrete Class Transformation Toolbox.
Proceedings of the MuSe '21: Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge, 2021

MuSe 2021 Challenge: Multimodal Emotion, Sentiment, Physiological-Emotion, and Stress Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

The MuSe 2021 Multimodal Sentiment Analysis Challenge: Sentiment, Emotion, Physiological-Emotion, and Stress.
Proceedings of the MuSe '21: Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge, 2021

A Physiologically-Adapted Gold Standard for Arousal during Stress.
Proceedings of the MuSe '21: Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge, 2021

Recent Advances in Computer Audition for Diagnosing COVID-19: An Overview.
Proceedings of the 3rd IEEE Global Conference on Life Sciences and Technologies, 2021

Comparison of Automatic Speech Recognition Systems.
Proceedings of the Conversational AI for Natural Human-Centric Interaction, 2021

Automatic Recognition of Texture in Renaissance Music.
Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021

Coughing-Based Recognition of Covid-19 with Spatial Attentive ConvLSTM Recurrent Neural Networks.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multi-Attentive Detection of the Spider Monkey Whinny in the (Actual) Wild.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Cough-Based COVID-19 Detection with Contextual Attention Convolutional Neural Networks and Gender Information.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Recognising Covid-19 from Coughing Using Ensembles of SVMs and LSTMs with Handcrafted and Deep Audio Features.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Speaking Corona? Human and Machine Recognition of COVID-19 from Voice.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmentation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Remote Smartphone-Based Speech Collection: Acceptance and Barriers in Individuals with Major Depressive Disorder.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

The DiCOVA 2021 Challenge - An Encoder-Decoder Approach for COVID-19 Recognition from Coughing Audio.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Prototypical Network Approach for Evaluating Generated Emotional Speech.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

LiRA: Learning Visual Speech Representations from Audio Through Self-Supervision.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Deep Learning Post-Earnings-Announcement Drift.
Proceedings of the International Joint Conference on Neural Networks, 2021

AI Hears Your Health: Computer Audition for Health Monitoring.
Proceedings of the ICT for Health, Accessibility and Wellbeing, 2021

Towards Sonification in Multimodal and User-friendly Explainable Artificial Intelligence.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

ASMMC21: The 6th International Workshop on Affective Social Multimedia Computing.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

Deep speaker conditioning for speech emotion recognition.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Predicting Group Work Performance from Physical Handwriting Features in a Smart English Classroom.
Proceedings of the ICDSP 2021: 5th International Conference on Digital Signal Processing, 2021

Speech Emotion Recognition Using Semantic Information.
Proceedings of the IEEE International Conference on Acoustics, 2021

The Role of Task and Acoustic Similarity in Audio Transfer Learning: Insights from the Speech Emotion Recognition Case.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Novel Attention-Based Gated Recurrent Unit and its Efficacy in Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Hierarchical Attention-Based Temporal Convolutional Networks for Eeg-Based Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Supervised Contrastive Learning for Game-Play Frustration Detection from Speech.
Proceedings of the Universal Access in Human-Computer Interaction. Design Methods and User Experience, 2021

harAGE: A Novel Multimodal Smartwatch-based Dataset for Human Activity Recognition.
Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition, 2021

Sensing the Sounds of Silence: A Pilot Study on the Detection of Model Mice of Autism Spectrum Disorder from Ultrasonic Vocalisations.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

The Filtering Effect of Face Masks in their Detection from Speech.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

COVID-19 Detection with a Novel Multi-Type Deep Fusion Method using Breathing and Coughing Information.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

Transferring Cross-Corpus Knowledge: An Investigation on Data Augmentation for Heart Sound Classification.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

Deformable Dilated Faster R-CNN for Universal Lesion Detection in CT Images.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

COVID-19 Biomarkers in Speech: On Source and Filter Components.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

Transformer-based CNNs: Mining Temporal Context Information for Multi-sound COVID-19 Diagnosis.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

Fairness and Underspecification in Acoustic Scene Classification: The Case for Disaggregated Evaluations.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

User Experience for Multi-Device Ecosystems: Challenges and Opportunities.
Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks.
Proceedings of the 34th IEEE International Symposium on Computer-Based Medical Systems, 2021

GraphTMT: Unsupervised Graph-based Topic Modeling from Video Transcripts.
Proceedings of the Seventh IEEE International Conference on Multimedia Big Data, 2021

Uncertainty Aware Review Hallucination for Science Article Classification.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Embracing and Exploiting Annotator Emotional Subjectivity: An Affective Rater Ensemble Model.
Proceedings of the 2021 9th International Conference on Affective Computing and Intelligent Interaction, 2021

2020
Snore-GANs: Improving Automatic Snore Sound Classification With Synthesized Data.
IEEE J. Biomed. Health Informatics, 2020

Machine Listening for Heart Status Monitoring: Introducing and Benchmarking HSS - The Heart Sounds Shenzhen Corpus.
IEEE J. Biomed. Health Informatics, 2020

Guest Editorial Special Issue on Adversarial Learning in Computational Intelligence.
IEEE Trans. Emerg. Top. Comput. Intell., 2020

A Generic Human-Machine Annotation Framework Based on Dynamic Cooperative Learning.
IEEE Trans. Cybern., 2020

"Are You Playing a Shooter Again?!" Deep Representation Learning for Audio-Based Video Game Genre Recognition.
IEEE Trans. Games, 2020

Exploiting time-frequency patterns with LSTM-RNNs for low-bitrate audio restoration.
Neural Comput. Appl., 2020

Validity of machine learning in biology and medicine increased through collaborations across fields of expertise.
Nat. Mach. Intell., 2020

DEMoS: an Italian emotional speech corpus.
Lang. Resour. Evaluation, 2020

eXplainable Cooperative Machine Learning with NOVA.
Künstliche Intell., 2020

Analysis of loss functions for fast single-class classification.
Knowl. Inf. Syst., 2020

Automatic Assessment of Depression From Speech via a Hierarchical Attention Transfer Network and Attention Autoencoders.
IEEE J. Sel. Top. Signal Process., 2020

I see it in your eyes: Training the shallowest-possible CNN to recognise emotions and pain from muted web-assisted in-the-wild video-chats in real-time.
Inf. Process. Manag., 2020

The perception of emotional cues by children in artificial background noise.
Int. J. Speech Technol., 2020

Computer Audition for Healthcare: Opportunities and Challenges.
Frontiers Digit. Health, 2020

Five Crucial Challenges in Digital Health.
Frontiers Digit. Health, 2020

Considerations for a More Ethical Approach to Data in AI: On Data Representation and Infrastructure.
Frontiers Big Data, 2020

Towards cross-modal pre-training and learning tempo-spatial characteristics for audio recognition with convolutional and recurrent neural networks.
EURASIP J. Audio Speech Music. Process., 2020

Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks.
CoRR, 2020

The voice of COVID-19: Acoustic correlates of infection.
CoRR, 2020

Audio, Speech, Language, & Signal Processing for COVID-19: A Comprehensive Overview.
CoRR, 2020

Capturing dynamics of post-earnings-announcement drift using genetic algorithm-optimised supervised learning.
CoRR, 2020

MeDaS: An open-source platform as service to help break the walls between medicine and informatics.
CoRR, 2020

Go-CaRD - Generic, Optical Car Part Recognition and Detection: Collection, Insights, and Applications.
CoRR, 2020

Deep Reinforcement Learning with Pre-training for Time-efficient Training of Automatic Speech Recognition.
CoRR, 2020

A Novel Fusion of Attention and Sequence to Sequence Autoencoders to Predict Sleepiness From Speech.
CoRR, 2020

An Overview on Audio, Signal, Speech, & Language Processing for COVID-19.
CoRR, 2020

On Deep Speech Packet Loss Concealment: A Mini-Survey.
CoRR, 2020

ConcealNet: An End-to-end Neural Network for Packet Loss Concealment in Deep Speech Emotion Recognition.
CoRR, 2020

"I have vxxx bxx connexxxn!": Facing Packet Loss in Deep Speech Emotion Recognition.
CoRR, 2020

deepSELF: An Open Source Deep Self End-to-End Learning Framework.
CoRR, 2020

MuSe 2020 - The First International Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop.
CoRR, 2020

Cross-lingual Zero- and Few-shot Hate Speech Detection Utilising Frozen Transformer Language Models and AXEL.
CoRR, 2020

Guided Generative Adversarial Neural Network for Representation Learning and High Fidelity Audio Generation using Fewer Labelled Audio Data.
CoRR, 2020

Adversarial-based neural networks for affect estimations in the wild.
CoRR, 2020

Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends.
CoRR, 2020

Classification of Lung Nodules Based on Deep Residual Networks and Migration Learning.
Comput. Intell. Neurosci., 2020

High-Fidelity Audio Generation and Representation Learning With Guided Adversarial Autoencoder.
IEEE Access, 2020

Laughter as a Controller in a Stress Buster Game.
Proceedings of the PervasiveHealth '20: 14th EAI International Conference on Pervasive Computing Technologies for Healthcare, 2020

An Evolutionary-based Generative Approach for Audio Data Augmentation.
Proceedings of the 22nd IEEE International Workshop on Multimedia Signal Processing, 2020

Summary of MuSe 2020: Multimodal Sentiment Analysis, Emotion-target Engagement and Trustworthiness Detection in Real-life Media.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

MuSe 2020 Challenge and Workshop: Multimodal Sentiment Analysis, Emotion-target Engagement and Trustworthiness Detection in Real-life Media: Emotional Car Reviews in-the-wild.
Proceedings of the MuSe'20: Proceedings of the 1st International on Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop, 2020

Unsupervised Representation Learning with Attention and Sequence to Sequence Autoencoders to Predict Sleepiness From Speech.
Proceedings of the MuSe'20: Proceedings of the 1st International on Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop, 2020

Emotion and Theme Recognition in Music Using Attention-Based Methods.
Proceedings of the Working Notes Proceedings of the MediaEval 2020 Workshop, 2020

Emotion and Themes Recognition in Music with Convolutional and Recurrent Attention-Blocks.
Proceedings of the Working Notes Proceedings of the MediaEval 2020 Workshop, 2020

Average Jane, Where Art Thou? - Recent Avenues in Efficient Machine Learning Under Subjectivity Uncertainty.
Proceedings of the Information Processing and Management of Uncertainty in Knowledge-Based Systems, 2020

Hybrid Network Feature Extraction for Depression Assessment from Speech.
Proceedings of the Interspeech 2020, 2020

Adventitious Respiratory Classification Using Attentive Residual Neural Networks.
Proceedings of the Interspeech 2020, 2020

Computer Audition for Continuous Rainforest Occupancy Monitoring: The Case of Bornean Gibbons' Call Detection.
Proceedings of the Interspeech 2020, 2020

Uncertainty-Aware Machine Support for Paper Reviewing on the Interspeech 2019 Submission Corpus.
Proceedings of the Interspeech 2020, 2020

The INTERSPEECH 2020 Computational Paralinguistics Challenge: Elderly Emotion, Breathing & Masks.
Proceedings of the Interspeech 2020, 2020

Enhancing Transferability of Black-Box Adversarial Attacks via Lifelong Learning for Speech Emotion Recognition Models.
Proceedings of the Interspeech 2020, 2020

An Investigation of Cross-Cultural Semi-Supervised Learning for Continuous Affect Recognition.
Proceedings of the Interspeech 2020, 2020

Deep Attentive End-to-End Continuous Breath Sensing from Speech.
Proceedings of the Interspeech 2020, 2020

Towards Speech Robustness for Acoustic Scene Classification.
Proceedings of the Interspeech 2020, 2020

Deep Architecture Enhancing Robustness to Noise, Adversarial Attacks, and Cross-Corpus Setting for Speech Emotion Recognition.
Proceedings of the Interspeech 2020, 2020

Augmenting Generative Adversarial Networks for Speech Emotion Recognition.
Proceedings of the Interspeech 2020, 2020

Learning Higher Representations from Pre-Trained Deep Models with Data Augmentation for the COMPARE 2020 Challenge Mask Task.
Proceedings of the Interspeech 2020, 2020

Towards Silent Paralinguistics: Deriving Speaking Mode and Speaker ID from Electromyographic Signals.
Proceedings of the Interspeech 2020, 2020

A Comparison of Acoustic and Linguistics Methodologies for Alzheimer's Dementia Recognition.
Proceedings of the Interspeech 2020, 2020

Toward Silent Paralinguistics: Speech-to-EMG - Retrieving Articulatory Muscle Activity from Speech.
Proceedings of the Interspeech 2020, 2020

An Evaluation of the Effect of Anxiety on Speech - Computational Prediction of Anxiety from Sustained Vowels.
Proceedings of the Interspeech 2020, 2020

Squeeze for Sneeze: Compact Neural Networks for Cold and Flu Recognition.
Proceedings of the Interspeech 2020, 2020

An Early Study on Intelligent Analysis of Speech Under COVID-19: Severity, Sleep Quality, Fatigue, and Anxiety.
Proceedings of the Interspeech 2020, 2020

Hierarchical Component-attention Based Speaker Turn Embedding for Emotion Recognition.
Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

Exploring Spatial-Temporal Representations for fNIRS-based Intimacy Detection via an Attention-enhanced Cascade Convolutional Recurrent Neural Network.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

X-AWARE: ConteXt-AWARE Human-Environment Attention Fusion for Driver Gaze Prediction in the Wild.
Proceedings of the ICMI '20: International Conference on Multimodal Interaction, 2020

Group-level Speech Emotion Recognition Utilising Deep Spectrum Features.
Proceedings of the ICMI '20: International Conference on Multimodal Interaction, 2020

Hierarchical Attention Transfer Networks for Depression Assessment from Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Stargan for Emotional Speech Conversion: Validated by Data Augmentation of End-To-End Emotion Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Generating and Protecting Against Adversarial Attacks for Deep Speech-Based Emotion Recognition Models.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Ordinal Learning for Emotion Recognition in Customer Service Calls.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Interaction with the Soundscape: Exploring Emotional Audio Generation for Improved Individual Wellbeing.
Proceedings of the Artificial Intelligence in HCI, 2020

Synthesising 3D Facial Motion from "In-the-Wild" Speech.
Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition, 2020

A Curriculum Learning Approach for Pain Intensity Recognition from Facial Expressions.
Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition, 2020

Latent-Based Adversarial Neural Networks for Facial Affect Estimations.
Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition, 2020

Audio for Audio is Better? An Investigation on Transfer Learning Models for Heart Sound Classification.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

2019
Humans Inside: Cooperative Big Multimedia Data Mining.
Proceedings of the Innovations in Big Data Mining and Embedded Knowledge, 2019

Dynamic Difficulty Awareness Training for Continuous Emotion Prediction.
IEEE Trans. Multim., 2019

Connecting Subspace Learning and Extreme Learning Machine in Speech Emotion Recognition.
IEEE Trans. Multim., 2019

Guest Editorial Intelligence in Serious Games.
IEEE Trans. Games, 2019

The ASC-Inclusion Perceptual Serious Gaming Platform for Autistic Children.
IEEE Trans. Games, 2019

Speech Emotion Classification Using Attention-Based LSTM.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

IEEE Transactions on Affective Computing-On Novelty and Valence.
IEEE Trans. Affect. Comput., 2019

Responding to uncertainty in emotion recognition.
J. Inf. Commun. Ethics Soc., 2019

Deep Affect Prediction in-the-Wild: Aff-Wild Database and Challenge, Deep Architectures, and Beyond.
Int. J. Comput. Vis., 2019

Large-scale Data Collection and Analysis via a Gamified Intelligent Crowdsourcing Platform.
Int. J. Autom. Comput., 2019

Synchronization in Interpersonal Speech.
Frontiers Robotics AI, 2019

Affective and behavioural computing: Lessons learnt from the First Computational Paralinguistics Challenge.
Comput. Speech Lang., 2019

N-HANS: Introducing the Augsburg Neuro-Holistic Audio-eNhancement System.
CoRR, 2019

Pre-training in Deep Reinforcement Learning for Automatic Speech Recognition.
CoRR, 2019

Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions.
CoRR, 2019

On Laughter and Speech-Laugh, Based on Observations of Child-Robot Interaction.
CoRR, 2019

Presenting the Acoustic Sounds for Wellbeing Dataset and Baseline Classification Results.
CoRR, 2019

Single-Channel Speech Separation with Auxiliary Speaker Embeddings.
CoRR, 2019

A Comparison of Online Automatic Speech Recognition Systems and the Nonverbal Responses to Unintelligible Speech.
CoRR, 2019

Voice command generation using Progressive Wavegans.
CoRR, 2019

Responsible and Representative Multimodal Data Acquisition and Analysis: On Auditability, Benchmarking, Confidence, Data-Reliance & Explainability.
CoRR, 2019

On Many-to-Many Mapping Between Concordance Correlation Coefficient and Mean Square Error.
CoRR, 2019

SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild.
CoRR, 2019

Microexpressions: A Chance for Computers to Beat Humans at Detecting Hidden Emotions?
Computer, 2019

Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives [Review Article].
IEEE Comput. Intell. Mag., 2019

Exploring Deep Spectrum Representations via Attention-Based Recurrent and Convolutional Neural Networks for Speech Emotion Recognition.
IEEE Access, 2019

From Speech to Facial Activity: Towards Cross-modal Sequence-to-Sequence Attention Networks.
Proceedings of the 21st IEEE International Workshop on Multimedia Signal Processing, 2019

Can Deep Generative Audio be Emotional? Towards an Approach for Personalised Emotional Audio Generation.
Proceedings of the 21st IEEE International Workshop on Multimedia Signal Processing, 2019

Predicting Biological Signals from Speech: Introducing a Novel Multimodal Dataset and Results.
Proceedings of the 21st IEEE International Workshop on Multimedia Signal Processing, 2019

AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition.
Proceedings of the 9th International on Audio/Visual Emotion Challenge and Workshop, 2019

AVEC'19: Audio/Visual Emotion Challenge and Workshop.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Emotion and Themes Recognition in Music Utilising Convolutional and Recurrent Neural Networks.
Proceedings of the Working Notes Proceedings of the MediaEval 2019 Workshop, 2019

A Comparison of AI-Based Throughput Prediction for Cellular Vehicle-To-Server Communication.
Proceedings of the 15th International Wireless Communications & Mobile Computing Conference, 2019

Deep Wavelets for Heart Sound Classification.
Proceedings of the 2019 International Symposium on Intelligent Signal Processing and Communication Systems, 2019

A Diplomatic Edition of Il Lauro Secco: Ground Truth for OMR of White Mensural Notation.
Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019

Automatic Detection of Major Depressive Disorder via a Bag-of-Behaviour-Words Approach.
Proceedings of the Third International Symposium on Image Computing and Digital Medicine, 2019

Attention-Enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition.
Proceedings of the Interspeech 2019, 2019

Autonomous Emotion Learning in Speech: A View of Zero-Shot Speech Emotion Recognition.
Proceedings of the Interspeech 2019, 2019

Towards Robust Speech Emotion Recognition Using Deep Residual Networks for Speech Enhancement.
Proceedings of the Interspeech 2019, 2019

The INTERSPEECH 2019 Computational Paralinguistics Challenge: Styrian Dialects, Continuous Sleepiness, Baby Sounds & Orca Activity.
Proceedings of the Interspeech 2019, 2019

Continuous Emotion Recognition in Speech - Do We Need Recurrence?
Proceedings of the Interspeech 2019, 2019

Robust Speech Emotion Recognition Under Different Encoding Conditions.
Proceedings of the Interspeech 2019, 2019

A Hierarchical Attention Network-Based Approach for Depression Detection from Transcribed Clinical Interviews.
Proceedings of the Interspeech 2019, 2019

Speech Augmentation via Speaker-Specific Noise in Unseen Environment.
Proceedings of the Interspeech 2019, 2019

Sincerity in Acted Speech: Presenting the Sincere Apology Corpus and Results.
Proceedings of the Interspeech 2019, 2019

Using Speech to Predict Sequentially Measured Cortisol Levels During a Trier Social Stress Test.
Proceedings of the Interspeech 2019, 2019

Analysing and Inferring of Intimacy Based on fNIRS Signals and Peripheral Physiological Signals.
Proceedings of the International Joint Conference on Neural Networks, 2019

Audio-based Recognition of Bipolar Disorder Utilising Capsule Networks.
Proceedings of the International Joint Conference on Neural Networks, 2019

A Walkthrough for the Principle of Logit Separation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach.
Proceedings of the International Conference on Multimodal Interaction, 2019

VCMNet: Weakly Supervised Learning for Automatic Infant Vocalisation Maturity Analysis.
Proceedings of the International Conference on Multimodal Interaction, 2019

A Deep Learning Approach for Location Independent Throughput Prediction.
Proceedings of the 2019 IEEE International Conference on Connected Vehicles and Expo, 2019

Context Modelling Using Hierarchical Attention Networks for Sentiment and Self-assessed Emotion Detection in Spoken Narratives.
Proceedings of the IEEE International Conference on Acoustics, 2019

Modelling Sample Informativeness for Deep Affective Computing.
Proceedings of the IEEE International Conference on Acoustics, 2019

Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acoustic Scenes.
Proceedings of the IEEE International Conference on Acoustics, 2019

Implicit Fusion by Joint Audiovisual Training for Emotion Recognition in Mono Modality.
Proceedings of the IEEE International Conference on Acoustics, 2019

Attention-augmented End-to-end Multi-task Learning for Emotion Prediction from Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

Time-series Clustering with Jointly Learning Deep Representations, Clusters and Temporal Boundaries.
Proceedings of the 14th IEEE International Conference on Automatic Face & Gesture Recognition, 2019

Performance Analysis of Unimodal and Multimodal Models in Valence-Based Empathy Recognition.
Proceedings of the 14th IEEE International Conference on Automatic Face & Gesture Recognition, 2019

End-to-end Audio Classification with Small Datasets - Making It Work.
Proceedings of the 27th European Signal Processing Conference, 2019

Automated Classification of Airborne Pollen using Neural Networks.
Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

Snoring - An Acoustic Definition.
Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

Teaching Machines to Know Your Depressive State: On Physical Activity in Health and Major Depressive Disorder.
Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

Multi-instance Learning for Bipolar Disorder Diagnosis using Weakly Labelled Speech Data.
Proceedings of the 9th International Conference on Digital Public Health, 2019

Personalized Estimation of Engagement From Videos Using Active Learning With Deep Reinforcement Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Augment to Prevent: Short-Text Data Augmentation in Deep Learning for Hate-Speech Classification.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

I Know How you Feel Now, and Here's why!: Demystifying Time-Continuous High Resolution Text-Based Affect Predictions in the Wild.
Proceedings of the 32nd IEEE International Symposium on Computer-Based Medical Systems, 2019

Audiovisual Analysis for Recognising Frustration during Game-Play: Introducing the Multimodal Game Frustration Database.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction, 2019

2018
Introduction to the Special Section on Multimedia Computing and Applications of Socio-Affective Behaviors in the Wild.
ACM Trans. Multim. Comput. Commun. Appl., 2018

MixedEmotions: An Open-Source Toolbox for Multimodal Emotion Analysis.
IEEE Trans. Multim., 2018

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments.
ACM Trans. Intell. Syst. Technol., 2018

Guest Editorial Special Issue on Computational Intelligence for End-to-End Audio Processing.
IEEE Trans. Emerg. Top. Comput. Intell., 2018

Semisupervised Autoencoders for Speech Emotion Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Editorial: Transactions on Affective Computing - Good Reasons for Joy and Excitement.
IEEE Trans. Affect. Comput., 2018

Asynchronous and Event-Based Fusion Systems for Affect Recognition on Naturalistic Data in Comparison to Conventional Approaches.
IEEE Trans. Affect. Comput., 2018

A closed-form solution to the graph total variation problem for continuous emotion profiling in noisy environment.
Speech Commun., 2018

Personalized machine learning for robot perception of affect and engagement in autism therapy.
Sci. Robotics, 2018

Deep Canonical Time Warping for Simultaneous Alignment and Representation Learning of Sequences.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

Three recent trends in Paralinguistics on the way to omniscient machine intelligence.
J. Multimodal User Interfaces, 2018

Deep Scalogram Representations for Acoustic Scene Classification.
IEEE CAA J. Autom. Sinica, 2018

Scaling Speech Enhancement in Unseen Environments with Noise Embeddings.
CoRR, 2018

Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives.
CoRR, 2018

audEERING's approach to the One-Minute-Gradual Emotion Challenge.
CoRR, 2018

Applying Cooperative Machine Learning to Speed Up the Annotation of Social Signals in Large Multi-modal Corpora.
CoRR, 2018

End2You - The Imperial Toolkit for Multimodal Profiling by End-to-End Learning.
CoRR, 2018

Weakly Supervised One-Shot Detection with Attention Siamese Networks.
CoRR, 2018

The Age of Artificial Emotional Intelligence.
Computer, 2018

What Affective Computing Reveals about Autistic Children's Facial Expressions of Joy or Fear.
Computer, 2018

Snoring classified: The Munich-Passau Snore Sound Corpus.
Comput. Biol. Medicine, 2018

Speech emotion recognition: two decades in a nutshell, benchmarks, and ongoing trends.
Commun. ACM, 2018

Leveraging Unlabeled Data for Emotion Recognition With Enhanced Collaborative Semi-Supervised Learning.
IEEE Access, 2018

Calibrated Prediction Intervals for Neural Network Regressors.
IEEE Access, 2018

Trustability-Based Dynamic Active Learning for Crowdsourced Labelling of Emotional Audio Data.
IEEE Access, 2018

Analysing communication requirements for crowd sourced backend generation of HD Maps used in automated driving.
Proceedings of the 2018 IEEE Vehicular Networking Conference, 2018

How Good Is Your Model 'Really'? On 'Wildness' of the In-the-Wild Speech-Based Affect Recognisers.
Proceedings of the Speech and Computer - 20th International Conference, 2018

You Sound Like Your Counterpart: Interpersonal Speech Analysis.
Proceedings of the Speech and Computer - 20th International Conference, 2018

Summary for AVEC 2018: Bipolar Disorder and Cross-Cultural Affect Recognition.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

AVEC 2018 Workshop and Challenge: Bipolar Disorder and Cross-Cultural Affect Recognition.
Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop, 2018

ASMMC-MMAC 2018: The Joint Workshop of the 4th Workshop on Affective Social Multimedia Computing and the First Multi-Modal Affective Computing of Large-Scale Multimedia Data Workshop.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Passive monitoring and geo-based prediction of mobile network vehicle-to-server communication.
Proceedings of the 14th International Wireless Communications & Mobile Computing Conference, 2018

Musical-Linguistic Annotations of Il Lauro Secco.
Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018

Identifying Emotions in Opera Singing: Implications of Adverse Acoustic Conditions.
Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018

CultureNet: A Deep Learning Approach for Engagement Intensity Estimation from Face Images of Children with Autism.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Automated Classification of Children's Linguistic versus Non-Linguistic Vocalisations.
Proceedings of the Interspeech 2018, 2018


State of Mind: Classification through Self-reported Affect and Word Use in Speech.
Proceedings of the Interspeech 2018, 2018

How Did You like 2017? Detection of Language Markers of Depression and Narcissism in Personal Narratives.
Proceedings of the Interspeech 2018, 2018

Categorical vs Dimensional Perception of Italian Emotional Speech.
Proceedings of the Interspeech 2018, 2018

Annotator Trustability-based Cooperative Learning Solutions for Intelligent Audio Analysis.
Proceedings of the Interspeech 2018, 2018

Towards Temporal Modelling of Categorical Speech Emotion Recognition.
Proceedings of the Interspeech 2018, 2018

The Perception and Analysis of the Likeability and Human Likeness of Synthesized Speech.
Proceedings of the Interspeech 2018, 2018

Recognition of Echolalic Autistic Child Vocalisations Utilising Convolutional Recurrent Neural Networks.
Proceedings of the Interspeech 2018, 2018

Bags in Bag: Generating Context-Aware Bags for Tracking Emotions from Speech.
Proceedings of the Interspeech 2018, 2018

Evolving Learning for Analysing Mood-Related Infant Vocalisation.
Proceedings of the Interspeech 2018, 2018

Noise Invariant Frame Selection: A Simple Method to Address the Background Noise Problem for Text-independent Speaker Verification.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Bag-of-Deep-Features: Noise-Robust Deep Feature Representations for Audio Analysis.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Affective Image Content Analysis: A Comprehensive Survey.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Emotion-Awareness for Intelligent Vehicle Assistants: A Research Agenda.
Proceedings of the 1st IEEE/ACM International Workshop on Software Engineering for AI in Autonomous Systems, 2018

Deep End-to-End Representation Learning for Food Type Recognition from Speech.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

EAT - The ICMI 2018 Eating Analysis and Tracking Challenge.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Exploring A New Method for Food Likability Rating Based on DT-CWT Theory.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Fast Single-Class Classification and the Principle of Logit Separation.
Proceedings of the IEEE International Conference on Data Mining, 2018

Introducing an Emotion-Driven Assistance System for Cognitively Impaired Individuals.
Proceedings of the Computers Helping People with Special Needs, 2018

End-to-End Speech Emotion Recognition Using Deep Neural Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

What is my Dog Trying to Tell Me? The Automatic Recognition of the Context and Perceived Emotion of Dog Barks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Multimodal Bag-of-Words for Cross Domains Sentiment Analysis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Towards Conditional Adversarial Training for Predicting Emotions from Speech.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

A CNN-GRU Approach to Capture Time-Frequency Pattern Interdependence for Snore Sound Classification.
Proceedings of the 26th European Signal Processing Conference, 2018

A Fusion of Deep Convolutional Generative Adversarial Networks and Sequence to Sequence Autoencoders for Acoustic Scene Classification.
Proceedings of the 26th European Signal Processing Conference, 2018

Low Level Texture Features for Snore Sound Discrimination.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

Deep Unsupervised Representation Learning for Abnormal Heart Sound Classification.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

Learning Image-based Representations for Heart Sound Classification.
Proceedings of the 2018 International Conference on Digital Health, 2018

Robust Laughter Detection for Wearable Wellbeing Sensing.
Proceedings of the 2018 International Conference on Digital Health, 2018

Attention-based convolutional neural networks for acoustic scene classification.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2018

The Perceived Emotion of Isolated Synthetic Audio: The EmoSynth Dataset and Results.
Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion, 2018

Evaluation of the Pain Level from Speech: Introducing a Novel Pain Database and Benchmarks.
Proceedings of the 13th ITG Symposium on Speech Communication, 2018

Multimodal user state and trait recognition: an overview.
Proceedings of the Handbook of Multimodal-Multisensor Interfaces: Foundations, User Modeling, and Common Modality Combinations, 2018

Deep learning for multisensorial and multimodal interaction.
Proceedings of the Handbook of Multimodal-Multisensor Interfaces: Foundations, User Modeling, and Common Modality Combinations, 2018

Perspectives on predictive power of multimodal deep learning: surprises and future directions.
Proceedings of the Handbook of Multimodal-Multisensor Interfaces: Foundations, User Modeling, and Common Modality Combinations, 2018

2017
Acquisition of Affect.
Proceedings of the Emotions and Personality in Personalized Services, 2017

Stacked denoising autoencoders for sentiment analysis: a review.
WIREs Data Mining Knowl. Discov., 2017

Classification of the Excitation Location of Snore Sounds in the Upper Airway by Acoustic Multifeature Analysis.
IEEE Trans. Biomed. Eng., 2017

A Two-Dimensional Framework of Multiple Kernel Subspace Learning for Recognizing Emotion in Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Editorial: IEEE Transactions on Affective Computing - Challenges and Chances.
IEEE Trans. Affect. Comput., 2017

Continuous Estimation of Emotions in Speech by Dynamic Cooperative Speaker Models.
IEEE Trans. Affect. Comput., 2017

Advanced Data Exploitation in Speech Analysis: An overview.
IEEE Signal Process. Mag., 2017

Universum Autoencoder-Based Domain Adaptation for Speech Emotion Recognition.
IEEE Signal Process. Lett., 2017

A Deep Matrix Factorization Method for Learning Attribute Representations.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

End-to-End Multimodal Emotion Recognition Using Deep Neural Networks.
IEEE J. Sel. Top. Signal Process., 2017

openXBOW - Introducing the Passau Open-Source Crossmodal Bag-of-Words Toolkit.
J. Mach. Learn. Res., 2017

auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks.
J. Mach. Learn. Res., 2017

Guest editorial: Multimodal sentiment analysis and mining in the wild.
Image Vis. Comput., 2017

A survey of multimodal sentiment analysis.
Image Vis. Comput., 2017

Strength modelling for real-world automatic continuous affect recognition from audiovisual signals.
Image Vis. Comput., 2017

Measuring Engagement in Robot-Assisted Autism Therapy: A Cross-Cultural Study.
Frontiers Robotics AI, 2017

Learning Audio Sequence Representations for Acoustic Event Classification.
CoRR, 2017

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments.
CoRR, 2017

DeepCoder: Semi-parametric Variational Autoencoders for Facial Action Unit Intensity Estimation.
CoRR, 2017

Can Affective Computing Save Lives? Meet Mobile Health.
Computer, 2017

A Novel Graphical Technique for Combinational Logic Representation and Optimization.
Complex., 2017

Deep Recurrent Neural Network-Based Autoencoders for Acoustic Novelty Detection.
Comput. Intell. Neurosci., 2017

Recognizing Emotions From Whispered Speech Based on Acoustic Feature Transfer Learning.
IEEE Access, 2017

Automatic speaker analysis 2.0: Hearing the bigger picture.
Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2017

Big Data, Deep Learning - At the Edge of X-Ray Speaker Analysis.
Proceedings of the Speech and Computer - 19th International Conference, 2017

Enhancing LSTM RNN-Based Speech Overlap Detection by Artificially Mixed Data.
Proceedings of the AES International Conference Semantic Audio 2017, 2017

AVEC 2017: Real-life Depression, and Affect Recognition Workshop and Challenge.
Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, Mountain View, CA, USA, October 23, 2017

Summary for AVEC 2017: Real-life Depression and Affect Challenge and Workshop.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

From Hard to Soft: Towards more Human-like Emotion Recognition by Modelling the Perception Uncertainty.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

An Image-based Deep Spectrum Feature Representation for the Recognition of Emotional Speech.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

A Paralinguistic Approach To Speaker Diarisation: Using Age, Gender, Voice Likability and Personality Traits.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

The Perception of Emotion in the Singing Voice: The Understanding of Music Mood for Music Organisation.
Proceedings of the 4th International Workshop on Digital Libraries for Musicology, 2017

The SEILS Dataset: Symbolically Encoded Scores in Modern-Early Notation for Computational Musicology.
Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Implementing Gender-Dependent Vowel-Level Analysis for Boosting Speech-Based Depression Recognition.
Proceedings of the Interspeech 2017, 2017


Discussion.
Proceedings of the Interspeech 2017, 2017

Earlier Identification of Children with Autism Spectrum Disorder: An Automatic Vocalisation-Based Approach.
Proceedings of the Interspeech 2017, 2017

The Perception of Emotions in Noisified Nonsense Speech.
Proceedings of the Interspeech 2017, 2017

Emotional Speech of Mentally and Physically Disabled Individuals: Introducing the EmotAsS Database and First Findings.
Proceedings of the Interspeech 2017, 2017

Towards Intelligent Crowdsourcing for Audio Data Annotation: Integrating Active Learning in the Real World.
Proceedings of the Interspeech 2017, 2017

"Did you laugh enough today?" - Deep Neural Networks for Mobile and Wearable Laughter Trackers.
Proceedings of the Interspeech 2017, 2017

An 'End-to-Evolution' Hybrid Approach for Snore Sound Classification.
Proceedings of the Interspeech 2017, 2017

Spotting Social Signals in Conversational Speech over IP: A Deep Learning Perspective.
Proceedings of the Interspeech 2017, 2017

Automatic Classification of Autistic Child Vocalisations: A Novel Database and Results.
Proceedings of the Interspeech 2017, 2017

Snore Sound Classification Using Image-Based Deep Spectrum Features.
Proceedings of the Interspeech 2017, 2017

Cross-Domain Classification of Drowsiness in Speech: The Case of Alcohol Intoxication and Sleep Deprivation.
Proceedings of the Interspeech 2017, 2017

Towards intoxicated speech recognition.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Deep recurrent music writer: Memory-enhanced variational autoencoder-based musical score composition and an objective measure.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Seeking the SuperStar: Automatic assessment of perceived singing quality.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Keynote Lecture 1: NLP in Tomorrow's Profiling - Words May Fail You.
Proceedings of the 14th International Conference on Natural Language Processing, 2017

Stimulation of psychological listener experiences by semi-automatically composed electroacoustic environments.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

End-to-end learning for dimensional emotion recognition from physiological signals.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

DeepCoder: Semi-Parametric Variational Autoencoders for Automatic Facial Action Coding.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Prediction-based learning for continuous emotion recognition in speech.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Reconstruction-error-based learning for continuous emotion recognition in speech.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Automatic multi-lingual arousal detection from voice applied to real product testing applications.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Multi-task deep neural network with shared hidden layers: Breaking down the wall between emotion representations.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A machine learning based system for the automatic evaluation of aphasia speech.
Proceedings of the 19th IEEE International Conference on e-Health Networking, 2017

Detecting Vocal Irony.
Proceedings of the Language Technologies for the Challenges of the Digital Age, 2017

Recognising Guitar Effects - Which Acoustic Features Really Matter?
Proceedings of the 47. Jahrestagung der Gesellschaft für Informatik, 2017

Automatic Guitar String Detection by String-Inverse Frequency Estimation.
Proceedings of the 47. Jahrestagung der Gesellschaft für Informatik, 2017

Snore sound recognition: On wavelets and classifiers from deep nets to kernels.
Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

"You sound ill, take the day off": Automatic recognition of speech affected by upper respiratory tract infection.
Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

Speech-based Diagnosis of Autism Spectrum Condition by Generative Adversarial Network Representations.
Proceedings of the 2017 International Conference on Digital Health, 2017

Contextual Bidirectional Long Short-Term Memory Recurrent Neural Network Language Models: A Generative Approach to Sentiment Analysis.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Deep Sequential Image Features on Acoustic Scene Classification.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2017

Wavelets Revisited for the Classification of Acoustic Scenes.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2017

Sequence to Sequence Autoencoders for Unsupervised Representation Learning from Audio.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2017

Deep Structured Learning for Facial Action Unit Intensity Estimation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Reading the Author and Speaker: Towards a Holistic and Deep Approach on Automatic Assessment of What is in One's Words.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2017

Perception of Paralinguistic Traits in Synthesized Voices.
Proceedings of the 12th International Audio Mostly Conference on Augmented and Participatory Sound and Music Experiences, 2017

Enhancing Speech-Based Depression Detection Through Gender Dependent Vowel-Level Formant Features.
Proceedings of the Artificial Intelligence in Medicine, 2017

Emotion-augmented machine learning: Overview of an emerging domain.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction, 2017

Multimodal multimodel emotion analysis as linked data.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2017

The effect of personality trait, age, and gender on the performance of automatic speech valence recognition.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction, 2017

VoicePlay - An affective sports game operated by speech emotion recognition based on the component process model.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2017

Deep neural networks for anger detection from real life speech data.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2017

CAST a database: Rapid targeted large-scale big data acquisition via small-world modelling of social media platforms.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction, 2017

Feature selection in multimodal continuous emotion prediction.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2017

Sentiment analysis using image-based deep spectrum features.
Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2017

Tunable Sensitivity to Large Errors in Neural Network Training.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Computational Analysis of Vocal Expression of Affect: Trends and Challenges.
Proceedings of the Social Signal Processing, 2017

Automatic Analysis of Social Emotions.
Proceedings of the Social Signal Processing, 2017

Automatic Analysis of Aesthetics: Human Beauty, Attractiveness, and Likability.
Proceedings of the Social Signal Processing, 2017

2016
A Decade of Encouraging Speech Processing "Outside of the Box" - A Foreword.
Proceedings of the Recent Advances in Nonlinear Speech Processing, 2016

Route and Stopping Intent Prediction at Intersections From Car Fleet Data.
IEEE Trans. Intell. Veh., 2016

Editorial: Transactions on Affective Computing - Changes and Continuance.
IEEE Trans. Affect. Comput., 2016

The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing.
IEEE Trans. Affect. Comput., 2016

New avenues in knowledge bases for natural language processing.
Knowl. Based Syst., 2016

Stream fusion for multi-stream automatic speech recognition.
Int. J. Speech Technol., 2016

Using Computer Intelligence for Depression Diagnosis and Crowdsourcing.
Computer, 2016

The Effect of Narrow-Band Transmission on Recognition of Paralinguistic Information From Human Vocalizations.
IEEE Access, 2016

Exploitation of Phase-Based Features for Whispered Speech Emotion Recognition.
IEEE Access, 2016

AVEC 2016: Depression, Mood, and Emotion Recognition Workshop and Challenge.
Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, 2016

Summary for AVEC 2016: Depression, Mood, and Emotion Recognition Workshop and Challenge.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Spectral and Cepstral Audio Noise Reduction Techniques in Speech Emotion Recognition.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Social and Affective Robotics Tutorial.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Tendencies regarding the effect of emotional intensity in inter corpus phoneme-level speech emotion modelling.
Proceedings of the 26th IEEE International Workshop on Machine Learning for Signal Processing, 2016

Introducing the Weighted Trustability Evaluator for Crowdsourcing Exemplified by Speaker Likability Classification.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Assessing the Prosody of Non-Native Speakers of English: Measures and Feature Sets.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Fisher Kernels on Phase-Based Features for Speech Emotion Recognition.
Proceedings of the Dialogues with Social Robots, 2016

Facing Realism in Spontaneous Emotion Recognition from Speech: Feature Enhancement by Autoencoder with LSTM Neural Networks.
Proceedings of the Interspeech 2016, 2016


The INTERSPEECH 2016 Computational Paralinguistics Challenge: A Summary of Results.
Proceedings of the Interspeech 2016, 2016

The Native Language Sub-Challenge: The Data.
Proceedings of the Interspeech 2016, 2016

The Sincerity Sub-Challenge: The Data.
Proceedings of the Interspeech 2016, 2016


The INTERSPEECH 2016 Computational Paralinguistics Challenge: Deception, Sincerity & Native Language.
Proceedings of the Interspeech 2016, 2016

At the Border of Acoustics and Linguistics: Bag-of-Audio-Words for the Recognition of Emotions in Speech.
Proceedings of the Interspeech 2016, 2016

Enhancing Multilingual Recognition of Emotion in Speech by Language Identification.
Proceedings of the Interspeech 2016, 2016

Automatic Analysis of Typical and Atypical Encoding of Spontaneous Emotion in the Voice of Children.
Proceedings of the Interspeech 2016, 2016

Manual versus Automated: The Challenging Routine of Infant Vocalisation Segmentation in Home Videos to Study Neuro(mal)development.
Proceedings of the Interspeech 2016, 2016

Does She Speak RTT? Towards an Earlier Identification of Rett Syndrome Through Intelligent Pre-Linguistic Vocalisation Analysis.
Proceedings of the Interspeech 2016, 2016

Deep Bidirectional Long Short-Term Memory Recurrent Neural Networks for Grapheme-to-Phoneme Conversion Utilizing Complex Many-to-Many Alignments.
Proceedings of the Interspeech 2016, 2016

Real-Time Tracking of Speakers' Emotions, States, and Traits on Mobile Platforms.
Proceedings of the Interspeech 2016, 2016

Convolutional Neural Networks with Data Augmentation for Classifying Speakers' Native Language.
Proceedings of the Interspeech 2016, 2016

Is Deception Emotional? An Emotion-Driven Predictive Approach.
Proceedings of the Interspeech 2016, 2016

Sincerity and Deception in Speech: Two Sides of the Same Coin? A Transfer- and Multi-Task Learning Perspective.
Proceedings of the Interspeech 2016, 2016

Convolutional RNN: An enhanced model for extracting features from sequential data.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Discriminatively Trained Recurrent Neural Networks for Continuous Dimensional Emotion Recognition from Audio.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Driver Frustration Detection from Audio and Video in the Wild.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Detecting road surface wetness from audio: A deep learning approach.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Multiscale kernel locally penalised discriminant analysis exemplified by emotion recognition in speech.
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016


Language proficiency assessment of English L2 speakers based on joint analysis of prosody and native language.
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

Semi-autonomous data enrichment based on cross-task labelling of missing targets for holistic speech analysis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Enhanced semi-supervised learning for multimodal emotion recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Audio watermarking based on empirical mode decomposition and beat detection.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Cross lingual speech emotion recognition using canonical correlation analysis on principal component subspace.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Wavelet features for classification of VOTE snore sounds.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

GPU-based fast signal processing for large amounts of snore sound data.
Proceedings of the IEEE 5th Global Conference on Consumer Electronics, 2016

Pairwise Decomposition with Deep Neural Networks and Multiscale Kernel Subspace Learning for Acoustic Scene Classification.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

Deep Canonical Time Warping.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

SenticNet 4: A Semantic Resource for Sentiment Analysis Based on Conceptual Primitives.
Proceedings of the COLING 2016, 2016

MEC 2016: The Multimodal Emotion Recognition Challenge of CCPR 2016.
Proceedings of the Pattern Recognition - 7th Chinese Conference, 2016

The University of Passau Open Emotion Recognition System for the Multimodal Emotion Challenge.
Proceedings of the Pattern Recognition - 7th Chinese Conference, 2016

Towards Cross-lingual Automatic Diagnosis of Autism Spectrum Condition in Children's Voices.
Proceedings of the 12th ITG Symposium on Speech Communication, 2016

A Bag-of-Audio-Words Approach for Snore Sounds' Excitation Localisation.
Proceedings of the 12th ITG Symposium on Speech Communication, 2016

2015
Emotional Expressions and Daily Cognitive Functions.
Proceedings of the Advances in Neural Networks: Computational and Theoretical Issues, 2015

Sentiment analysis and opinion mining: on optimal parameters and performances.
WIREs Data Mining Knowl. Discov., 2015

Cooperative Learning and its Application to Emotion Recognition from Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data.
Pattern Recognit. Lett., 2015

Introducing CURRENNT: the Munich open-source CUDA recurrent neural network toolkit.
J. Mach. Learn. Res., 2015

Emotion in the singing voice - a deeper look at acoustic features in the light of automatic classification.
EURASIP J. Audio Speech Music. Process., 2015

Introduction.
Comput. Speech Lang., 2015

A Survey on perceived speaker traits: Personality, likability, pathology, and the first challenge.
Comput. Speech Lang., 2015

The ICSTM+TUM+UP Approach to the 3rd CHIME Challenge: Single-Channel LSTM Speech Enhancement with Multi-Channel Correlation Shaping Dereverberation and LSTM Language Models.
CoRR, 2015

Do Computers Have Personality?
Computer, 2015

Speech Analysis in the Big Data Era.
Proceedings of the Text, Speech, and Dialogue - 18th International Conference, 2015

Exploring the Importance of Individual Differences to the Automatic Estimation of Emotions Induced by Music.
Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015

AV+EC 2015: The First Affect Recognition Challenge Bridging Across Audio, Video, and Physiological Data.
Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015

AVEC 2015: The 5th International Audio/Visual Emotion Challenge and Workshop.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

The ICL-TUM-PASSAU Approach for the MediaEval 2015 "Affective Impact of Movies" Task.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Automatically Estimating Emotion in Music with Deep Long-Short Term Memory Recurrent Neural Networks.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Modelling User Affect and Sentiment in Intelligent User Interfaces: A Tutorial Overview.
Proceedings of the 20th International Conference on Intelligent User Interfaces, 2015

IDGEI 2015: 3rd International Workshop on Intelligent Digital Games for Empowerment and Inclusion.
Proceedings of the 20th International Conference on Intelligent User Interfaces, 2015

Dimensionality reduction for speech emotion features by multiscale kernels.
Proceedings of the INTERSPEECH 2015, 2015

The INTERSPEECH 2015 computational paralinguistics challenge: nativeness, Parkinson's & eating condition.
Proceedings of the INTERSPEECH 2015, 2015

Face reading from speech - predicting facial action units from audio cues.
Proceedings of the INTERSPEECH 2015, 2015

Typicality and emotion in the voice of children with autism spectrum condition: evidence across three languages.
Proceedings of the INTERSPEECH 2015, 2015

Does my speech rock? automatic assessment of public speaking skills.
Proceedings of the INTERSPEECH 2015, 2015

Non-linear prediction with LSTM recurrent neural networks for acoustic novelty detection.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Dynamic Active Learning Based on Agreement and Applied to Emotion Recognition in Spoken Interactions.
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, Seattle, WA, USA, November 09, 2015

ERM4CT 2015: Workshop on Emotion Representations and Modelling for Companion Systems.
Proceedings of the International Workshop on Emotion Representations and Modelling for Companion Technologies, 2015

A novel approach for automatic acoustic novelty detection using a denoising autoencoder with bidirectional LSTM neural networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR.
Proceedings of the Latent Variable Analysis and Signal Separation, 2015

Bird sounds classification by large scale acoustic features and extreme learning machine.
Proceedings of the 2015 IEEE Global Conference on Signal and Information Processing, 2015

On rater reliability and agreement based dynamic active learning.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Cross-corpus acoustic emotion recognition: Variances and strategies (Extended abstract).
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Building autonomous sensitive artificial listeners (Extended abstract).
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Detection of negative emotions in speech signals using bags-of-audio-words.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Context-sensitive learning for enhanced audiovisual emotion classification (Extended abstract).
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

iHEARu-PLAY: Introducing a game for crowdsourced data collection for affective computing.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Cross-language acoustic emotion recognition: An overview and some tendencies.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Real-time robust recognition of speakers' emotions and characteristics on mobile platforms.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Intelligent user interfaces in digital games for empowerment and inclusion.
Proceedings of the 12th International Conference on Advances in Computer Entertainment Technology, 2015

2014
Channel mapping using bidirectional long short-term memory for dereverberation in hands-free voice controlled devices.
IEEE Trans. Consumer Electron., 2014

Memory-Enhanced Neural Networks and NMF for Robust ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Distributing Recognition in Computational Paralinguistics.
IEEE Trans. Affect. Comput., 2014

Autoencoder-based Unsupervised Domain Adaptation for Speech Emotion Recognition.
IEEE Signal Process. Lett., 2014

Affective neural networks and cognitive learning systems for big data analysis.
Neural Networks, 2014

The TUM Gait from Audio, Image and Depth (GAID) database: Multimodal recognition of subjects and traits.
J. Vis. Commun. Image Represent., 2014

Probabilistic speech feature extraction with context-sensitive Bottleneck neural networks.
Neurocomputing, 2014

Feature enhancement by deep LSTM networks for ASR in reverberant multisource environments.
Comput. Speech Lang., 2014

Medium-term speaker states - A review on intoxication, sleepiness and the first challenge.
Comput. Speech Lang., 2014

Introduction to the Special Issue on Broadening the View on Speaker Analysis.
Comput. Speech Lang., 2014

A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems.
CoRR, 2014

The state of play of ASC-Inclusion: An Integrated Internet-Based Environment for Social Inclusion of Children with Autism Spectrum Conditions.
CoRR, 2014

On-Line NMF-Based Stereo Up-Mixing of Speech Improves Perceived Reduction of Non-Stationary Noise.
Proceedings of the AES International Conference on Semantic Audio 2014, 2014

On the Influence of Alcohol Intoxication on Speaker Recognition.
Proceedings of the AES International Conference on Semantic Audio 2014, 2014

AVEC 2014: 3D Dimensional Affect and Depression Recognition Challenge.
Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge, 2014

AVEC 2014: the 4th international audio/visual emotion challenge and workshop.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Emotional Analysis of Music: A Comparison of Methods.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

The Munich LSTM-RNN Approach to the MediaEval 2014 "Emotion in Music" Task.
Proceedings of the Working Notes Proceedings of the MediaEval 2014 Workshop, 2014

The Munich Biovoice Corpus: Effects of Physical Exercising, Heart Rate, and Skin Conductance on Human Speech Production.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

IDGEI 2014: 2nd international workshop on intelligent digital games for empowerment and inclusion.
Proceedings of the 19th International Conference on Intelligent User Interfaces, 2014

The INTERSPEECH 2014 computational paralinguistics challenge: cognitive & physical load.
Proceedings of the INTERSPEECH 2014, 2014

Robust speech recognition using long short-term memory recurrent neural networks for hybrid acoustic modelling.
Proceedings of the INTERSPEECH 2014, 2014

Investigating NMF speech enhancement for neural network based acoustic models.
Proceedings of the INTERSPEECH 2014, 2014

Audio onset detection: A wavelet packet based approach with recurrent neural networks.
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Transfer learning emotion manifestation across music and speech.
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Linked Source and Target Domain Subspace Feature Transfer Learning - Exemplified by Speech Emotion Recognition.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

A Deep Semi-NMF Model for Learning Hidden Representations.
Proceedings of the 31st International Conference on Machine Learning, 2014

Emotion Recognition in the Wild: Incorporating Voice and Lip Activity in Multimodal Decision-Level Fusion.
Proceedings of the 16th International Conference on Multimodal Interaction, 2014

ERM4HCI 2014: The 2nd Workshop on Emotion Representation and Modelling in Human-Computer-Interaction-Systems.
Proceedings of the 16th International Conference on Multimodal Interaction, 2014

Acoustic Gait-based Person Identification using Hidden Markov Models.
Proceedings of the 2014 Workshop on Mapping Personality Traits Challenge and Workshop, 2014

MAPTRAITS 2014: The First Audio/Visual Mapping Personality Traits Challenge.
Proceedings of the 2014 Workshop on Mapping Personality Traits Challenge and Workshop, 2014

MAPTRAITS 2014 - The First Audio/Visual Mapping Personality Traits Challenge - An Introduction: Perceived Personality and Social Dimensions.
Proceedings of the 16th International Conference on Multimodal Interaction, 2014

Modeling gender information for emotion recognition using Denoising autoencoder.
Proceedings of the IEEE International Conference on Acoustics, 2014

Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

On-line continuous-time music mood regression with deep recurrent neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

Single-channel speech separation with memory-enhanced recurrent neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

Multi-resolution linear prediction based features for audio onset detection with bidirectional LSTM neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

CCA based feature selection with application to continuous depression recognition from acoustic speech features.
Proceedings of the IEEE International Conference on Acoustics, 2014

Introducing shared-hidden-layer autoencoders for transfer learning and their application in acoustic emotion recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Social signal classification using deep BLSTM recurrent neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

Discriminatively trained recurrent neural networks for single-channel speech separation.
Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014

2013
Keyword spotting exploiting Long Short-Term Memory.
Speech Commun., 2013

Serious Gaming for Behavior Change: The State of Play.
IEEE Pervasive Comput., 2013

LSTM-Modeling of continuous emotions in an audiovisual affect recognition framework.
Image Vis. Comput., 2013

Categorical and dimensional affect analysis in continuous input: Current trends and future directions.
Image Vis. Comput., 2013

Introduction To The Special Issue On Affect Analysis In Continuous Input.
Image Vis. Comput., 2013

Words that Fascinate the Listener: Predicting Affective Ratings of On-Line Lectures.
Int. J. Distance Educ. Technol., 2013

YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context.
IEEE Intell. Syst., 2013

New Avenues in Opinion Mining and Sentiment Analysis.
IEEE Intell. Syst., 2013

Statistical Approaches to Concept-Level Sentiment Analysis.
IEEE Intell. Syst., 2013

Knowledge-Based Approaches to Concept-Level Sentiment Analysis.
IEEE Intell. Syst., 2013

Computational Audio Analysis (Dagstuhl Seminar 13451).
Dagstuhl Reports, 2013

Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory.
Comput. Speech Lang., 2013

Paralinguistics in speech and language - State-of-the-art and the challenge.
Comput. Speech Lang., 2013

Introduction to the special issue on Paralinguistics in Naturalistic Speech and Language.
Comput. Speech Lang., 2013

6th International Symposium on Attention in Cognitive Systems 2013.
CoRR, 2013

A Real-Time Speech Enhancement Framework in Noisy and Reverberated Acoustic Scenarios.
Cogn. Comput., 2013

Likability of human voices: A feature analysis and a neural network regression approach to automatic likability estimation.
Proceedings of the 14th International Workshop on Image Analysis for Multimedia Interactive Services, 2013

Large-scale audio feature extraction and SVM for acoustic scene classification.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013

AVEC 2013: the continuous audio/visual emotion and depression recognition challenge.
Proceedings of the 3rd ACM international workshop on Audio/visual emotion challenge, 2013

Workshop summary for the 3rd international audio/visual emotion challenge and workshop (AVEC'13).
Proceedings of the ACM Multimedia Conference, 2013

Recent developments in openSMILE, the Munich open-source multimedia feature extractor.
Proceedings of the ACM Multimedia Conference, 2013

The TUM Approach to the MediaEval Music Emotion Task Using Generic Affective Audio Features.
Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013

Active learning by label uncertainty for acoustic emotion recognition.
Proceedings of the INTERSPEECH 2013, 2013

The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism.
Proceedings of the INTERSPEECH 2013, 2013

Active learning for dimensional speech emotion recognition.
Proceedings of the INTERSPEECH 2013, 2013

Detecting overlapping speech with long short-term memory recurrent neural networks.
Proceedings of the INTERSPEECH 2013, 2013

Using linguistic information to detect overlapping speech.
Proceedings of the INTERSPEECH 2013, 2013

Affect recognition in real-life acoustic conditions - a new perspective on feature selection.
Proceedings of the INTERSPEECH 2013, 2013

Influence of Low-Level Features Extracted from Rhythmic and Harmonic Sections on Music Genre Classification.
Proceedings of the Man-Machine Interactions 3, 2013

ERM4HCI 2013: the 1st workshop on emotion representation and modelling in human-computer-interaction-systems.
Proceedings of the 2013 International Conference on Multimodal Interaction, 2013

The acoustics of eye contact: detecting visual attention from conversational audio cues.
Proceedings of the 6th workshop on Eye gaze in intelligent human machine interaction: gaze in multimodal interaction, 2013

Co-training succeeds in Computational Paralinguistics.
Proceedings of the IEEE International Conference on Acoustics, 2013

Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly non-stationary noise.
Proceedings of the IEEE International Conference on Acoustics, 2013

Probabilistic ASR feature extraction applying context-sensitive connectionist temporal classification networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Speaker trait characterization in web videos: Uniting speech, language, and facial features.
Proceedings of the IEEE International Conference on Acoustics, 2013

A discriminative approach to polyphonic piano note transcription using supervised non-negative matrix factorization.
Proceedings of the IEEE International Conference on Acoustics, 2013

Acoustic Geo-Sensing: Recognising cyclists' route, route direction, and route progress from cell-phone audio.
Proceedings of the IEEE International Conference on Acoustics, 2013

Automatic recognition of physiological parameters in the human voice: Heart rate and skin conductance.
Proceedings of the IEEE International Conference on Acoustics, 2013

A comparative study on sparsity penalties for NMF-based speech separation: Beyond LP-norms.
Proceedings of the IEEE International Conference on Acoustics, 2013

Integrating noise estimation and factorization-based speech separation: A novel hybrid approach.
Proceedings of the IEEE International Conference on Acoustics, 2013

Off-line refinement of audio-to-score alignment by observation template adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2013

Gait-based person identification by spectral, cepstral and energy-related audio features.
Proceedings of the IEEE International Conference on Acoustics, 2013

Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies.
Proceedings of the IEEE International Conference on Acoustics, 2013

Hierarchical neural networks and enhanced class posteriors for social signal classification.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

Sparse Autoencoder-Based Feature Transfer Learning for Speech Emotion Recognition.
Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013

Intelligent Audio Analysis.
Signals and communication technology, Springer, ISBN: 978-3-642-36805-9, 2013

2012
Optimization and Parallelization of Monaural Source Separation Algorithms in the openBliSSART Toolkit.
J. Signal Process. Syst., 2012

Neural Networks and Learning Systems Come Together.
IEEE Trans. Neural Networks Learn. Syst., 2012

A multitask approach to continuous five-dimensional affect sensing in natural speech.
ACM Trans. Interact. Intell. Syst., 2012

The Voice of Leadership: Models and Performances of Automatic Analysis in Online Speeches.
IEEE Trans. Affect. Comput., 2012

Guest Editorial: Special Section on Naturalistic Affect Resources for System Building and Evaluation.
IEEE Trans. Affect. Comput., 2012

Building Autonomous Sensitive Artificial Listeners.
IEEE Trans. Affect. Comput., 2012

Context-Sensitive Learning for Enhanced Audiovisual Emotion Classification.
IEEE Trans. Affect. Comput., 2012

The Computational Paralinguistics Challenge [Social Sciences].
IEEE Signal Process. Mag., 2012

Synthesized speech for model training in cross-corpus recognition of human emotion.
Int. J. Speech Technol., 2012

Applying multiple classifiers and non-linear dynamics features for detecting sleepiness from speech.
Neurocomputing, 2012

Emotion and mental state recognition from speech.
EURASIP J. Adv. Signal Process., 2012

Cognitive and Emotional Information Processing for Human-Machine Interaction.
Cogn. Comput., 2012

Real-Time Activity Detection in a Multi-Talker Reverberated Environment.
Cogn. Comput., 2012

Emotion in the speech of children with autism spectrum conditions: prosody and everything else.
Proceedings of the Third Workshop on Child, Computer and Interaction, 2012

Speech, Emotion, Age, Language, Task, and Typicality: Trying to Disentangle Performance and Feature Relevance.
Proceedings of the 2012 International Conference on Privacy, 2012

Dimensional and continuous analysis of emotions for multimedia applications: a tutorial overview.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature Sets.
Proceedings of the Working Notes Proceedings of the MediaEval 2012 Workshop, 2012

Dominance Detection in a Reverberated Acoustic Scenario.
Proceedings of the Advances in Neural Networks - ISNN 2012, 2012

Score-Informed Leading Voice Separation from Monaural Audio.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Towards distributed recognition of emotion from speech.
Proceedings of the 5th International Symposium on Communications, 2012

Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition.
Proceedings of the INTERSPEECH 2012, 2012

Temporal and Situational Context Modeling for Improved Dominance Recognition in Meetings.
Proceedings of the INTERSPEECH 2012, 2012

Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise.
Proceedings of the INTERSPEECH 2012, 2012

Discrimination of Linguistic and Non-Linguistic Vocalizations in Spontaneous Speech: Intra- and Inter-Corpus Perspectives.
Proceedings of the INTERSPEECH 2012, 2012

Improving Recognition of Speaker States and Traits by Cumulative Evidence: Intoxication, Sleepiness, Age and Gender.
Proceedings of the INTERSPEECH 2012, 2012

Novel Metrics of Speech Rhythm for the Assessment of Emotion.
Proceedings of the INTERSPEECH 2012, 2012

Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization.
Proceedings of the INTERSPEECH 2012, 2012

Confidence Measures in Speech Emotion Recognition Based on Semi-supervised Learning.
Proceedings of the INTERSPEECH 2012, 2012

Likability Classification - A Not so Deep Neural Network Approach.
Proceedings of the INTERSPEECH 2012, 2012

AVEC 2012: the continuous audio/visual emotion challenge.
Proceedings of the International Conference on Multimodal Interaction, 2012

AVEC 2012: the continuous audio/visual emotion challenge - an introduction.
Proceedings of the International Conference on Multimodal Interaction, 2012

Preserving actual dynamic trend of emotion in dimensional speech emotion recognition.
Proceedings of the International Conference on Multimodal Interaction, 2012

Improving generalisation and robustness of acoustic affect recognition.
Proceedings of the International Conference on Multimodal Interaction, 2012

Semi-supervised learning helps in sound event classification.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Analyzing the memory of BLSTM Neural Networks for enhanced emotion classification in dyadic spoken interactions.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Non-negative matrix factorization for highly noise-robust ASR: To enhance or to recognize?
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Supervised and semi-supervised suppression of background music in monaural speech recordings.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Robust feature extraction for automatic recognition of vibrato singing in recorded polyphonic music.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Speech overlap detection and attribution using convolutive non-negative sparse coding.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Automatic recognition of emotion evoked by general sound events.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Fine-tuning HMMs for nonverbal vocalizations in spontaneous speech: A multicorpus perspective.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Audiovisual vocal outburst classification in noisy acoustic conditions.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Real-Time Speech Separation by Semi-supervised Nonnegative Matrix Factorization.
Proceedings of the Latent Variable Analysis and Signal Separation, 2012

Speech overlap detection using convolutive non-negative sparse coding: New improvements and insights.
Proceedings of the 20th European Signal Processing Conference, 2012

Music Information Retrieval: An Inspirational Guide to Transfer from Related Disciplines.
Proceedings of the Multimodal Music Processing, 2012

Towards Automatic Intoxication Detection from Speech in Real-Life Acoustic Environments.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

Fully Automatic Audiovisual Emotion Recognition: Voice, Words, and the Face.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

Sparse, Hierarchical and Semi-Supervised Base Learning for Monaural Enhancement of Conversational Speech.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

Exploring Nonnegative Matrix Factorization for Audio Classification: Application to Speaker Recognition.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

Confidence Measures for Speech Emotion Recognition: A Start.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

2011
Tandem decoding of children's speech for keyword detection in a child-robot interaction scenario.
ACM Trans. Speech Lang. Process., 2011

Online Driver Distraction Detection Using Long Short-Term Memory.
IEEE Trans. Intell. Transp. Syst., 2011

Recognizing Affect from Linguistic Information in 3D Continuous Space.
IEEE Trans. Affect. Comput., 2011

Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge.
Speech Commun., 2011

Introduction to the special issue on sensing emotion and affect - Facing realism in speech processing.
Speech Commun., 2011

Computational Assessment of Interest in Speech - Facing the Real-Life Challenge.
Künstliche Intell., 2011

Affective speaker state analysis in the presence of reverberation.
Int. J. Speech Technol., 2011

Recognition of Nonprototypical Emotions in Reverberated and Noisy Speech by Nonnegative Matrix Factorization.
EURASIP J. Adv. Signal Process., 2011

Whodunnit - Searching for the most important feature types signalling emotion-related user states in speech.
Comput. Speech Lang., 2011

Semantic Speech Tagging: Towards Combined Analysis of Speaker Traits.
Proceedings of the AES International Conference Semantic Audio 2011, 2011

Enhancing Spontaneous Speech Recognition with BLSTM Features.
Proceedings of the Advances in Nonlinear Speech Processing, 2011

A Real-Time Speech Enhancement Framework for Multi-party Meetings.
Proceedings of the Advances in Nonlinear Speech Processing, 2011

Robust Multi-stream Keyword and Non-linguistic Vocalization Detection for Computationally Intelligent Virtual Agents.
Proceedings of the Advances in Neural Networks - ISNN 2011, 2011

Automatic Assessment of Singer Traits in Popular Music: Gender, Age, Height and Race.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Multi-Modal Non-Prototypical Music Mood Analysis in Continuous Space: Reliability and Performances.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Interacting with Emotional Virtual Agents.
Proceedings of the Intelligent Technologies for Interactive Entertainment, 2011

Speech-Based Non-Prototypical Affect Recognition for Child-Robot Interaction in Reverberated Environments.
Proceedings of the INTERSPEECH 2011, 2011

Acoustic-Linguistic Recognition of Interest in Speech with Bottleneck-BLSTM Nets.
Proceedings of the INTERSPEECH 2011, 2011

Feature Frame Stacking in RNN-Based Tandem ASR Systems - Learned vs. Predefined Context.
Proceedings of the INTERSPEECH 2011, 2011

Using Multiple Databases for Training in Emotion Recognition: To Unite or to Vote?
Proceedings of the INTERSPEECH 2011, 2011

The INTERSPEECH 2011 Speaker State Challenge.
Proceedings of the INTERSPEECH 2011, 2011

Learning New Acoustic Events in an HMM-Based System Using MAP Adaptation.
Proceedings of the INTERSPEECH 2011, 2011

"Would You Buy a Car from Me?" - On the Likability of Telephone Voices.
Proceedings of the INTERSPEECH 2011, 2011

Real-Time Speech Recognition in a Multi-talker Reverberated Acoustic Scenario.
Proceedings of the Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence, 2011

A multi-stream ASR framework for BLSTM modeling of conversational speech.
Proceedings of the IEEE International Conference on Acoustics, 2011

Localization of non-linguistic events in spontaneous speech by Non-Negative Matrix Factorization and Long Short-Term Memory.
Proceedings of the IEEE International Conference on Acoustics, 2011

Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations.
Proceedings of the IEEE International Conference on Acoustics, 2011

OpenBliSSART: Design and evaluation of a research toolkit for Blind Source Separation in Audio Recognition Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2011

Combining monaural source separation with Long Short-Term Memory for increased robustness in vocalist gender recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Deep neural networks for acoustic emotion recognition: Raising the benchmarks.
Proceedings of the IEEE International Conference on Acoustics, 2011

Syllabification of conversational speech using Bidirectional Long-Short-Term Memory Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2011

Audiovisual classification of vocal outbursts in human conversation using Long-Short-Term Memory networks.
Proceedings of the IEEE International Conference on Acoustics, 2011

Come and have an emotional workout with sensitive artificial listeners!
Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

Emotion representation, analysis and synthesis in continuous space: A survey.
Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

String-based audiovisual fusion of behavioural events for the assessment of dimensional affect.
Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

Ten Recent Trends in Computational Paralinguistics.
Proceedings of the Cognitive Behavioural Systems, 2011

Conversational Speech Recognition in Non-stationary Reverberated Environments.
Proceedings of the Cognitive Behavioural Systems, 2011

Unsupervised learning in cross-corpus acoustic emotion recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

A novel bottleneck-BLSTM front-end for feature-level context modeling in conversational speech recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

AVEC 2011 - The First International Audio/Visual Emotion Challenge.
Proceedings of the Affective Computing and Intelligent Interaction, 2011

The First Audio/Visual Emotion Challenge and Workshop - An Introduction.
Proceedings of the Affective Computing and Intelligent Interaction, 2011

Voice and Speech Analysis in Search of States and Traits.
Proceedings of the Computer Analysis of Human Behavior, 2011

2010
Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies.
IEEE Trans. Affect. Comput., 2010

Combining Long Short-Term Memory and Dynamic Bayesian Networks for Incremental Emotion-Sensitive Artificial Listening.
IEEE J. Sel. Top. Signal Process., 2010

On-line emotion recognition in a 3-D activation-valence-time continuum using acoustic and linguistic cues.
J. Multimodal User Interfaces, 2010

On the Impact of Children's Emotional Speech on Acoustic and Language Models.
EURASIP J. Audio Speech Music. Process., 2010

Determination of Nonprototypical Valence and Arousal in Popular Music: Features and Performances.
EURASIP J. Audio Speech Music. Process., 2010

Bidirectional LSTM Networks for Context-Sensitive Keyword Detection in a Cognitive Virtual Agent Framework.
Cogn. Comput., 2010

Emotion on the Road - Necessity, Acceptance, and Feasibility of Affective Computing in the Car.
Adv. Hum. Comput. Interact., 2010

Segmenting into Adequate Units for Automatic Recognition of Emotion-Related Episodes: A Speech-Based Approach.
Adv. Hum. Comput. Interact., 2010

Opensmile: the munich versatile and fast open-source audio feature extractor.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

3d gesture recognition applying long short-term memory and contextual knowledge in a CAVE.
Proceedings of the 1st ACM international workshop on Multimodal pervasive video analysis, 2010

CINEMO - A French Spoken Language Resource for Complex Emotions: Facts and Baselines.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Vocalist Gender Recognition in Recorded Popular Music.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

Universal Onset Detection with Bidirectional Long Short-Term Memory Neural Networks.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

Long short-term memory networks for noise robust speech recognition.
Proceedings of the INTERSPEECH 2010, 2010

Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling.
Proceedings of the INTERSPEECH 2010, 2010

Recognition of spontaneous conversational speech using long short-term memory phoneme predictions.
Proceedings of the INTERSPEECH 2010, 2010

The INTERSPEECH 2010 paralinguistic challenge.
Proceedings of the INTERSPEECH 2010, 2010

Incremental acoustic valence recognition: an inter-corpus perspective on features, matching, and performance in a gating paradigm.
Proceedings of the INTERSPEECH 2010, 2010

Emotion recognition using imperfect speech recognition.
Proceedings of the INTERSPEECH 2010, 2010

Spoken term detection with Connectionist Temporal Classification: A novel hybrid CTC-DBN decoder.
Proceedings of the IEEE International Conference on Acoustics, 2010

Non-negative matrix factorization as noise-robust feature extractor for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

Discrimination of speech and non-linguistic vocalizations by Non-Negative Matrix Factorization.
Proceedings of the IEEE International Conference on Acoustics, 2010

Late fusion of individual engines for improved recognition of negative emotion in speech - learning vs. democratic vote.
Proceedings of the IEEE International Conference on Acoustics, 2010

Learning with synthesized speech for automatic emotion recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

Learning and Knowledge-Based Sentiment Analysis in Movie Review Key Excerpts.
Proceedings of the Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. Theoretical and Practical Issues, 2010

Real Time Person Tracking and Behavior Interpretation in Multi Camera Scenarios Applying Homography and Coupled HMMs.
Proceedings of the Analysis of Verbal and Nonverbal Communication and Enactment. The Processing Issues, 2010

Switching Linear Dynamic Models for Recognition of Emotionally Colored and Noisy Speech.
Proceedings of the 9. ITG-Fachtagung Sprachkommunikation 2010, 2010

2009
Being bored? Recognising natural interest by extensive audiovisual integration for real-life application.
Image Vis. Comput., 2009

A multidimensional dynamic time warping algorithm for efficient multimodal fusion of asynchronous data streams.
Neurocomputing, 2009

Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement.
EURASIP J. Audio Speech Music. Process., 2009

Applying Bayes Markov chains for the detection of ATM related scenarios.
Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV 2009), 2009

Improving Keyword Spotting with a Tandem BLSTM-DBN Architecture.
Proceedings of the Advances in Nonlinear Speech Processing, 2009

Robust in-car spelling recognition - a tandem BLSTM-HMM approach.
Proceedings of the INTERSPEECH 2009, 2009

Data-driven clustering in emotional space for affect recognition using discriminatively trained LSTM networks.
Proceedings of the INTERSPEECH 2009, 2009

The INTERSPEECH 2009 emotion challenge.
Proceedings of the INTERSPEECH 2009, 2009

Recognising interest in conversational speech - comparing bag of frames and supra-segmental features.
Proceedings of the INTERSPEECH 2009, 2009

Audio chord labeling by musiological modeling and beat-synchronization.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Speech control in surgery: A field analysis and strategies.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Boosting multi-modal camera selection with semantic features.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Graphical models for multi-modal automatic video editing in meetings.
Proceedings of the 16th International Conference on Digital Signal Processing, 2009

Resolving partial occlusions in crowded environments utilizing range data and video cameras.
Proceedings of the 16th International Conference on Digital Signal Processing, 2009

A hierarchical approach for visual suspicious behavior detection in aircrafts.
Proceedings of the 16th International Conference on Digital Signal Processing, 2009

"The Godfather" vs. "Chaos": Comparing Linguistic Analysis Based on On-line Knowledge Sources and Bags-of-N-Grams for Movie Review Valence Estimation.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

GMs in On-Line Handwritten Whiteboard Note Recognition: The Influence of Implementation and Modeling.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks.
Proceedings of the IEEE International Conference on Acoustics, 2009

Emotion recognition from speech: Putting ASR in the loop.
Proceedings of the IEEE International Conference on Acoustics, 2009

Robust vocabulary independent keyword spotting with graphical models.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Acoustic emotion recognition: A benchmark comparison of performances.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

From speech to letters - using a novel neural network architecture for grapheme based ASR.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

The hinterland of emotions: Facing the open-microphone challenge.
Proceedings of the Affective Computing and Intelligent Interaction, 2009

A demonstration of audiovisual sensitive artificial listeners.
Proceedings of the Affective Computing and Intelligent Interaction, 2009

OpenEAR - Introducing the munich open-source emotion and affect recognition toolkit.
Proceedings of the Affective Computing and Intelligent Interaction, 2009

2008
Tango or Waltz?: Putting Ballroom Dance Style into Tempo Detection.
EURASIP J. Audio Speech Music. Process., 2008

Detecting problems in spoken child-computer interaction.
Proceedings of the 1st Workshop on Child, Computer and Interaction, 2008

Does affect affect automatic recognition of children's speech?
Proceedings of the 1st Workshop on Child, Computer and Interaction, 2008

Low-Level Fusion of Audio, Video Feature for Multi-Modal Emotion Recognition.
Proceedings of the VISAPP 2008: Proceedings of the Third International Conference on Computer Vision Theory and Applications, Funchal, Madeira, Portugal, January 22-25, 2008, 2008

Emotion sensitive speech control for human-robot interaction in minimal invasive surgery.
Proceedings of the 17th IEEE International Symposium on Robot and Human Interactive Communication, 2008

On the Influence of Phonetic Content Variation for Acoustic Emotion Recognition.
Proceedings of the Perception in Multimodal Dialogue Systems, 2008

Static and Dynamic Modelling for the Recognition of Non-verbal Vocalisations in Conversational Speech.
Proceedings of the Perception in Multimodal Dialogue Systems, 2008

Abandoning emotion classes - towards continuous emotion recognition with modelling of long-range dependencies.
Proceedings of the INTERSPEECH 2008, 2008

Balancing spoken content adaptation and unit length in the recognition of emotion and interest.
Proceedings of the INTERSPEECH 2008, 2008

Patterns, prototypes, performance: classifying emotional user states.
Proceedings of the INTERSPEECH 2008, 2008

Prosodic and spectral features within segment-based acoustic modeling.
Proceedings of the INTERSPEECH 2008, 2008

Speech recognition in noisy environments using a switching linear dynamic model for feature enhancement.
Proceedings of the INTERSPEECH 2008, 2008

Detection of security related affect and behaviour in passenger transport.
Proceedings of the INTERSPEECH 2008, 2008

Combining speech recognition and acoustic word emotion models for robust text-independent emotion recognition.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Applying multi layer homography for multi camera person tracking.
Proceedings of the 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras, 2008

Brute-forcing hierarchical functionals for paralinguistics: A waste of feature space?
Proceedings of the IEEE International Conference on Acoustics, 2008

Mothers, adults, children, pets - towards the acoustics of intimacy.
Proceedings of the IEEE International Conference on Acoustics, 2008

Switching Linear Dynamic Models for Noise Robust In-Car Speech Recognition.
Proceedings of the Pattern Recognition, 2008

Music Thumbnailing Incorporating Harmony- and Rhythm Structure.
Proceedings of the Adaptive Multimedia Retrieval. Identifying, 2008

2007
Mensch, Maschine, Emotion: Erkennung aus sprachlicher und manueller Interaktion.
VDM, ISBN: 978-3-8364-1522-4, 2007

Combining frame and turn-level information for robust recognition of emotions within speech.
Proceedings of the INTERSPEECH 2007, 2007

The relevance of feature type for the automatic classification of emotional user states: low level descriptors and functionals.
Proceedings of the INTERSPEECH 2007, 2007

Audiovisual recognition of spontaneous interest within conversations.
Proceedings of the 9th International Conference on Multimodal Interfaces, 2007

Hidden Conditional Random Fields for Meeting Segmentation.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Wearable Assistance for the Ballroom-Dance Hobbyist - Holistic Rhythm Analysis and Dance-Style Classification.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Suspicious Behavior Detection in Public Transport by Fusion of Low-Level Video Descriptors.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Towards More Reality in the Recognition of Emotional Speech.
Proceedings of the IEEE International Conference on Acoustics, 2007

Fast and Robust Meter and Tempo Recognition for the Automatic Discrimination of Ballroom Dance Styles.
Proceedings of the IEEE International Conference on Acoustics, 2007

Audiovisual Behavior Modeling by Combined Feature Spaces.
Proceedings of the IEEE International Conference on Acoustics, 2007

Comparing one and two-stage acoustic modeling in the recognition of emotion in speech.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Frame vs. Turn-Level: Emotion Recognition from Speech Considering Static and Dynamic Processing.
Proceedings of the Affective Computing and Intelligent Interaction, 2007

What Should a Generic Emotion Markup Language Be Able to Represent?
Proceedings of the Affective Computing and Intelligent Interaction, 2007

On the Necessity and Feasibility of Detecting a Driver's Emotional State While Driving.
Proceedings of the Affective Computing and Intelligent Interaction, 2007

2006
Timing levels in segment-based speech emotion recognition.
Proceedings of the INTERSPEECH 2006, 2006

Recognition of interest in human conversational speech.
Proceedings of the INTERSPEECH 2006, 2006

Efficient Recognition of Authentic Dynamic Facial Expressions on the Feedtum Database.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Musical Signal Type Discrimination based on Large Open Feature Sets.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Evolutionary Feature Generation in Speech Emotion Recognition.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Segmentation and Recognition of Meeting Events using a Two-Layered HMM and a Combined MLP-HMM Approach.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

A Two-Layer Graphical Model for Combined Video Shot and Scene Boundary Detection.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Submotions for Hidden Markov Model Based Dynamic Facial Action Recognition.
Proceedings of the International Conference on Image Processing, 2006

A Combined LSTM-RNN - HMM - Approach for Meeting Event Segmentation and Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Automatische Emotionserkennung aus sprachlicher und manueller Interaktion.
PhD thesis, 2005

Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles.
Proceedings of the INTERSPEECH 2005, 2005

Feature Selection and Stacking for Robust Discrimination of Speech, Monophonic Singing, and Polyphonic Music.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Speaker Independent Speech Emotion Recognition by Ensemble Classification.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Video based online behavior detection using probabilistic multi stream fusion.
Proceedings of the 2005 International Conference on Image Processing, 2005

Meta-Classifiers in Acoustic and Linguistic Feature Fusion-Based Affect Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Discrimination of speech and monophonic singing in continuous audio streams applying multi-layer support vector machines.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Emotion recognition in the manual interaction with graphical user interfaces.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Multimodal music retrieval for large databases.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Applying Bayesian belief networks in approximate string matching for robust keyword-based retrieval.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
A real-time system for hand gesture controlled operation of in-car devices.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

A hybrid music retrieval system using belief networks to integrate multimodal queries and contextual knowledge.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

HMM-based music retrieval using stereophonic feature information and framelength adaptation.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

Hidden Markov model-based speech emotion recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Aspekte effizienten Usability Engineerings (Aspects of Efficient Usability Engineering).
Informationstechnik Tech. Inform., 2002

Towards intuitive speech interaction by the integration of emotional aspects.
Proceedings of the IEEE International Conference on Systems, Man and Cybernetics: Bridging the Digital Divide, Yasmine Hammamet, Tunisia, October 6-9, 2002, 2002

Multimodal emotion recognition in audiovisual communication.
Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

A new technique for adjusting distraction moments in multitasking non-field usability tests.
Proceedings of the Extended abstracts of the 2002 Conference on Human Factors in Computing Systems, 2002

Experimental evaluation of user errors at the skill-based level in an automotive environment.
Proceedings of the Extended abstracts of the 2002 Conference on Human Factors in Computing Systems, 2002

2001
Using multimodal interaction to navigate in arbitrary virtual VRML worlds.
Proceedings of the 2001 workshop on Perceptive user interfaces, 2001

