Martin Wöllmer

Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014

Memory-Enhanced Neural Networks and NMF for Robust ASR.

IEEE ACM Trans. Audio Speech Lang. Process., 2014

Probabilistic speech feature extraction with context-sensitive Bottleneck neural networks.

Neurocomputing, 2014

Feature enhancement by deep LSTM networks for ASR in reverberant multisource environments.

Comput. Speech Lang., 2014

A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems.

CoRR, 2014

2013

Keyword spotting exploiting Long Short-Term Memory.

Speech Commun., 2013

LSTM-Modeling of continuous emotions in an audiovisual affect recognition framework.

Image Vis. Comput., 2013

YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context.

Louis-Philippe Morency

IEEE Intell. Syst., 2013

Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory.

Comput. Speech Lang., 2013

Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly non-stationary noise.

Proceedings of the IEEE International Conference on Acoustics, 2013

Probabilistic ASR feature extraction applying context-sensitive connectionist temporal classification networks.

Proceedings of the IEEE International Conference on Acoustics, 2013

Speaker trait characterization in web videos: Uniting speech, language, and facial features.

Louis-Philippe Morency

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Context-sensitive machine learning for intelligent human behavior analysis.

PhD thesis, 2012

A multitask approach to continuous five-dimensional affect sensing in natural speech.

ACM Trans. Interact. Intell. Syst., 2012

Building Autonomous Sensitive Artificial Listeners.

Michel François Valstar

IEEE Trans. Affect. Comput., 2012

Context-Sensitive Learning for Enhanced Audiovisual Emotion Classification.

Angeliki Metallinou

Athanasios Katsamanis

IEEE Trans. Affect. Comput., 2012

Real-Time Activity Detection in a Multi-Talker Reverberated Environment.

Cogn. Comput., 2012

Dominance Detection in a Reverberated Acoustic Scenario.

Proceedings of the Advances in Neural Networks - ISNN 2012, 2012

Towards distributed recognition of emotion from speech.

Proceedings of the 5th International Symposium on Communications, 2012

Temporal and Situational Context Modeling for Improved Dominance Recognition in Meetings.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Combining Bottleneck-BLSTM and Semi-Supervised Sparse NMF for Recognition of Conversational Speech in Highly Instationary Noise.

Felix Weninger

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Analyzing the memory of BLSTM Neural Networks for enhanced emotion classification in dyadic spoken interactions.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Non-negative matrix factorization for highly noise-robust ASR: To enhance or to recognize?

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Fully Automatic Audiovisual Emotion Recognition: Voice, Words, and the Face.

Proceedings of the 10th ITG Conference on Speech Communication, 2012

Sparse, Hierarchical and Semi-Supervised Base Learning for Monaural Enhancement of Conversational Speech.

Felix Weninger

Proceedings of the 10th ITG Conference on Speech Communication, 2012

2011

Tandem decoding of children's speech for keyword detection in a child-robot interaction scenario.

ACM Trans. Speech Lang. Process., 2011

Online Driver Distraction Detection Using Long Short-Term Memory.

IEEE Trans. Intell. Transp. Syst., 2011

Computational Assessment of Interest in Speech - Facing the Real-Life Challenge.

Künstliche Intell., 2011

Semantic Speech Tagging: Towards Combined Analysis of Speaker Traits.

Proceedings of the AES International Conference Semantic Audio 2011, 2011

Enhancing Spontaneous Speech Recognition with BLSTM Features.

Proceedings of the Advances in Nonlinear Speech Processing, 2011

Robust Multi-stream Keyword and Non-linguistic Vocalization Detection for Computationally Intelligent Virtual Agents.

Proceedings of the Advances in Neural Networks - ISNN 2011, 2011

Automatic Assessment of Singer Traits in Popular Music: Gender, Age, Height and Race.

Felix Weninger

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Interacting with Emotional Virtual Agents.

Proceedings of the Intelligent Technologies for Interactive Entertainment, 2011

Speech-Based Non-Prototypical Affect Recognition for Child-Robot Interaction in Reverberated Environments.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Acoustic-Linguistic Recognition of Interest in Speech with Bottleneck-BLSTM Nets.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Feature Frame Stacking in RNN-Based Tandem ASR Systems - Learned vs. Predefined Context.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A multi-stream ASR framework for BLSTM modeling of conversational speech.

Proceedings of the IEEE International Conference on Acoustics, 2011

Localization of non-linguistic events in spontaneous speech by Non-Negative Matrix Factorization and Long Short-Term Memory.

Proceedings of the IEEE International Conference on Acoustics, 2011

Come and have an emotional workout with sensitive artificial listeners!

Michel François Valstar

Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

String-based audiovisual fusion of behavioural events for the assessment of dimensional affect.

Michel François Valstar

Hatice Gunes

Maja Pantic

Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

Conversational Speech Recognition in Non-stationary Reverberated Environments.

Proceedings of the Cognitive Behavioural Systems, 2011

Unsupervised learning in cross-corpus acoustic emotion recognition.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

A novel bottleneck-BLSTM front-end for feature-level context modeling in conversational speech recognition.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies.

IEEE Trans. Affect. Comput., 2010

Combining Long Short-Term Memory and Dynamic Bayesian Networks for Incremental Emotion-Sensitive Artificial Listening.

IEEE J. Sel. Top. Signal Process., 2010

On-line emotion recognition in a 3-D activation-valence-time continuum using acoustic and linguistic cues.

J. Multimodal User Interfaces, 2010

Bidirectional LSTM Networks for Context-Sensitive Keyword Detection in a Cognitive Virtual Agent Framework.

Cogn. Comput., 2010

Emotion on the Road - Necessity, Acceptance, and Feasibility of Affective Computing in the Car.

Adv. Hum. Comput. Interact., 2010

openSMILE: the Munich versatile and fast open-source audio feature extractor.

Proceedings of the 18th International Conference on Multimedia 2010, 2010

3D gesture recognition applying long short-term memory and contextual knowledge in a CAVE.

Proceedings of the 1st ACM international workshop on Multimodal pervasive video analysis, 2010

Long short-term memory networks for noise robust speech recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Recognition of spontaneous conversational speech using long short-term memory phoneme predictions.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Spoken term detection with Connectionist Temporal Classification: A novel hybrid CTC-DBN decoder.

Proceedings of the IEEE International Conference on Acoustics, 2010

Non-negative matrix factorization as noise-robust feature extractor for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2010

Switching Linear Dynamic Models for Recognition of Emotionally Colored and Noisy Speech.

Nikolaj Klebert

Proceedings of the 9. ITG-Fachtagung Sprachkommunikation 2010, 2010

2009

Being bored? Recognising natural interest by extensive audiovisual integration for real-life application.

Image Vis. Comput., 2009

A multidimensional dynamic time warping algorithm for efficient multimodal fusion of asynchronous data streams.

Neurocomputing, 2009

Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement.

EURASIP J. Audio Speech Music. Process., 2009

Improving Keyword Spotting with a Tandem BLSTM-DBN Architecture.

Proceedings of the Advances in Nonlinear Speech Processing, 2009

Robust in-car spelling recognition - a tandem BLSTM-HMM approach.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Data-driven clustering in emotional space for affect recognition using discriminatively trained LSTM networks.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Speech control in surgery: A field analysis and strategies.

Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks.

Proceedings of the IEEE International Conference on Acoustics, 2009

Robust vocabulary independent keyword spotting with graphical models.

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

From speech to letters - using a novel neural network architecture for grapheme based ASR.

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

A demonstration of audiovisual sensitive artificial listeners.

Proceedings of the Affective Computing and Intelligent Interaction, 2009

openEAR - Introducing the Munich open-source emotion and affect recognition toolkit.
