Shiva Sundaram

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Scene Representation Learning from Videos Using Self-Supervised and Weakly-Supervised Techniques.

Raghuveer Peri

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

Enhancing Contrastive Learning with Temporal Cognizance for Audio-Visual Representation Generation.

Chandrashekhar Lavania

Sundararajan Srinivasan

Katrin Kirchhoff

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Detecting Expressions with Multimodal Transformers.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Self-Supervised Learning with Cross-Modal Transformers for Emotion Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Disentanglement for Audio-Visual Emotion Recognition Using Multitask Setup.

Raghuveer Peri

Charles Bradshaw

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Audiovisual Highlight Detection in Videos.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Multiresolution and Multimodal Speech Recognition with Transformers.

Georgios Paraskevopoulos

CoRR, 2020

Multi-channel Acoustic Modeling using Mixed Bitrate OPUS Compression.

Minhua Wu

CoRR, 2020

Multi-Modal Embeddings Using Multi-Task Learning for Emotion Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Training Strategies to Handle Missing Modalities for Audio-Visual Expression Recognition.

Proceedings of the Companion Publication of the 2020 International Conference on Multimodal Interaction, 2020

Fully Learnable Front-End for Multi-Channel Acoustic Modeling Using Semi-Supervised Learning.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Robust Multi-Channel Speech Recognition Using Frequency Aligned Network.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Multimodal and Multiresolution Speech Recognition with Transformers.

Georgios Paraskevopoulos

Sree Hari Krishnan Parthasarathi

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Frequency Domain Multi-channel Acoustic Modeling for Distant Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Improving Noise Robustness of Automatic Speech Recognition via Parallel Data and Teacher-student Learning.

Ladislav Mosner

Minhua Wu

Anirudh Raju

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Multi-geometry Spatial Acoustic Modeling for Distant Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Detecting Media Sound Presence in Acoustic Scenes.

Constantinos Papayiannis

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2013

An Overview on Perceptually Motivated Audio Indexing and Classification.

Gaël Richard

Proc. IEEE, 2013

Affective classification of generic audio clips using regression models.

Nikos Malandrakis

Alexandros Potamianos

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Exemplar-Based Processing for Speech Recognition: An Overview.

IEEE Signal Process. Mag., 2012

Latent perceptual mapping with data-driven variable-length acoustic units for template-based speech recognition.

Jerome R. Bellegarda

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

2011

Towards the influence of vibration on evaluation of speech utterances in mobile devices.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

Hotflashes: Thumbnailing videos of social gatherings by detecting camera flash illuminated frames.

Vladan Velisavljevic

Yujie Qin

Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Experiments in context-independent recognition of non-lexical 'yes' or 'no' responses.

Nathalie Diehl

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

2010

A demonstration of automatic recognition of 'yes' or 'no' non-lexical verbal responses for speech-based interaction.

Nathalie Diehl

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

An N-gram model for unstructured audio signals toward information retrieval.

Proceedings of the 2010 IEEE International Workshop on Multimedia Signal Processing, 2010

Latent perceptual mapping: a new acoustic modeling framework for speech recognition.

Jerome R. Bellegarda

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Towards evaluation of example-based audio retrieval system using affective dimensions.

Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Using naïve text queries for robust audio information retrieval.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2010

Clustering audio clips by context-free description and affective ratings.

Julia Seebode

Proceedings of the 18th European Signal Processing Conference, 2010

Acoustic stopwords for unstructured audio information retrieval.

Proceedings of the 18th European Signal Processing Conference, 2010

2009

Acoustic topic model for audio information retrieval.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009

Saliency-driven unstructured acoustic scene classification using latent perceptual indexing.

Ozlem Kalinli

Proceedings of the 2009 IEEE International Workshop on Multimedia Signal Processing, 2009

Emotion classification in children's speech using fusion of acoustic and linguistic features.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A divide-and-conquer approach to Latent Perceptual Indexing of audio for large Web 2.0 applications.

Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

2008

Classification of sound clips by two schemes: Using onomatopoeia and semantic labels.

Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Audio retrieval by latent perceptual indexing.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

2007

Experiments in Automatic Genre Classification of Full-length Music Tracks using Audio Activity Rate.

Proceedings of the IEEE 9th Workshop on Multimedia Signal Processing, 2007

Analysis of Audio Clustering using Word Descriptions.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007

Discriminating Two Types of Noise Sources using Cortical Representation and Dimension Reduction Technique.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007

2006

An attribute-based approach to audio description applied to segmenting vocal sections in popular music songs.

Proceedings of the IEEE 8th Workshop on Multimedia Signal Processing, 2006

Speech Recognition Engineering Issues in Speech to Speech Translation System Design for Low Resource Languages and Domains.

Sankaranarayanan Ananthakrishnan

Venkata Ramana Rao Gadde

Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006

Vector-based Representation and Clustering of Audio Using Onomatopoeia Words.

Proceedings of the Aurally Informed Performance: Integrating Machine Listening and Auditory Presentation in Robotic Systems, 2006

2003

An empirical text transformation method for spontaneous speech synthesizers.
