Krishna Somandepalli

Orcid: 0000-0002-2845-1079

According to our database, Krishna Somandepalli authored at least 46 papers between 2016 and 2023.

Bibliography

2023
A study of bias mitigation strategies for speaker recognition.
Comput. Speech Lang., April, 2023

Cross Modal Video Representations for Weakly Supervised Active Speaker Localization.
IEEE Trans. Multim., 2023

VideoPoet: A Large Language Model for Zero-Shot Video Generation.
CoRR, 2023

LanSER: Language-Model Supported Speech Emotion Recognition.
CoRR, 2023

MovieCLIP: Visual Scene Recognition in Movies.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

MM-AU: Towards Multimodal Understanding of Advertisement Videos.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Heterogeneous Graph Learning for Acoustic Event Classification.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Dataset for Audio-Visual Sound Event Detection in Movies.
Proceedings of the IEEE International Conference on Acoustics, 2023

Contextually-Rich Human Affect Perception Using Multimodal Scene Information.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Robust Character Labeling in Movie Videos: Data Resources and Self-Supervised Feature Adaptation.
IEEE Trans. Multim., 2022

Self-Supervised Graphs for Audio Representation Learning With Limited Labeled Data.
IEEE J. Sel. Top. Signal Process., 2022

Studying Large-Scale Behavioral Differences in Auschwitz-Birkenau with Simulation of Gendered Narratives.
Digit. Humanit. Q., 2022

Multitask vocal burst modeling with ResNets and pre-trained paralinguistic Conformers.
CoRR, 2022

Visually-aware Acoustic Event Detection using Heterogeneous Graphs.
Proceedings of the Interspeech 2022, 2022

Federated Learning for Affective Computing Tasks.
Proceedings of the 10th International Conference on Affective Computing and Intelligent Interaction, 2022

2021
Generalized Multiview Shared Subspace Learning Using View Bootstrapping.
IEEE Trans. Signal Process., 2021

Computational Media Intelligence: Human-Centered Machine Analysis of Media.
Proc. IEEE, 2021

Understanding of Emotion Perception from Art.
CoRR, 2021

Representation of professions in entertainment media: Insights into frequency and sentiment trends through computational text analysis.
CoRR, 2021

Loss Function Approaches for Multi-label Music Tagging.
Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

A Computational Tool to Study Vocal Participation of Women in UN-ITU Meetings.
Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

2020
Multi-Face: Self-supervised Multiview Adaptation for Robust Face Clustering in Videos.
CoRR, 2020

Victim or Perpetrator? Analysis of Violent Characters Portrayals from Movie Scripts.
CoRR, 2020

Generalized Multi-view Shared Subspace Learning using View Bootstrapping.
CoRR, 2020

Crossmodal learning for audio-visual speech event localization.
CoRR, 2020

An Empirical Analysis of Information Encoded in Disentangled Neural Speaker Representations.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

ATQAM/MAST'20: Joint Workshop on Aesthetic and Technical Quality Assessment of Multimedia and Media Analytics for Societal Trends.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

MediaEval 2020 Emotion and Theme Recognition in Music Task: Loss Function Approaches for Multi-label Music Tagging.
Working Notes Proceedings of the MediaEval 2020 Workshop, 2020

Robust Speaker Recognition Using Unsupervised Adversarial Invariance.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Vocal Tract Articulatory Contour Detection in Real-Time Magnetic Resonance Images Using Spatio-Temporal Context.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Joint Estimation and Analysis of Risk Behavior Ratings in Movie Scripts.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

2019
Multimodal Representation Learning using Deep Multiset Canonical Correlation.
CoRR, 2019

Multiview Shared Subspace Learning Across Speakers and Speech Commands.
Proceedings of the Interspeech 2019, 2019

Identifying Therapist and Client Personae for Therapeutic Alliance Estimation.
Proceedings of the Interspeech 2019, 2019

Toward Visual Voice Activity Detection for Unconstrained Videos.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Reinforcing Self-expressive Representation with Constraint Propagation for Face Clustering in Movies.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speaker Agnostic Foreground Speech Detection from Audio Recordings in Workplace Settings from Wearable Recorders.
Proceedings of the IEEE International Conference on Acoustics, 2019

Robust Speech Activity Detection in Movie Audio: Data Resources and Experimental Evaluation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Violence Rating Prediction from Movie Scripts.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Unsupervised Discovery of Character Dictionaries in Animation Movies.
IEEE Trans. Multim., 2018

Improving Gender Identification in Movie Audio Using Cross-Domain Data.
Proceedings of the Interspeech 2018, 2018

Multimodal Representation of Advertisements Using Segment-level Autoencoders.
Proceedings of the 2018 International Conference on Multimodal Interaction, 2018

2017
The Neural Correlates of Emotional Lability in Children with Autism Spectrum Disorder.
Brain Connect., 2017

Semantic Edge Detection for Tracking Vocal Tract Air-Tissue Boundaries in Real-Time Magnetic Resonance Images.
Proceedings of the Interspeech 2017, 2017

2016
Online Affect Tracking with Multimodal Kalman Filters.
Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, 2016

Articulatory Synthesis Based on Real-Time Magnetic Resonance Imaging Data.
Proceedings of the Interspeech 2016, 2016

