Soham Deshmukh

According to our database, Soham Deshmukh authored at least 26 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder.
CoRR, July, 2025

CoLMbo: Speaker Language Model for Descriptive Profiling.
CoRR, June, 2025

Mellow: a small audio language model for reasoning.
CoRR, March, 2025

ADIFF: Explaining audio difference using natural language.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MACE: Leveraging Audio for Evaluating Audio Captioning Systems.
Proceedings of the IEEE International Conference on Acoustics, 2025

Audio Entailment: Assessing Deductive Reasoning for Audio Understanding.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Domain Adaptation for Contrastive Audio-Language Models.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

PAM: Prompting Audio-Language Models for Audio Quality Assessment.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Natural Language Supervision For General-Purpose Audio Representations.
Proceedings of the IEEE International Conference on Acoustics, 2024

Prompting Audios Using Acoustic Properties for Emotion Representation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Training Audio Captioning Models without Audio.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
LoFT: Local Proxy Fine-tuning For Improving Transferability Of Adversarial Attacks Against Large Language Model.
CoRR, 2023

Synergy between human and machine approaches to sound/scene recognition and processing: An overview of ICASSP special session.
CoRR, 2023

Pengi: An Audio Language Model for Audio Tasks.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Audio Retrieval with WavText5K and CLAP Training.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-View Learning for Speech Emotion Recognition with Categorical Emotion, Categorical Sentiment, and Dimensional Scores.
Proceedings of the IEEE International Conference on Acoustics, 2023

CLAP Learning Audio Concepts from Natural Language Supervision.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Describing emotions with acoustic property prompts for speech emotion recognition.
CoRR, 2022

Adapting Task-Oriented Dialogue Models for Email Conversations.
CoRR, 2022

2021
NaRLE: Natural Language Models using Reinforcement Learning with Emotion Feedback.
CoRR, 2021

Improving Weakly Supervised Sound Event Detection with Self-Supervised Auxiliary Tasks.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Detection of Covid-19 Through the Analysis of Vocal Fold Oscillations.
Proceedings of the IEEE International Conference on Acoustics, 2021

Interpreting Glottal Flow Dynamics for Detecting Covid-19 From Voice.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection.
CoRR, 2020

2019
Attacker Behaviour Profiling using Stochastic Ensemble of Hidden Markov Models.
CoRR, 2019

