Sreyan Ghosh

Orcid: 0000-0003-3773-561X

According to our database, Sreyan Ghosh authored at least 46 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities.
CoRR, 2024

LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition.
CoRR, 2024

ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions.
CoRR, 2024

VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap.
CoRR, 2024

Do Vision-Language Models Understand Compound Nouns?
CoRR, 2024

Do Vision-Language Models Understand Compound Nouns?
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

A Closer Look at the Limitations of Instruction Tuning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

FusDom: Combining in-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2024

Stable Distillation: Regularizing Continued Pre-Training for Low-Resource Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

Recap: Retrieval-Augmented Audio Captioning.
Proceedings of the IEEE International Conference on Acoustics, 2024

ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

ASPIRE: Language-Guided Data Augmentation for Improving Robustness Against Spurious Correlations.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
AV-RIR: Audio-Visual Room Impulse Response Estimation.
CoRR, 2023

ASPIRE: Language-Guided Augmentation for Robust Image Classification.
CoRR, 2023

BioAug: Conditional Generation based Data Augmentation for Low-Resource Biomedical NER.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

MMER: Multimodal Multi-task Learning for Speech Emotion Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

AdVerb: Visually Guided Audio Dereverberation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SLICER: Learning Universal Audio Representations Using Low-Resource Self-Supervised Pre-Training.
Proceedings of the IEEE International Conference on Acoustics, 2023

Unfused: Unsupervised Finetuning Using Self Supervised Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Data2vec-Aqc: Search for the Right Teaching Assistant in the Teacher-Student Training Setup.
Proceedings of the IEEE International Conference on Acoustics, 2023

MAST: Multiscale Audio Spectrogram Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2023

CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

DALE: Generative Data Augmentation for Low-Resource Legal NLP.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Decorrelating Feature Spaces for Learning General-Purpose Audio Representations.
IEEE J. Sel. Top. Signal Process., 2022

A novel multimodal dynamic fusion network for disfluency detection in spoken utterances.
CoRR, 2022

Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition.
CoRR, 2022

PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations.
CoRR, 2022

A Discourse Aware Sequence Learning Approach for Emotion Recognition in Conversations.
CoRR, 2022

MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances.
CoRR, 2022

DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning.
CoRR, 2022

PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

CCC-WAV2VEC 2.0: Clustering Aided Cross Contrastive Self-Supervised Learning of Speech Representations.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Span Classification with Structured Information for Disfluency Detection in Spoken Utterances.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Span Extraction Aided Improved Code-mixed Sentiment Classification.
Proceedings of the Eighth Workshop on Noisy User-generated Text, 2022

2021
Deep Clustering For General-Purpose Audio Representations.
CoRR, 2021

Speech Toxicity Analysis: A New Spoken Language Processing Task.
CoRR, 2021

Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualised Embeddings.
CoRR, 2021

Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments.
Proceedings of the 15th International Workshop on Semantic Evaluation, 2021

Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets.
Proceedings of the Working Notes of FIRE 2021, 2021

2020
End-to-End Named Entity Recognition from English Speech.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2016
Structure and stability of noble gas bound EX3+ compounds (E = C, Ge, Sn, Pb; X = H, F, Cl, Br).
J. Comput. Chem., 2016

Embodied Material Guidance: Augmenting Material for Carving.
Proceedings of the 9th Forum Media Technology 2016 and 2nd All Around Audio Symposium 2016, 2016

