Ramon Sanabria

Affiliations:
  • Carnegie Mellon University, Language Technology Institute, Pittsburgh, PA, USA


According to our database, Ramon Sanabria authored at least 31 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition.
CoRR, 2024

2023
Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling.
CoRR, 2023

Analyzing Acoustic Word Embeddings from Pre-Trained Self-Supervised Speech Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

Measuring the Impact of Domain Factors in Self-Supervised Pre-Training.
Proceedings of the IEEE International Conference on Acoustics, 2023

The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training.
CoRR, 2022

2021
On the Difficulty of Segmenting Words with Attention.
CoRR, 2021

Talk, Don't Write: A Study of Direct Speech-Based Image Retrieval.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

2020
Grounded Sequence to Sequence Transduction.
IEEE J. Sel. Top. Signal Process., 2020

Transfer learning for multimodal dialog.
Comput. Speech Lang., 2020

Multimodal Speech Recognition with Unstructured Audio Masking.
CoRR, 2020

Looking Enhances Listening: Recovering Missing Speech Using Images.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Fine-Grained Grounding for Multimodal Speech Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2019
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions.
CoRR, 2019

Grounding Object Detections With Transcriptions.
CoRR, 2019


MediaEval 2019: Eyes and Ears Together.
Proceedings of the Working Notes Proceedings of the MediaEval 2019 Workshop, 2019

Multitask Learning For Different Subword Segmentations In Neural Machine Translation.
Proceedings of the 16th International Conference on Spoken Language Translation, 2019

CMU's Machine Translation System for IWSLT 2019.
Proceedings of the 16th International Conference on Spoken Language Translation, 2019

The IWSLT 2019 Evaluation Campaign.
Proceedings of the 16th International Conference on Spoken Language Translation, 2019

Multimodal Grounding for Sequence-to-sequence Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
How2: A Large-scale Dataset for Multimodal Language Understanding.
CoRR, 2018

Hierarchical Multi Task Learning With CTC.
CoRR, 2018


Hierarchical Multitask Learning With CTC.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Eyes and Ears Together: New Task for Multimodal Spoken Content Analysis.
Proceedings of the Working Notes Proceedings of the MediaEval 2018 Workshop, 2018

Subword and Crossword Units for CTC Acoustic Models.
Proceedings of the Interspeech 2018, 2018

End-to-end Multimodal Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Sequence-Based Multi-Lingual Low Resource Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Comparison of Decoding Strategies for CTC Acoustic Models.
Proceedings of the Interspeech 2017, 2017

2016
Robust end-to-end deep audiovisual speech recognition.
CoRR, 2016

