Abdel-rahman Mohamed

According to our database, Abdel-rahman Mohamed authored at least 96 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild.
CoRR, 2024

Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks.
CoRR, 2024

SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering.
CoRR, 2024

2023
Twenty-Five Years of Evolution in Speech and Language Processing.
IEEE Signal Process. Mag., July, 2023

LegoNN: Building Modular Encoder-Decoder Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

SD-HuBERT: Self-Distillation Induces Syllabic Organization in HuBERT.
CoRR, 2023

Self-Supervised Models of Speech Infer Universal Articulatory Kinematics.
CoRR, 2023

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond.
CoRR, 2023

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS.
CoRR, 2023

AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models.
CoRR, 2023

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model.
CoRR, 2023

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark.
CoRR, 2023

Efficient Speech Representation Learning with Low-Bit Quantization.
CoRR, 2023

Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder.
Proceedings of ArabicNLP 2023, Singapore (Hybrid), December 7, 2023

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities.
Proceedings of the IEEE International Conference on Acoustics, 2023

Do Coarser Units Benefit Cluster Prediction-Based Speech Pre-Training?
Proceedings of the IEEE International Conference on Acoustics, 2023

Continual Learning for On-Device Speech Recognition Using Disentangled Conformers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

D<sup>3</sup>Former: Debiased Dual Distilled Transformer for Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon.
Trans. Assoc. Comput. Linguistics, 2022

Self-Supervised Speech Representation Learning: A Review.
IEEE J. Sel. Top. Signal Process., 2022

Editorial of Special Issue on Self-Supervised Learning for Speech and Audio Processing.
IEEE J. Sel. Top. Signal Process., 2022

Biased Self-supervised learning for ASR.
CoRR, 2022

STOP: A dataset for Spoken Task Oriented Semantic Parsing.
CoRR, 2022

Generative Spoken Dialogue Language Modeling.
CoRR, 2022

textless-lib: a Library for Textless Spoken Language Processing.
CoRR, 2022

Object Detection in Aerial Images: What Improves the Accuracy?
CoRR, 2022

STOP: A Dataset for Spoken Task Oriented Semantic Parsing.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Scaling ASR Improves Zero and Few Shot Learning.
Proceedings of the Interspeech 2022, 2022

Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT.
Proceedings of the Interspeech 2022, 2022

Robust Self-Supervised Audio-Visual Speech Recognition.
Proceedings of the Interspeech 2022, 2022

DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering.
Proceedings of the Interspeech 2022, 2022

Federated Learning with Partial Model Personalization.
Proceedings of the International Conference on Machine Learning, 2022

Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Textless Speech Emotion Conversion using Discrete & Decomposed Representations.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Unified Speech-Text Pre-training for Speech Translation and Recognition.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Text-Free Prosody-Aware Generative Spoken Language Modeling.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Textless Speech Emotion Conversion using Decomposed and Discrete Representations.
CoRR, 2021

SUPERB: Speech Processing Universal PERformance Benchmark.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Unsupervised Cross-Lingual Representation Learning for Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Contrastive Semi-Supervised Learning for ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

HuBERT: How Much Can a Bad Teacher Benefit ASR Pre-Training?
Proceedings of the IEEE International Conference on Acoustics, 2021

Kaizen: Continuously Improving Teacher Using Exponential Moving Average for Semi-Supervised Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.
CoRR, 2020

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Large Scale Weakly and Semi-Supervised Learning for Low-Resource Video ASR.
Proceedings of the Interspeech 2020, 2020

Transformer-Based Acoustic Modeling for Hybrid Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Training ASR Models By Generation of Contextual Information.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Libri-Light: A Benchmark for ASR with Limited or No Supervision.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Effectiveness of Self-Supervised Pre-Training for ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Effectiveness of self-supervised pre-training for speech recognition.
CoRR, 2019

Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models.
CoRR, 2019

Transformers with convolutional context for ASR.
CoRR, 2019

2018
Mechanical Rubbing of Blood Clots Using Helical Robots Under Ultrasound Guidance.
IEEE Robotics Autom. Lett., 2018

Differentiable Greedy Networks.
CoRR, 2018

Direct Optimization of F-Measure for Retrieval-Based Personal Question Answering.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

2017
Mean Actor Critic.
CoRR, 2017

Deep API Programmer: Learning to Program with APIs.
CoRR, 2017

Sequence Modeling via Segmentations.
Proceedings of the 34th International Conference on Machine Learning, 2017

RobustFill: Neural Program Learning under Noisy I/O.
Proceedings of the 34th International Conference on Machine Learning, 2017

Do Deep Convolutional Nets Really Need to be Deep and Convolutional?
Proceedings of the 5th International Conference on Learning Representations, 2017

Neuro-Symbolic Program Synthesis.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
Do Deep Convolutional Nets Really Need to be Deep (Or Even Convolutional)?
CoRR, 2016

Memory-augmented Attention Modelling for Videos.
CoRR, 2016

MSR System Description - TAC 2016 KBP Cold Start Slot Filling Track.
Proceedings of the 2016 Text Analysis Conference, 2016

Analysis of Deep Neural Networks with Extended Data Jacobian Matrix.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Exploring multidimensional LSTMs for large vocabulary ASR.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Deep Convolutional Neural Networks for Large-scale Speech Tasks.
Neural Networks, 2015

Compressing LSTMs into CNNs.
CoRR, 2015

Deep bi-directional recurrent networks over spectral windows.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

LSTM time and frequency recurrence for automatic speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Learning Lexical Embeddings with Syntactic and Lexicographic Knowledge.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Deep Neural Network Acoustic Models for ASR.
PhD thesis, 2014

Convolutional Neural Networks for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Improvements to filterbank and delta learning within a deep neural network framework.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Deep convolutional neural networks for LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2013

Speech recognition with deep recurrent neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Learning filter banks within a deep neural network framework.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

Improvements to Deep Convolutional Neural Networks for LVCSR.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

Hybrid speech recognition with Deep Bidirectional LSTM.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Acoustic Modeling Using Deep Belief Networks.
IEEE Trans. Speech Audio Process., 2012

Multiresolution Deep Belief Networks.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Understanding how Deep Belief Networks perform acoustic modelling.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Deep Belief Networks using discriminative features for phone recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Making Deep Belief Networks effective for large vocabulary continuous speech recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Investigation of full-sequence training of deep belief networks for speech recognition.
Proceedings of the INTERSPEECH 2010, 2010

Binary coding of speech spectrograms using a deep auto-encoder.
Proceedings of the INTERSPEECH 2010, 2010

Phone recognition using Restricted Boltzmann Machines.
Proceedings of the IEEE International Conference on Acoustics, 2010

