Stavros Petridis

According to our database, Stavros Petridis authored at least 87 papers between 2006 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Hearing Loss Detection from Facial Expressions in One-on-one Conversations.
CoRR, 2024

Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

2023
KAN-AV dataset for audio-visual face and speech analysis in the wild.
Image Vis. Comput., December, 2023

Self-Supervised Video-Centralised Transformer for Video Face Clustering.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

End-to-End Video-to-Speech Synthesis Using Generative Adversarial Networks.
IEEE Trans. Cybern., June, 2023

Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition?
IEEE Trans. Affect. Comput., 2023

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch.
CoRR, 2023

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition.
CoRR, 2023

Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models.
CoRR, 2023

Is dataset condensation a silver bullet for healthcare data sharing?
CoRR, 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.
CoRR, 2023

Jointly Learning Visual and Auditory Speech Representations from Raw Data.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning Cross-Lingual Visual Speech Representations.
Proceedings of the IEEE International Conference on Acoustics, 2023

LA-VOCE: LOW-SNR Audio-Visual Speech Enhancement Using Neural Vocoders.
Proceedings of the IEEE International Conference on Acoustics, 2023

Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels.
Proceedings of the IEEE International Conference on Acoustics, 2023

SS-VAERR: Self-Supervised Apparent Emotional Reaction Recognition from Video.
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Laughing Matters: Introducing Audio-Driven Laughing-Face Generation with Diffusion Models.
Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022
Visual speech recognition for multiple languages in the wild.
Nat. Mac. Intell., November, 2022

Streaming Audio-Visual Speech Recognition with Alignment Regularization.
CoRR, 2022

SVTS: Scalable Video-to-Speech Synthesis.
Proceedings of the Interspeech 2022, 2022

Training Strategies for Improved Lip-Reading.
Proceedings of the IEEE International Conference on Acoustics, 2022

Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Domain Generalisation for Apparent Emotional Facial Expression Recognition across Age-Groups.
CoRR, 2021

Lip-reading with Densely Connected Temporal Convolutional Networks.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

LiRA: Learning Visual Speech Representations from Audio Through Self-Supervision.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

DINO: A Conditional Energy-Based GAN for Domain Translation.
Proceedings of the 9th International Conference on Learning Representations, 2021

End-To-End Audio-Visual Speech Recognition with Conformers.
Proceedings of the IEEE International Conference on Acoustics, 2021

Detecting Adversarial Attacks on Audiovisual Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Towards Practical Lipreading with Distilled and Efficient Models.
Proceedings of the IEEE International Conference on Acoustics, 2021

Lips Don't Lie: A Generalisable and Robust Approach To Face Forgery Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Blind Audio-Visual Localization and Separation via Low-Rank and Sparsity.
IEEE Trans. Cybern., 2020

End-to-end visual speech recognition for small-scale datasets.
Pattern Recognit. Lett., 2020

Realistic Speech-Driven Facial Animation with GANs.
Int. J. Comput. Vis., 2020

Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision.
CoRR, 2020

Does Visual Self-Supervision Improve Learning of Speech Representations?
CoRR, 2020

Domain Adversarial Neural Networks for Dysarthric Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Visually Guided Self Supervised Learning of Speech Representations.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Lipreading Using Temporal Convolutional Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speech-Driven Facial Animation Using Polynomial Fusion of Features.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Towards Pose-Invariant Lip-Reading.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
A real-time and unsupervised face re-identification system for human-robot interaction.
Pattern Recognit. Lett., 2019

Detecting Adversarial Attacks On Audio-Visual Speech Recognition.
CoRR, 2019

Video-Driven Speech Reconstruction Using Generative Adversarial Networks.
Proceedings of the Interspeech 2019, 2019

Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition.
Proceedings of the Interspeech 2019, 2019

End-to-End Speech-Driven Realistic Facial Animation with Temporal GANs.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

2018
Transfer Learning for Action Unit Recognition.
CoRR, 2018

Audio-Visual Speech Recognition with a Hybrid CTC/Attention Architecture.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

End-to-End Audiovisual Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Visual-Only Recognition of Normal, Whispered and Silent Speech.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Online Attention for Interpretable Conflict Estimation in Political Debates.
Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, 2018

End-to-End Speech-Driven Facial Animation with Temporal GANs.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
Local Deep Neural Networks for Age and Gender Classification.
CoRR, 2017

Audio-visual object localization and separation using low-rank and sparsity.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

End-to-end visual speech recognition with LSTMs.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

End-to-End Multi-View Lipreading.
Proceedings of the British Machine Vision Conference 2017, 2017

End-to-End Audiovisual Fusion with LSTMs.
Proceedings of the Auditory-Visual Speech Processing, 2017

2016
Discrimination Between Native and Non-Native Speech Using Visual Features Only.
IEEE Trans. Cybern., 2016

Prediction-Based Audiovisual Fusion for Classification of Non-Linguistic Vocalisations.
IEEE Trans. Affect. Comput., 2016

Multi-modal Neural Conditional Ordinal Random Fields for agreement level estimation.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Deep complementary bottleneck features for visual speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
The MAHNOB Mimicry Database: A database of naturalistic human interactions.
Pattern Recognit. Lett., 2015

Comparison of single-model and multiple-model prediction-based audiovisual fusion.
Proceedings of the Auditory-Visual Speech Processing, 2015

Neural conditional ordinal random fields for agreement level estimation.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
Discriminating Native from Non-Native Speech Using Fusion of Visual Cues.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Visual-only discrimination between native and non-native speech.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
The MAHNOB Laughter database.
Image Vis. Comput., 2013

Bimodal log-linear regression for fusion of audio and visual features.
Proceedings of the ACM Multimedia Conference, 2013

Audiovisual Detection of Laughter in Human-Machine Interaction.
Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013

Audiovisual Detection of Behavioural Mimicry.
Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013

2012
Audiovisual discrimination between laughter and speech.
PhD thesis, 2012

Comparison of prediction-based fusion and feature-level fusion across different learning models.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Audiovisual vocal outburst classification in noisy acoustic conditions.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Audiovisual Discrimination Between Speech and Laughter: Why and When Visual Information Might Help.
IEEE Trans. Multim., 2011

Audiovisual classification of vocal outbursts in human conversation using Long-Short-Term Memory networks.
Proceedings of the IEEE International Conference on Acoustics, 2011

Prediction-based classification for audiovisual discrimination between laughter and speech.
Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

2010
Classifying laughter and speech using audio-visual feature prediction.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Static vs. dynamic modeling of human nonverbal behavior from multiple cues and modalities.
Proceedings of the 11th International Conference on Multimodal Interfaces, 2009

Is this joke really funny? judging the mirth by audiovisual laughter analysis.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

2008
Learning to Detect Aircraft at Low Resolutions.
Proceedings of the Computer Vision Systems, 6th International Conference, 2008

Audiovisual laughter detection based on temporal features.
Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

Audiovisual discrimination between laughter and speech.
Proceedings of the IEEE International Conference on Acoustics, 2008

Fusion of audio and visual cues for laughter detection.
Proceedings of the 7th ACM International Conference on Image and Video Retrieval, 2008

2007
Machine learned regression for abductive DNA sequencing.
Proceedings of the Sixth International Conference on Machine Learning and Applications, 2007

Decoding Trace Peak Behaviour - A Neuro-Fuzzy Approach.
Proceedings of the FUZZ-IEEE 2007, 2007

2006
Construction of Neural Network Based Lyapunov Functions.
Proceedings of the International Joint Conference on Neural Networks, 2006

Machine Learning in Basecalling Decoding Trace Peak Behaviour.
Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 2006

