Engin Erzin

Orcid: 0000-0002-2715-2368

Affiliations:
  • Koç University, Istanbul, Turkey


According to our database, Engin Erzin authored at least 115 papers between 1993 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
The eHRI database: a multimodal database of engagement in human-robot interactions.
Lang. Resour. Evaluation, September 2023

Use of Affective Visual Information for Summarization of Human-Centric Videos.
IEEE Trans. Affect. Comput., 2023

AffectON: Incorporating Affect Into Dialog Generation.
IEEE Trans. Affect. Comput., 2023

Role of Audio In Video Summarization.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Emotion Dependent Domain Adaptation for Speech Driven Affective Facial Feature Synthesis.
IEEE Trans. Affect. Comput., 2022

Training Socially Engaging Robots: Modeling Backchannel Behaviors with Batch Reinforcement Learning.
IEEE Trans. Affect. Comput., 2022

Role of Audio in Audio-Visual Video Summarization.
CoRR, 2022

HuBERT-TR: Reviving Turkish Automatic Speech Recognition with Self-supervised Speech Representation Learning.
CoRR, 2022

Audience Response Prediction from Textual Context.
CoRR, 2022

Detection of Stride Time and Stance Phase Ratio from Accelerometer Data for Gait Analysis.
Proceedings of the 30th Signal Processing and Communications Applications Conference, 2022

Affective Burst Detection from Speech using Kernel-fusion Dilated Convolutional Neural Networks.
Proceedings of the 30th European Signal Processing Conference, 2022

2021
Domain Adaptation for Food Intake Classification With Teacher/Student Learning.
IEEE Trans. Multim., 2021

Improving phoneme recognition of throat microphone speech recordings using transfer learning.
Speech Commun., 2021

Use of affect context in dyadic interactions for continuous emotion recognition.
Speech Commun., 2021

Investigating Contributions of Speech and Facial Landmarks for Talking Head Generation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August 2021

Engagement Rewarded Actor-Critic with Conservative Q-Learning for Speech-Driven Laughter Backchannel Generation.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

2020
Vocal Tract Contour Tracking in rtMRI Using Deep Temporal Regression Network.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Affective synthesis and animation of arm gestures from speech prosody.
Speech Commun., 2020

Multimodal Continuous Emotion Recognition using Deep Multi-Task Learning with Correlation Loss.
CoRR, 2020

Emotion Dependent Facial Animation from Affective Speech.
Proceedings of the 22nd IEEE International Workshop on Multimedia Signal Processing, 2020

Automatic Vocal Tract Landmark Tracking in rtMRI Using Fully Convolutional Networks and Kalman Filter.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Head Nod Detection in Dyadic Conversations.
Proceedings of the 27th Signal Processing and Communications Applications Conference, 2019

A New Interface for Affective State Estimation and Annotation from Speech.
Proceedings of the 27th Signal Processing and Communications Applications Conference, 2019

Speech Driven Backchannel Generation Using Deep Q-Network for Enhancing Engagement in Human-Robot Interaction.
Proceedings of the Interspeech 2019, 2019

Batch Recurrent Q-Learning for Backchannel Generation Towards Engaging Agents.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction, 2019

2018
On the importance of hidden bias and hidden entropy in representational efficiency of the Gaussian-Bipolar Restricted Boltzmann Machines.
Neural Networks, 2018

A Deep Learning Approach for Data Driven Vocal Tract Area Function Estimation.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Multimodal prediction of head nods in dyadic conversations.
Proceedings of the 26th Signal Processing and Communications Applications Conference, 2018

Food intake detection using autoencoder-based deep neural networks.
Proceedings of the 26th Signal Processing and Communications Applications Conference, 2018

Audio-Visual Prediction of Head-Nod and Turn-Taking Events in Dyadic Interactions.
Proceedings of the Interspeech 2018, 2018

Monitoring Infant's Emotional Cry in Domestic Environments Using the Capsule Network Architecture.
Proceedings of the Interspeech 2018, 2018

Detection of Food Intake Events From Throat Microphone Recordings Using Convolutional Neural Networks.
Proceedings of the 2018 IEEE International Conference on Multimedia & Expo Workshops, 2018

Multifaceted Engagement in Social Interaction with a Machine: The JOKER Project.
Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, 2018

Multimodal Speech Driven Facial Shape Animation Using Deep Neural Networks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Audio-Facial Laughter Detection in Naturalistic Dyadic Conversations.
IEEE Trans. Affect. Comput., 2017

The JESTKOD database: an affective multimodal database of dyadic interactions.
Lang. Resour. Evaluation, 2017

Real-time audiovisual laughter detection.
Proceedings of the 25th Signal Processing and Communications Applications Conference, 2017

Classification of ingestion sounds using Hilbert-Huang transform.
Proceedings of the 25th Signal Processing and Communications Applications Conference, 2017

Empirical Mode Decomposition of Throat Microphone Recordings for Intake Classification.
Proceedings of the 2nd International Workshop on Multimedia for Personal Health and Health Care, 2017

Analysis of Engagement and User Experience with a Laughter Responsive Social Robot.
Proceedings of the Interspeech 2017, 2017

Cross-Subject Continuous Emotion Recognition Using Speech and Body Motion in Dyadic Interactions.
Proceedings of the Interspeech 2017, 2017

Vocal Tract Airway Tissue Boundary Tracking for rtMRI Using Shape and Appearance Priors.
Proceedings of the Interspeech 2017, 2017

Affect recognition from lip articulations.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Use of affect based interaction classification for continuous emotion tracking.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Speech features for telemonitoring of Parkinson's disease symptoms.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

2016
Source and Filter Estimation for Throat-Microphone Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Multimodal analysis of speech and arm motion for prosody-driven synthesis of beat gestures.
Speech Commun., 2016

Food intake classification using throat microphone.
Proceedings of the 24th Signal Processing and Communication Application Conference, 2016

Analysis of JestKOD database using affective state annotations.
Proceedings of the 24th Signal Processing and Communication Application Conference, 2016

Real-time speech driven gesture animation.
Proceedings of the 24th Signal Processing and Communication Application Conference, 2016

Use of Agreement/Disagreement Classification in Dyadic Interactions for Continuous Emotion Recognition.
Proceedings of the Interspeech 2016, 2016

Agreement and disagreement classification of dyadic interactions using vocal and gestural cues.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A subjective listening test of six different artificial bandwidth extension approaches in English, Chinese, German, and Korean.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Affect burst detection using multi-modal cues.
Proceedings of the 2015 23rd Signal Processing and Communications Applications Conference (SIU), 2015

Artificial bandwidth extension of speech excitation.
Proceedings of the 2015 23rd Signal Processing and Communications Applications Conference (SIU), 2015

JESTKOD database: Dyadic interaction analysis.
Proceedings of the 2015 23rd Signal Processing and Communications Applications Conference (SIU), 2015

Synchronous overlap and add of spectra for enhancement of excitation in artificial bandwidth extension of speech.
Proceedings of the INTERSPEECH 2015, 2015

Continuous emotion tracking using total variability space.
Proceedings of the INTERSPEECH 2015, 2015

Affect-expressive hand gestures synthesis and animation.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

2014
Affect burst recognition using multi-modal cues.
Proceedings of the 2014 22nd Signal Processing and Communications Applications Conference (SIU), 2014

A phonetic classification for throat microphone enhancement.
Proceedings of the 2014 22nd Signal Processing and Communications Applications Conference (SIU), 2014

Analysis of interaction attitudes using data-driven hand gesture phrases.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Artificial bandwidth extension of spectral envelope along a Viterbi path.
Speech Commun., 2013

Speech rhythm-driven gesture animation.
Proceedings of the 21st Signal Processing and Communications Applications Conference, 2013

A new statistical excitation mapping for enhancement of throat microphone recordings.
Proceedings of the INTERSPEECH 2013, 2013

Enhancement of throat microphone recordings by learning phone-dependent mappings of speech spectra.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multimodal analysis of speech prosody and upper body gestures using hidden semi-Markov models.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Correction to "Learn2Dance: Learning Statistical Music-to-Dance Mappings for Choreography Synthesis".
IEEE Trans. Multim., 2012

Learn2Dance: Learning Statistical Music-to-Dance Mappings for Choreography Synthesis.
IEEE Trans. Multim., 2012

Evaluation of emotion recognition from speech.
Proceedings of the 20th Signal Processing and Communications Applications Conference, 2012

2011
Formant position based weighted spectral features for emotion recognition.
Speech Commun., 2011

RANSAC-Based Training Data Selection for Speaker State Recognition.
Proceedings of the INTERSPEECH 2011, 2011

Artificial bandwidth extension of spectral envelope with temporal clustering.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
RANSAC-based training data selection for emotion recognition from spontaneous speech.
Proceedings of the 3rd international workshop on Affective interaction in natural environments, 2010

Use of Line Spectral Frequencies for Emotion Recognition from Speech.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Multi-modal analysis of dance performances for music-driven choreography synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

RANSAC-Based Training Data Selection on Spectral Features for Emotion Recognition from Spontaneous Speech.
Proceedings of the Analysis of Verbal and Nonverbal Communication and Enactment. The Processing Issues, 2010

2009
Improving Throat Microphone Speech Recognition by Joint Analysis of Throat and Acoustic Microphone Recordings.
IEEE Trans. Speech Audio Process., 2009

Inter Genre Similarity Modelling For Automatic Music Genre Classification.
CoRR, 2009

Improving automatic emotion recognition from speech signals.
Proceedings of the INTERSPEECH 2009, 2009

2008
Analysis of Head Gesture and Prosody Patterns for Prosody-Driven Head-Gesture Animation.
IEEE Trans. Pattern Anal. Mach. Intell., 2008

An audio-driven dancing avatar.
J. Multimodal User Interfaces, 2008

Unsupervised dance figure analysis from video for dancing Avatar animation.
Proceedings of the International Conference on Image Processing, 2008

Audio-driven human body motion analysis and synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008

Evaluation of audio features for audio-visual analysis of dance figures.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis.
IEEE Trans. Multim., 2007

Automatic Classification of Musical Genres Using Inter-Genre Similarity.
IEEE Signal Process. Lett., 2007

Multicamera Audio-Visual Analysis of Dance Figures.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Estimation and Analysis of Facial Animation Parameter Patterns.
Proceedings of the International Conference on Image Processing, 2007

Prosody-Driven Head-Gesture Animation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Multicamera audio-visual analysis of dance figures using segmented body model.
Proceedings of the 15th European Signal Processing Conference, 2007

2006
Discriminative Analysis of Lip Motion Features for Speaker Identification and Speech-Reading.
IEEE Trans. Image Process., 2006

Multimodal speaker/speech recognition using lip motion, lip texture and audio.
Signal Process., 2006

Multimodal Person Recognition for Human-Vehicle Interaction.
IEEE Multim., 2006

Extracting Gene Regulation Information from Microarray Time-Series Data Using Hidden Markov Models.
Proceedings of the Computer and Information Sciences, 2006

Combined Gesture-Speech Analysis and Speech Driven Gesture Synthesis.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Multimodal Speaker Identification Using Canonical Correlation Analysis.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Multimodal speaker identification using an adaptive classifier cascade based on modality reliability.
IEEE Trans. Multim., 2005

Boosting Classifiers for Music Genre Classification.
Proceedings of the Computer and Information Sciences, 2005

Robust Lip-Motion Features For Speaker Identification.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Lip feature extraction based on audio-visual correlation.
Proceedings of the 13th European Signal Processing Conference, 2005

2004
On optimal selection of lip-motion features for speaker identification.
Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Adaptive classifier cascade for multimodal speaker identification.
Proceedings of the INTERSPEECH 2004, 2004

Discriminative lip-motion features for biometric speaker identification.
Proceedings of the 2004 International Conference on Image Processing, 2004

2003
Multimodal speaker identification with audio-video processing.
Proceedings of the 2003 International Conference on Image Processing, 2003

Joint audio-video processing for biometric speaker identification.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2000
Shaped fixed codebook search for CELP coding at low bit rates.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Teager energy based feature parameters for speech recognition in car noise.
IEEE Signal Process. Lett., 1999

1997
Natural quality variable-rate spectral speech coding below 3.0 kbps.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1995
Line spectral frequency representation of subbands for speech recognition.
Signal Process., 1995

Subband analysis for robust speech recognition in the presence of car noise.
Proceedings of the 1995 International Conference on Acoustics, 1995

Adaptive filtering approaches for non-Gaussian stable processes.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Interframe differential coding of line spectrum frequencies.
IEEE Trans. Speech Audio Process., 1994

Adaptive filtering for non-Gaussian stable processes.
IEEE Signal Process. Lett., 1994

1993
Interframe differential vector coding of line spectrum frequencies.
Proceedings of the IEEE International Conference on Acoustics, 1993
