Engin Erzin

Orcid: 0000-0002-2715-2368

Affiliations:
  • Koç University, Istanbul, Turkey


According to our database, Engin Erzin authored at least 115 papers between 1993 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
The eHRI database: a multimodal database of engagement in human-robot interactions.
Lang. Resour. Evaluation, September 2023

Use of Affective Visual Information for Summarization of Human-Centric Videos.
IEEE Trans. Affect. Comput., 2023

AffectON: Incorporating Affect Into Dialog Generation.
IEEE Trans. Affect. Comput., 2023

Role of Audio In Video Summarization.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Emotion Dependent Domain Adaptation for Speech Driven Affective Facial Feature Synthesis.
IEEE Trans. Affect. Comput., 2022

Training Socially Engaging Robots: Modeling Backchannel Behaviors with Batch Reinforcement Learning.
IEEE Trans. Affect. Comput., 2022

Role of Audio in Audio-Visual Video Summarization.
CoRR, 2022

HuBERT-TR: Reviving Turkish Automatic Speech Recognition with Self-supervised Speech Representation Learning.
CoRR, 2022

Audience Response Prediction from Textual Context.
CoRR, 2022

Detection of Stride Time and Stance Phase Ratio from Accelerometer Data for Gait Analysis.
Proceedings of the 30th Signal Processing and Communications Applications Conference, 2022

Affective Burst Detection from Speech using Kernel-fusion Dilated Convolutional Neural Networks.
Proceedings of the 30th European Signal Processing Conference, 2022

2021
Domain Adaptation for Food Intake Classification With Teacher/Student Learning.
IEEE Trans. Multim., 2021

Improving phoneme recognition of throat microphone speech recordings using transfer learning.
Speech Commun., 2021

Use of affect context in dyadic interactions for continuous emotion recognition.
Speech Commun., 2021

Investigating Contributions of Speech and Facial Landmarks for Talking Head Generation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August 2021

Engagement Rewarded Actor-Critic with Conservative Q-Learning for Speech-Driven Laughter Backchannel Generation.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

2020
Vocal Tract Contour Tracking in rtMRI Using Deep Temporal Regression Network.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Affective synthesis and animation of arm gestures from speech prosody.
Speech Commun., 2020

Multimodal Continuous Emotion Recognition using Deep Multi-Task Learning with Correlation Loss.
CoRR, 2020

Emotion Dependent Facial Animation from Affective Speech.
Proceedings of the 22nd IEEE International Workshop on Multimedia Signal Processing, 2020

Automatic Vocal Tract Landmark Tracking in rtMRI Using Fully Convolutional Networks and Kalman Filter.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Head Nod Detection in Dyadic Conversations.
Proceedings of the 27th Signal Processing and Communications Applications Conference, 2019

A New Interface for Affective State Estimation and Annotation from Speech.
Proceedings of the 27th Signal Processing and Communications Applications Conference, 2019

Speech Driven Backchannel Generation Using Deep Q-Network for Enhancing Engagement in Human-Robot Interaction.
Proceedings of the Interspeech 2019, 2019

Batch Recurrent Q-Learning for Backchannel Generation Towards Engaging Agents.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction, 2019

2018
On the importance of hidden bias and hidden entropy in representational efficiency of the Gaussian-Bipolar Restricted Boltzmann Machines.
Neural Networks, 2018

A Deep Learning Approach for Data Driven Vocal Tract Area Function Estimation.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Multimodal prediction of head nods in dyadic conversations.
Proceedings of the 26th Signal Processing and Communications Applications Conference, 2018

Food intake detection using autoencoder-based deep neural networks.
Proceedings of the 26th Signal Processing and Communications Applications Conference, 2018

Audio-Visual Prediction of Head-Nod and Turn-Taking Events in Dyadic Interactions.
Proceedings of the Interspeech 2018, 2018

Monitoring Infant's Emotional Cry in Domestic Environments Using the Capsule Network Architecture.
Proceedings of the Interspeech 2018, 2018

Detection of Food Intake Events From Throat Microphone Recordings Using Convolutional Neural Networks.
Proceedings of the 2018 IEEE International Conference on Multimedia & Expo Workshops, 2018

Multifaceted Engagement in Social Interaction with a Machine: The JOKER Project.
Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, 2018

Multimodal Speech Driven Facial Shape Animation Using Deep Neural Networks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Audio-Facial Laughter Detection in Naturalistic Dyadic Conversations.
IEEE Trans. Affect. Comput., 2017

The JESTKOD database: an affective multimodal database of dyadic interactions.
Lang. Resour. Evaluation, 2017

Real-time audiovisual laughter detection.
Proceedings of the 25th Signal Processing and Communications Applications Conference, 2017

Classification of ingestion sounds using Hilbert-Huang transform.
Proceedings of the 25th Signal Processing and Communications Applications Conference, 2017

Empirical Mode Decomposition of Throat Microphone Recordings for Intake Classification.
Proceedings of the 2nd International Workshop on Multimedia for Personal Health and Health Care, 2017

Analysis of Engagement and User Experience with a Laughter Responsive Social Robot.
Proceedings of the Interspeech 2017, 2017

Cross-Subject Continuous Emotion Recognition Using Speech and Body Motion in Dyadic Interactions.
Proceedings of the Interspeech 2017, 2017

Vocal Tract Airway Tissue Boundary Tracking for rtMRI Using Shape and Appearance Priors.
Proceedings of the Interspeech 2017, 2017

Affect recognition from lip articulations.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Use of affect based interaction classification for continuous emotion tracking.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Speech features for telemonitoring of Parkinson's disease symptoms.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

2016
Source and Filter Estimation for Throat-Microphone Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Multimodal analysis of speech and arm motion for prosody-driven synthesis of beat gestures.
Speech Commun., 2016

Food intake classification using throat microphone.
Proceedings of the 24th Signal Processing and Communication Application Conference, 2016

Analysis of JestKOD database using affective state annotations.
Proceedings of the 24th Signal Processing and Communication Application Conference, 2016

Real-time speech driven gesture animation.
Proceedings of the 24th Signal Processing and Communication Application Conference, 2016

Use of Agreement/Disagreement Classification in Dyadic Interactions for Continuous Emotion Recognition.
Proceedings of the Interspeech 2016, 2016

Agreement and disagreement classification of dyadic interactions using vocal and gestural cues.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A subjective listening test of six different artificial bandwidth extension approaches in English, Chinese, German, and Korean.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Affect burst detection using multi-modal cues.
Proceedings of the 2015 23rd Signal Processing and Communications Applications Conference (SIU), 2015

Artificial bandwidth extension of speech excitation.
Proceedings of the 2015 23rd Signal Processing and Communications Applications Conference (SIU), 2015

JESTKOD database: Dyadic interaction analysis.
Proceedings of the 2015 23rd Signal Processing and Communications Applications Conference (SIU), 2015

Synchronous overlap and add of spectra for enhancement of excitation in artificial bandwidth extension of speech.
Proceedings of the INTERSPEECH 2015, 2015

Continuous emotion tracking using total variability space.
Proceedings of the INTERSPEECH 2015, 2015

Affect-expressive hand gestures synthesis and animation.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

2014
Affect burst recognition using multi-modal cues.
Proceedings of the 2014 22nd Signal Processing and Communications Applications Conference (SIU), 2014

A phonetic classification for throat microphone enhancement.
Proceedings of the 2014 22nd Signal Processing and Communications Applications Conference (SIU), 2014

Analysis of interaction attitudes using data-driven hand gesture phrases.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Artificial bandwidth extension of spectral envelope along a Viterbi path.
Speech Commun., 2013

Speech rhythm-driven gesture animation.
Proceedings of the 21st Signal Processing and Communications Applications Conference, 2013

A new statistical excitation mapping for enhancement of throat microphone recordings.
Proceedings of the INTERSPEECH 2013, 2013

Enhancement of throat microphone recordings by learning phone-dependent mappings of speech spectra.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multimodal analysis of speech prosody and upper body gestures using hidden semi-Markov models.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Correction to "Learn2Dance: Learning Statistical Music-to-Dance Mappings for Choreography Synthesis".
IEEE Trans. Multim., 2012

Learn2Dance: Learning Statistical Music-to-Dance Mappings for Choreography Synthesis.
IEEE Trans. Multim., 2012

Evaluation of emotion recognition from speech.
Proceedings of the 20th Signal Processing and Communications Applications Conference, 2012

2011
Formant position based weighted spectral features for emotion recognition.
Speech Commun., 2011

RANSAC-Based Training Data Selection for Speaker State Recognition.
Proceedings of the INTERSPEECH 2011, 2011

Artificial bandwidth extension of spectral envelope with temporal clustering.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
RANSAC-based training data selection for emotion recognition from spontaneous speech.
Proceedings of the 3rd international workshop on Affective interaction in natural environments, 2010

Use of Line Spectral Frequencies for Emotion Recognition from Speech.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Multi-modal analysis of dance performances for music-driven choreography synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

RANSAC-Based Training Data Selection on Spectral Features for Emotion Recognition from Spontaneous Speech.
Proceedings of the Analysis of Verbal and Nonverbal Communication and Enactment. The Processing Issues, 2010

2009
Improving Throat Microphone Speech Recognition by Joint Analysis of Throat and Acoustic Microphone Recordings.
IEEE Trans. Speech Audio Process., 2009

Inter Genre Similarity Modelling For Automatic Music Genre Classification.
CoRR, 2009

Improving automatic emotion recognition from speech signals.
Proceedings of the INTERSPEECH 2009, 2009

2008
Analysis of Head Gesture and Prosody Patterns for Prosody-Driven Head-Gesture Animation.
IEEE Trans. Pattern Anal. Mach. Intell., 2008

An audio-driven dancing avatar.
J. Multimodal User Interfaces, 2008

Unsupervised dance figure analysis from video for dancing Avatar animation.
Proceedings of the International Conference on Image Processing, 2008

Audio-driven human body motion analysis and synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008

Evaluation of audio features for audio-visual analysis of dance figures.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis.
IEEE Trans. Multim., 2007

Automatic Classification of Musical Genres Using Inter-Genre Similarity.
IEEE Signal Process. Lett., 2007

Multicamera Audio-Visual Analysis of Dance Figures.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Estimation and Analysis of Facial Animation Parameter Patterns.
Proceedings of the International Conference on Image Processing, 2007

Prosody-Driven Head-Gesture Animation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Multicamera audio-visual analysis of dance figures using segmented body model.
Proceedings of the 15th European Signal Processing Conference, 2007

2006
Discriminative Analysis of Lip Motion Features for Speaker Identification and Speech-Reading.
IEEE Trans. Image Process., 2006

Multimodal speaker/speech recognition using lip motion, lip texture and audio.
Signal Process., 2006

Multimodal Person Recognition for Human-Vehicle Interaction.
IEEE Multim., 2006

Extracting Gene Regulation Information from Microarray Time-Series Data Using Hidden Markov Models.
Proceedings of the Computer and Information Sciences, 2006

Combined Gesture-Speech Analysis and Speech Driven Gesture Synthesis.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Multimodal Speaker Identification Using Canonical Correlation Analysis.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Multimodal speaker identification using an adaptive classifier cascade based on modality reliability.
IEEE Trans. Multim., 2005

Boosting Classifiers for Music Genre Classification.
Proceedings of the Computer and Information Sciences, 2005

Robust Lip-Motion Features For Speaker Identification.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Lip feature extraction based on audio-visual correlation.
Proceedings of the 13th European Signal Processing Conference, 2005

2004
On optimal selection of lip-motion features for speaker identification.
Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Adaptive classifier cascade for multimodal speaker identification.
Proceedings of the INTERSPEECH 2004, 2004

Discriminative lip-motion features for biometric speaker identification.
Proceedings of the 2004 International Conference on Image Processing, 2004

2003
Multimodal speaker identification with audio-video processing.
Proceedings of the 2003 International Conference on Image Processing, 2003

Joint audio-video processing for biometric speaker identification.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2000
Shaped fixed codebook search for CELP coding at low bit rates.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Teager energy based feature parameters for speech recognition in car noise.
IEEE Signal Process. Lett., 1999

1997
Natural quality variable-rate spectral speech coding below 3.0 kbps.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1995
Line spectral frequency representation of subbands for speech recognition.
Signal Process., 1995

Subband analysis for robust speech recognition in the presence of car noise.
Proceedings of the 1995 International Conference on Acoustics, 1995

Adaptive filtering approaches for non-Gaussian stable processes.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Interframe differential coding of line spectrum frequencies.
IEEE Trans. Speech Audio Process., 1994

Adaptive filtering for non-Gaussian stable processes.
IEEE Signal Process. Lett., 1994

1993
Interframe differential vector coding of line spectrum frequencies.
Proceedings of the IEEE International Conference on Acoustics, 1993
