Thierry Dutoit

Orcid: 0000-0001-7024-2150

According to our database, Thierry Dutoit authored at least 225 papers between 1993 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Latent Space Interpolation of Synthesizer Parameters Using Timbre-Regularized Auto-Encoders.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

TIPAA-SSL: Text Independent Phone-to-Audio Alignment based on Self-Supervised Learning and Knowledge Transfer.
CoRR, 2024

Self-Avatar Animation in Virtual Reality: Impact of Motion Signals Artifacts on the Full-Body Pose Reconstruction.
CoRR, 2024

Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice.
Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition, 2024

2023
A Recipe for Efficient SBIR Models: Combining Relative Triplet Loss with Batch Normalization and Knowledge Distillation.
CoRR, 2023

Developing an Interactive Agent for Blind and Visually Impaired People.
Proceedings of the 2023 ACM International Conference on Interactive Media Experiences, 2023

Validating Objective Evaluation Metric: Is Fréchet Motion Distance able to Capture Foot Skating Artifacts?
Proceedings of the 2023 ACM International Conference on Interactive Media Experiences, 2023

Objective Evaluation Metric for Motion Generative Models: Validating Fréchet Motion Distance on Foot Skating and Over-smoothing Artifacts.
Proceedings of the 16th ACM SIGGRAPH Conference on Motion, Interaction and Games, 2023

The Limitations of Current Similarity-Based Objective Metrics in the Context of Human-Agent Interaction Applications.
Proceedings of the International Conference on Multimodal Interaction, 2023

Synthesizer Preset Interpolation Using Transformer Auto-Encoders.
Proceedings of the IEEE International Conference on Acoustics, 2023

Deep Learning-Based Stereo Camera Multi-Video Synchronization.
Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Avatar's Animation in VR: Extending Sparse Motion Features with Cartesian Coordinates in Transformer-based Model.
Proceedings of the 34th British Machine Vision Conference Workshop Proceedings, 2023

Cardiotocography Signal Abnormality Detection Based on Deep Semi-Unsupervised Learning.
Proceedings of the IEEE/ACM 10th International Conference on Big Data Computing, 2023

2022
PhyDAA: Physiological Dataset Assessing Attention.
IEEE Trans. Circuits Syst. Video Technol., 2022

Where Is My Mind (Looking at)? A Study of the EEG-Visual Attention Relationship.
Informatics, 2022

Transformers and CNNs both Beat Humans on SBIR.
CoRR, 2022

Analysis of Co-Laughter Gesture Relationship on RGB videos in Dyadic Conversation Context.
CoRR, 2022

Where Is My Mind (looking at)? Predicting Visual Attention from Brain Activity.
CoRR, 2022

Evaluating the Quality of a Synthesized Motion with the Fréchet Motion Distance.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Posters, Vancouver BC Canada, August 7, 2022

Spatio-Temporal Analysis of Transformer based Architecture for Attention Estimation from EEG.
Proceedings of the 26th International Conference on Pattern Recognition, 2022

Towards Lightweight Neural Animation: Exploration of Neural Network Pruning in Mixture of Experts-based Animation Models.
Proceedings of the 17th International Joint Conference on Computer Vision, 2022

A Saliency based Feature Fusion Model for EEG Emotion Estimation.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Towards Human Performance on Sketch-Based Image Retrieval.
Proceedings of the CBMI 2022: International Conference on Content-based Multimedia Indexing, Graz, Austria, September 14, 2022

A New Perspective on Smiling and Laughter Detection: Intensity Levels Matter.
Proceedings of the 10th International Conference on Affective Computing and Intelligent Interaction, 2022

2021
ICE-Talk 2: Interface for Controllable Expressive TTS with perceptual assessment tool.
Softw. Impacts, 2021

Behavior and usability analysis for multimodal user interfaces.
J. Multimodal User Interfaces, 2021

Analysis and Assessment of Controllability of an Expressive Deep Learning-Based TTS System.
Informatics, 2021

Improving Synthesizer Programming From Variational Autoencoders Latent Space.
Proceedings of the 24th International Conference on Digital Audio Effects, 2021

2020
Depth prediction from 2D images: A taxonomy and an evaluation study.
Image Vis. Comput., 2020

Detection and identification of European woodpeckers with deep convolutional neural networks.
Ecol. Informatics, 2020

Excitation-based Voice Quality Analysis and Modification.
CoRR, 2020

VERA: Virtual Environments Recording Attention.
Proceedings of the 8th IEEE International Conference on Serious Games and Applications for Health, 2020

Analytic vs. holistic approaches for the live search of sound presets using graphical interpolation.
Proceedings of the 20th International Conference on New Interfaces for Musical Expression, 2020

Laughter Synthesis: Combining Seq2seq Modeling with Transfer Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ICE-Talk: An Interface for a Controllable Expressive Talking Machine.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Neural Speech Synthesis with Style Intensity Interpolation: A Perceptual Analysis.
Proceedings of the Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 2020

Unsupervised depth prediction from monocular sequences: Improving performances through instance segmentation.
Proceedings of the 17th Conference on Computer and Robot Vision, 2020

An Experimental Study of the Impact of Pre-Training on the Pruning of a Convolutional Neural Network.
Proceedings of the APPIS 2020: 3rd International Conference on Applications of Intelligent Systems, 2020

Attention Estimation in Virtual Reality with EEG based Image Regression.
Proceedings of the IEEE International Conference on Artificial Intelligence and Virtual Reality, 2020

2019
The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach.
CoRR, 2019

Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis Through Audio Analysis.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Emotional Speech Datasets for English Speech Synthesis Purpose: A Review.
Proceedings of the Intelligent Systems and Applications, 2019

Exploring Transfer Learning for Low Resource Emotional TTS.
Proceedings of the Intelligent Systems and Applications, 2019

Leveraging Pre-trained CNN Models for Skeleton-Based Action Recognition.
Proceedings of the Computer Vision Systems, 12th International Conference, 2019

An End-to-end Network to Synthesize Intonation Using a Generalized Command Response Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

An Open-Source Avatar for Real-Time Human-Agent Interaction Applications.
Proceedings of the 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, 2019

2018
HMM-based generation of laughter facial expression.
Speech Commun., 2018

The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems.
CoRR, 2018

ASR-based Features for Emotion Recognition: A Transfer Learning Approach.
CoRR, 2018

Investigating a Hybrid Learning Approach for Robust Automatic Speech Recognition.
Proceedings of the Statistical Language and Speech Processing, 2018

People Groups Analysis for AR Applications.
Proceedings of the International Conference on 3D Immersion, 2018

A Dyadic Conversation Dataset on Moral Emotions.
Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, 2018

2017
3D skeleton-based action recognition by representing motion capture sequences as 2D-RGB images.
Comput. Animat. Virtual Worlds, 2017

Amused speech components analysis and classification: Towards an amusement arousal level assessment system.
Comput. Electr. Eng., 2017

Noise and Speech Estimation as Auxiliary Tasks for Robust Speech Recognition.
Proceedings of the Statistical Language and Speech Processing, 2017

Introducing AmuS: The Amused Speech Database.
Proceedings of the Statistical Language and Speech Processing, 2017

Morphology Independent Feature Engineering in Motion Capture Database for Gesture Evaluation.
Proceedings of the 4th International Conference on Movement Computing, 2017

Investigating the impact of the training data volume for robust speech recognition using multi-task learning.
Proceedings of the 2017 IEEE International Symposium on Signal Processing and Information Technology, 2017

Portable C++ Framework for Low-Latency Musical Touch Interaction with Geometrical Shapes.
Proceedings of the 2017 International Computer Music Conference, 2017

2016
Laughter Research: A Review of the ILHAIRE Project.
Proceedings of the Toward Robotic Socially Believable Behaving Systems - Volume I, 2016

Identification of European woodpecker species in audio recordings from their drumming rolls.
Ecol. Informatics, 2016

I-Vector estimation as auxiliary task for Multi-Task Learning based acoustic modeling for automatic speech recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

AVAB-DBS: an Audio-Visual Affect Bursts Database for Synthesis.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Speaker-aware Multi-Task Learning for automatic speech recognition.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Towards a listening agent: a system generating audiovisual laughs and smiles to show interest.
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

Surgery of Speech Synthesis Models to Overcome the Scarcity of Training Data.
Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2016

Speaker-aware long short-term memory multi-task learning for speech recognition.
Proceedings of the 24th European Signal Processing Conference, 2016

Audio affect burst synthesis: A multilevel synthesis system for emotional expressions.
Proceedings of the 24th European Signal Processing Conference, 2016

Multi-task learning for speech recognition: an overview.
Proceedings of the 24th European Symposium on Artificial Neural Networks, 2016

InspectorWidget: A System to Analyze Users Behaviors in Their Applications.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

A Semantic and Content-Based Search User Interface for Browsing Large Collections of Foley Sounds.
Proceedings of the Audio Mostly 2016, Norrköping, Sweden, October 4-6, 2016, 2016

2015
A taxonomy of camera calibration and video projection correction methods.
EAI Endorsed Trans. Creative Technol., 2015

An evaluation criterion of saliency models for video seam carving.
Proceedings of the 17th IEEE International Workshop on Multimedia Signal Processing, 2015

Adaptation procedure for HMM-based sensor-dependent gesture recognition.
Proceedings of the 8th ACM SIGGRAPH Conference on Motion in Games, 2015

UMons at MediaEval 2015 Affective Impact of Movies Task including Violent Scenes Detection.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

An HMM approach for synthesizing amused speech with a controllable intensity of smile.
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2015

Towards a level assessment system of amusement in speech signals: Amused speech components classification.
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2015

Speech-laughs: An HMM-based approach for amused speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Synchronization rules for HMM-based audio-visual laughter synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Shaking and speech-smile vowels classification: An attempt at amusement arousal estimation from speech signals.
Proceedings of the 2015 IEEE Global Conference on Signal and Information Processing, 2015

An HMM-based speech-smile synthesis system: An approach for amusement synthesis.
Proceedings of the 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, 2015

Breath and repeat: An attempt at enhancing speech-laugh synthesis quality.
Proceedings of the 23rd European Signal Processing Conference, 2015

Video saliency based on rarity prediction: Hyperaptor.
Proceedings of the 23rd European Signal Processing Conference, 2015

Investigating sparse deep neural networks for speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

GMM-based synchronization rules for HMM-based audio-visual laughter synthesis.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
Arousal-Driven Synthesis of Laughter.
IEEE J. Sel. Top. Signal Process., 2014

Automatic Variation of the Degree of Articulation in New HMM-Based Voices.
IEEE J. Sel. Top. Signal Process., 2014

HMM-based speech synthesis with various degrees of articulation: A perceptual study.
Neurocomputing, 2014

Speech polarity determination: A comparative evaluation.
Neurocomputing, 2014

Analysis and HMM-based synthesis of hypo and hyperarticulated speech.
Comput. Speech Lang., 2014

Tangible needle, digital haystack: tangible interfaces for reusing media content organized by similarity.
Proceedings of the Eighth International Conference on Tangible, 2014

Scenarizing CADastre Exquisse: A Crossover between Snoezeling in Hospitals/Domes, and Authoring/Experiencing Soundful Comic Strips.
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014

Scenarizing Metropolitan Views: FlanoGraphing the Urban Spaces.
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014

The AV-LASYN Database: A synchronous corpus of audio and 3D facial marker data for audio-visual laughter synthesis.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

A Proximity Grid Optimization Method to Improve Audio Search for Sound Design.
Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014

elite-HTS: a NLP tool for French HMM-based speech synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Evaluation of HMM-based visual laughter synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014

Parametric representation for singing voice synthesis: A comparative evaluation.
Proceedings of the IEEE International Conference on Acoustics, 2014

AudioMetro: directing search for sound designers through content-based cues.
Proceedings of the Audio Mostly 2014, AM '14, 2014

2013
Objective Study of Sensor Relevance for Automatic Cough Detection.
IEEE J. Biomed. Health Informatics, 2013

RARE2012: A multi-scale rarity-based saliency detection with its comparative statistical analysis.
Signal Process. Image Commun., 2013

A study of parameters affecting visual saliency assessment.
CoRR, 2013

Detecting Speech Polarity with High-Order Statistics.
Cogn. Comput., 2013

Mage - HMM-based speech synthesis reactively controlled by the articulators.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Mage - reactive articulatory feature control of HMM-based parametric speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

MAGE 2.0: New Features and its Application in the Development of a Talking Guitar.
Proceedings of the 13th International Conference on New Interfaces for Musical Expression, 2013

VideoCycle: User-Friendly Navigation by Similarity in Video Databases.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

MAGEFACE: Performative Conversion of Facial Characteristics into Speech Synthesis Parameters.
Proceedings of the Intelligent Technologies for Interactive Entertainment, 2013

Stylistic Walk Synthesis Based on Fourier Decomposition.
Proceedings of the Intelligent Technologies for Interactive Entertainment, 2013

MashtaCycle: On-Stage Improvised Audio Collage by Content-Based Similarity and Gesture Recognition.
Proceedings of the Intelligent Technologies for Interactive Entertainment, 2013

A quantitative comparison of glottal closure instant estimation algorithms on a large variety of singing sounds.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Reactive accent interpolation through an interactive map application.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Biologically plausible context recognition algorithms.
Proceedings of the IEEE International Conference on Image Processing, 2013

Spatio-temporal saliency based on rare model.
Proceedings of the IEEE International Conference on Image Processing, 2013

Saliency and Human Fixations: State-of-the-Art and Study of Comparison Metrics.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Evaluation of HMM-based laughter synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

A comparative study of pitch extraction algorithms on a large variety of singing sounds.
Proceedings of the IEEE International Conference on Acoustics, 2013

A quantitative comparison of the most sophisticated EOG-based eye movement recognition techniques.
Proceedings of the 2013 IEEE Symposium on Computational Intelligence, 2013

Automatic Phonetic Transcription of Laughter and Its Application to Laughter Synthesis.
Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013

2012
Continuous Control of Style and Style Transitions through Linear Interpolation in Hidden Markov Model Based Walk Synthesis.
Trans. Comput. Sci., 2012

Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review.
IEEE Trans. Speech Audio Process., 2012

The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications.
IEEE Trans. Speech Audio Process., 2012

Stylistic gait synthesis based on hidden Markov models.
EURASIP J. Adv. Signal Process., 2012

A comparative study of glottal source estimation techniques.
Comput. Speech Lang., 2012

Statistical methods for varying the degree of articulation in new HMM-based voices.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Reactive and continuous control of HMM-based speech synthesis.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

LoopJam: turning the dance floor into a collaborative instrumental map.
Proceedings of the 12th International Conference on New Interfaces for Musical Expression, 2012

MAGE - A Platform for Tangible Speech Synthesis.
Proceedings of the 12th International Conference on New Interfaces for Musical Expression, 2012

Walker Speed Adaptation in Gait Synthesis.
Proceedings of the Motion in Games - 5th International Conference, 2012

Audio and Contact Microphones for Cough Detection.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Are current gait-related artifact removal techniques useful for low-complexity BCIs?
Proceedings of the 2012 International Joint Conference on Neural Networks (IJCNN), 2012

Rare: A new bottom-up saliency model.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

A subjective assessment of a P300 BCI system for lower-limb rehabilitation purposes.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

EEG and Human Locomotion - Descending Commands and Sensory Feedback should be Disentangled from Artifacts Thanks to New Experimental Protocols Position Paper.
Proceedings of the BIOSIGNALS 2012, 2012

Dynamic Saliency Models and Human Attention: A Comparative Study on Videos.
Proceedings of the Computer Vision - ACCV 2012, 2012

2011
Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation.
Speech Commun., 2011

Enriching the user experience with multimodal interfaces.
J. Multimodal User Interfaces, 2011

Optimizing the Performances of a P300-Based Brain-Computer Interface in Ambulatory Conditions.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

Adaptive training of hidden Markov models for stylistic walk synthesis.
Proceedings of the International Conference on Computer Graphics and Interactive Techniques, 2011

Perceptual Effects of the Degree of Articulation in HMM-Based Speech Synthesis.
Proceedings of the Advances in Nonlinear Speech Processing, 2011

Oscillating Statistical Moments for Speech Polarity Detection.
Proceedings of the Advances in Nonlinear Speech Processing, 2011

The Attentive Machine: Be Different!
Proceedings of the Intelligent Technologies for Interactive Entertainment, 2011

Continuous Control of the Degree of Articulation in HMM-Based Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

3D Saliency for Abnormal Motion Selection: The Role of the Depth Map.
Proceedings of the Computer Vision Systems - 8th International Conference, 2011

Analysis-by-Performance: Gesturally-Controlled Voice Synthesis as an Input for Modelling of Vibrato in Singing.
Proceedings of the 2011 International Computer Music Conference, 2011

Phase-based information for voice pathology detection.
Proceedings of the IEEE International Conference on Acoustics, 2011

Assessment of audio features for automatic cough detection.
Proceedings of the 19th European Signal Processing Conference, 2011

Automatic sleep spindles detection - Overview and development of a standard proposal assessment method.
Proceedings of the 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2011

Continuous Control of Style through Linear Interpolation in Hidden Markov Model Based Stylistic Walk Synthesis.
Proceedings of the 2011 International Conference on Cyberworlds, 2011

Control of a lower limb active prosthesis with eye movement sequences.
Proceedings of the 2011 IEEE Symposium on Computational Intelligence, 2011

ECG Artifact Removal from Surface EMG Signals by Combining Empirical Mode Decomposition and Independent Component Analysis.
Proceedings of the BIOSIGNALS 2011, 2011

A Phonetic Analysis of Natural Laughter, for Use in Automatic Laughter Processing Systems.
Proceedings of the Affective Computing and Intelligent Interaction, 2011

Toward a Social Attentive Machine.
Proceedings of the Robot-Human Teamwork in Dynamic Adverse Environment, 2011

2010
Cross-disciplinary approaches to multimodal user interfaces.
J. Multimodal User Interfaces, 2010

AVLaughterCycle.
J. Multimodal User Interfaces, 2010

A novel method for pediatric heart sound segmentation without using the ECG.
Comput. Methods Programs Biomed., 2010

Analysis and synthesis of hypo- and hyperarticulated speech.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

DeviceCycle: Rapid and Reusable Prototyping of Gestural Interfaces, Applied to Audio Browsing by Similarity.
Proceedings of the 10th International Conference on New Interfaces for Musical Expression, 2010

Expressive Gait Synthesis Using PCA and Gaussian Modeling.
Proceedings of the Motion in Games - Third International Conference, 2010

The AVLaughterCycle Database.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Glottal-based analysis of the lombard effect.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Chirp complex cepstrum-based decomposition for asynchronous glottal analysis.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

On the potential of glottal signatures for speaker recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A comparative evaluation of pitch modification techniques.
Proceedings of the 18th European Signal Processing Conference, 2010

Towards user-friendly audio creation.
Proceedings of the AM '10, 2010

2009
On the Use of the Correlation between Acoustic Descriptors for the Normal/Pathological Voices Discrimination.
EURASIP J. Adv. Signal Process., 2009

Glottal Source Estimation Using an Automatic Chirp Decomposition.
Proceedings of the Advances in Nonlinear Speech Processing, 2009

Advanced Techniques for Vertical Tablet Playing A Overview of Two Years of Practicing the HandSketch 1.x.
Proceedings of the 9th International Conference on New Interfaces for Musical Expression, 2009

On the mutual information of glottal source estimation techniques for the automatic detection of speech pathologies.
Proceedings of the Sixth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2009

A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

On the mutual information between source and filter contributions for voice pathology detection.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Glottal closure and opening instant detection from speech signals.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Complex cepstrum-based decomposition of speech for glottal source estimation.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Cross-language voice conversion based on eigenvoices.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Generating Robot/Agent backchannels during a storytelling experiment.
Proceedings of the 2009 IEEE International Conference on Robotics and Automation, 2009

Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2009

Eigenresiduals for improved parametric speech synthesis.
Proceedings of the 17th European Signal Processing Conference, 2009

2008
Cancelling ECG Artifacts in EEG Using a Modified Independent Component Analysis Approach.
EURASIP J. Adv. Signal Process., 2008

Computerized screening of children congenital heart diseases.
Comput. Methods Programs Biomed., 2008

Glottal Source Estimation Robustness - A Comparison of Sensitivity of Voice Source Estimation Techniques.
Proceedings of the SIGMAP 2008, 2008

Dynamic modality weighting for multi-stream HMMs in audio-visual speech recognition.
Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

Voice source parameters estimation by fitting the glottal formant and the inverse filtering open phase.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Chirp group delay analysis of speech signals.
Speech Commun., 2007

Realtime and accurate musical control of expression in singing synthesis.
J. Multimodal User Interfaces, 2007

Phase-Based Methods for Voice Source Analysis.
Proceedings of the Advances in Nonlinear Speech Processing, 2007

HandSketch Bi-Manual Controller Investigation on Expressive Control Issues of an Augmented Tablet.
Proceedings of the Seventh International Conference on New Interfaces for Musical Expression, 2007

Improvement of source-tract decomposition of speech using analogy with LF model for glottal source and tube model for vocal tract.
Proceedings of the Fifth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2007

RAMCESS/handsketch: a multi-representation framework for realtime and expressive singing synthesis.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Causal/anticausal Decomposition for mixed-phase Description of brass and Bowed String sounds.
Proceedings of the 2007 International Computer Music Conference, 2007

Towards a Voice Conversion System Based on Frame Selection.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
A probabilistic framework for dialog simulation and optimal strategy learning.
IEEE Trans. Speech Audio Process., 2006

Multimodal human-computer interfaces.
Signal Process., 2006

Dynamic Bayesian Networks for NLU Simulation with Applications to Dialog Optimal Strategy Learning.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Automatic Sleep Spindle Detection in Patients with Sleep Disorders.
Proceedings of the 28th International Conference of the IEEE Engineering in Medicine and Biology Society, 2006

2005
Zeros of Z-transform representation with application to source-filter separation in speech.
IEEE Signal Process. Lett., 2005

Spectral Analysis of Speech Signals Using Chirp Group Delay.
Proceedings of the Progress in Nonlinear Speech Processing, 2005

TTSBOX: a MATLAB toolbox for teaching text-to-speech synthesis.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Passive versus active: Vocal classification system.
Proceedings of the 13th European Signal Processing Conference, 2005

MaxMBROLA: A Max/MSP MBROLA-based tool for real-time voice synthesis.
Proceedings of the 13th European Signal Processing Conference, 2005

2004
An Algorithm to Estimate Anticausal Glottal Flow Component from Speech Signals.
Proceedings of the Nonlinear Speech Modeling and Applications, 2004

Zeros of z-transform (ZZT) decomposition of speech for source-tract separation.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Improved differential phase spectrum processing for formant tracking.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

A method for glottal formant frequency estimation.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Unusual teaching short-cuts to the Levinson and lattice algorithms.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Appropriate windowing for group delay analysis and roots of z-transform of speech signals.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

2003
Phonetic alignment: speech synthesis-based vs. Viterbi-based.
Speech Commun., 2003

Text design for TTS speech corpus building using a modified greedy selection.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Aided design of finite-state dialogue management systems.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

2001
An implementation and evaluation of two diphone-based synthesizers for Turkish.
Proceedings of the 4th ITRW on Speech Synthesis, 2001

Demo system for NU-MBROLA concatenator.
Proceedings of the 4th ITRW on Speech Synthesis, 2001

From MBROLA to NU-MBROLA.
Proceedings of the 4th ITRW on Speech Synthesis, 2001

2000
EULER: an Open, Generic, Multilingual and Multi-platform Text-to-Speech System.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

1998
Automatic prosody generation using suprasegmental unit selection.
Proceedings of the Third ESCA/COCOSDA Workshop on Speech Synthesis, 1998

Fully automatic prosody generator for text-to-speech.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Phonetic alignment: speech synthesis based vs. hybrid HMM/ANN.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Plug and play software for designing high-level speech processing systems.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Comparison of two different text-to-speech alignment systems: Speech synthesis based vs. hybrid HMM/ANN.
Proceedings of the 9th European Signal Processing Conference, 1998

1997
A simple and efficient algorithm for the compression of MBROLA segment databases.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Diphone concatenation using a harmonic plus noise model of speech.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

High-quality speech synthesis for phonetic speech segmentation.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1996
On the use of a hybrid harmonic/stochastic model for TTS synthesis-by-concatenation.
Speech Commun., 1996

The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

1994
High quality text-to-speech synthesis: a comparison of four candidate algorithms.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
MBR-PSOLA: Text-To-Speech synthesis based on an MBE re-synthesis of the segments database.
Speech Commun., 1993

An analysis of the performances of the MBE model when used in the context of a text-to-speech system.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

