Gustav Eje Henter

Orcid: 0000-0002-1643-1054

Affiliations:
  • KTH Royal Institute of Technology, Stockholm, Sweden


According to our database, Gustav Eje Henter authored at least 76 papers between 2010 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models.
ACM Trans. Graph., August, 2023

A Comprehensive Review of Data-Driven Co-Speech Gesture Generation.
Comput. Graph. Forum, May, 2023

Unified speech and gesture synthesis using flow matching.
CoRR, 2023

Matcha-TTS: A fast TTS architecture with conditional flow matching.
CoRR, 2023

On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis.
CoRR, 2023

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis.
CoRR, 2023

Evaluating gesture-generation in a large-scale open challenge: The GENEA Challenge 2022.
CoRR, 2023

Context-specific kernel-based hidden Markov model for time series analysis.
CoRR, 2023

GENEA Workshop 2023: The 4th Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents.
Proceedings of the 25th International Conference on Multimodal Interaction, 2023

"Am I listening?", Evaluating the Quality of Generated Data-driven Listening Motion.
Proceedings of the International Conference on Multimodal Interaction, 2023

The GENEA Challenge 2023: A large-scale evaluation of gesture generation models in monadic and dyadic settings.
Proceedings of the 25th International Conference on Multimodal Interaction, 2023

A Processing Framework to Access Large Quantities of Whispered Speech Found in ASMR.
Proceedings of the IEEE International Conference on Acoustics, 2023

Autovocoder: Fast Waveform Generation from a Learned Speech Representation Using Differentiable Digital Signal Processing.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS.
Proceedings of the IEEE International Conference on Acoustics, 2023

Prosody-Controllable Spontaneous TTS with Neural HMMs.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Kernel-based hidden Markov conditional densities.
Comput. Stat. Data Anal., 2022

OverFlow: Putting flows on top of neural transducers for better TTS.
CoRR, 2022

Predicting pairwise preferences between TTS audio stimuli using parallel ratings data and anti-symmetric twin neural networks.
Proceedings of the Interspeech 2022, 2022

Speech Audio Corrector: using speech from non-target speakers for one-off correction of mispronunciations in grapheme-input text-to-speech.
Proceedings of the Interspeech 2022, 2022

The GENEA Challenge 2022: A large evaluation of data-driven co-speech gesture generation.
Proceedings of the International Conference on Multimodal Interaction, 2022

GENEA Workshop 2022: The 3rd Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents.
Proceedings of the International Conference on Multimodal Interaction, 2022

Neural HMMs Are All You Need (For High-Quality Attention-Free TTS).
Proceedings of the IEEE International Conference on Acoustics, 2022

Wavebender GAN: An Architecture for Phonetically Meaningful Speech Manipulation.
Proceedings of the IEEE International Conference on Acoustics, 2022

Multimodal Analysis of the Predictability of Hand-gesture Properties.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

2021
Transflower: probabilistic autoregressive dance generation with multimodal attention.
ACM Trans. Graph., 2021

Moving Fast and Slow: Analysis of Representations and Post-Processing in Speech-Driven Automatic Gesture Generation.
Int. J. Hum. Comput. Interact., 2021

Multimodal Capture of Patient Behaviour for Improved Detection of Early Dementia: Clinical Feasibility and Preliminary Results.
Frontiers Comput. Sci., 2021

Normalizing Flow based Hidden Markov Models for Classification of Speech Phones with Explainability.
CoRR, 2021

Speech2Properties2Gestures: Gesture-Property Prediction as a Tool for Generating Representational Gestures from Speech.
Proceedings of the IVA '21: ACM International Conference on Intelligent Virtual Agents, 2021

A Large, Crowdsourced Evaluation of Gesture Generation Systems on Common Data: The GENEA Challenge 2020.
Proceedings of the IUI '21: 26th International Conference on Intelligent User Interfaces, 2021

Integrated Speech and Gesture Synthesis.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

GENEA Workshop 2021: The 2nd Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

HEMVIP: Human Evaluation of Multiple Videos in Parallel.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

Full-Glow: Fully Conditional Glow for More Realistic Image Generation.
Proceedings of the Pattern Recognition - 43rd DAGM German Conference, DAGM GCPR 2021, Bonn, Germany, September 28, 2021

The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
MoGlow: probabilistic and controllable motion synthesis using normalising flows.
ACM Trans. Graph., 2020

Robust model training and generalisation with Studentising flows.
CoRR, 2020

Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows.
Comput. Graph. Forum, 2020

Robust Classification Using Hidden Markov Models and Mixtures of Normalizing Flows.
Proceedings of the 30th IEEE International Workshop on Machine Learning for Signal Processing, 2020

Let's Face It: Probabilistic Multi-modal Interlocutor-aware Generation of Facial Gestures in Dyadic Settings.
Proceedings of the IVA '20: ACM International Conference on Intelligent Virtual Agents, 2020

Generating coherent spontaneous speech and gesture from text.
Proceedings of the IVA '20: ACM International Conference on Intelligent Virtual Agents, 2020

Gesticulator: A framework for semantically-aware speech-driven gesture generation.
Proceedings of the ICMI '20: International Conference on Multimodal Interaction, 2020

Breathing and Speech Planning in Spontaneous Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.
CoRR, 2019

Analyzing Input and Output Representations for Speech-Driven Gesture Generation.
Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, 2019

Spontaneous Conversational Speech Synthesis from Found Data.
Proceedings of the Interspeech 2019, 2019

Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS.
Proceedings of the Interspeech 2019, 2019

Casting to Corpus: Segmenting and Selecting Spontaneous Dialogue for TTS with a CNN-LSTM Speaker-dependent Breath Detector.
Proceedings of the IEEE International Conference on Acoustics, 2019

On the Importance of Representations for Speech-Driven Gesture Generation.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

2018
Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis.
Speech Commun., 2018

Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis.
CoRR, 2018

Kernel Density Estimation-Based Markov Models with Hidden State.
CoRR, 2018

Analysing Shortcomings of Statistical Parametric Speech Synthesis.
CoRR, 2018

Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Consensus-based Sequence Training for Video Captioning.
CoRR, 2017

Misperceptions of the Emotional Content of Natural and Vocoded Speech in a Car.
Proceedings of the Interspeech 2017, 2017

Principles for Learning Controllable TTS from Annotated and Latent Variation.
Proceedings of the Interspeech 2017, 2017

Adapting and controlling DNN-based speech synthesis using input codes.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Bayesian Analysis of Phoneme Confusion Matrices.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Minimum Entropy Rate Simplification of Stochastic Processes.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

Median-based generation of synthetic speech durations using a non-parametric approach.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks.
Proceedings of the Interspeech 2016, 2016

A Template-Based Approach for Speech Synthesis Intonation Generation Using LSTMs.
Proceedings of the Interspeech 2016, 2016

From HMMs to DNNs: Where do the improvements come from?
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Robust TTS duration modelling using DNNs.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Testing the consistency assumption: Pronunciation variant forced alignment in read and spontaneous speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Are we using enough listeners? No! - An empirically-supported critique of Interspeech 2014 TTS evaluations.
Proceedings of the INTERSPEECH 2015, 2015

2014
Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech.
Proceedings of the INTERSPEECH 2014, 2014

A flexible front-end for HTS.
Proceedings of the INTERSPEECH 2014, 2014

2013
Probabilistic Sequence Models with Speech and Language Applications.
PhD thesis, 2013

Maximizing Phoneme Recognition Accuracy for Enhanced Speech Intelligibility in Noise.
IEEE Trans. Speech Audio Process., 2013

Picking up the pieces: Causal states in noisy data, and how to recover them.
Pattern Recognit. Lett., 2013

2012
Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech.
Proceedings of the INTERSPEECH 2012, 2012

Gaussian process dynamical models for nonparametric speech representation and synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Intermediate-State HMMs to Capture Continuously-Changing Signal Features.
Proceedings of the INTERSPEECH 2011, 2011

2010
Simplified probability models for generative tasks: A rate-distortion approach.
Proceedings of the 18th European Signal Processing Conference, 2010

