Minghui Dong

Orcid: 0000-0001-6543-2929

According to our database, Minghui Dong authored at least 84 papers between 2000 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2023
A Dual Target Neural Network Method for Speech Enhancement.
Proceedings of the International Conference on Asian Language Processing, 2023

2021
End-to-End Detection-Segmentation System for Face Labeling.
IEEE Trans. Emerg. Top. Comput. Intell., 2021

2019
CLU-CNNs: Object detection for medical images.
Neurocomputing, 2019

Sparse fully convolutional network for face labeling.
Neurocomputing, 2019

Towards Good Practices for Video Object Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Temporal Feature Augmented Network for Video Instance Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Implementing Prosodic Phrasing in Chinese End-to-end Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

SINGAN: Singing Voice Conversion with Generative Adversarial Networks.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2017
Node-level parallelization for deep neural networks with conditional independent graph.
Neurocomputing, 2017

Multimodal Prediction of Affective Dimensions via Fusing Multiple Regression Techniques.
Proceedings of the Interspeech 2017, 2017

A light-weight method of building an LSTM-RNN-based bilingual TTS system.
Proceedings of the 2017 International Conference on Asian Language Processing, 2017

A dual alignment scheme for improved speech-to-singing voice conversion.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Word level prosody prediction using large audiobook dataset.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Representing raw linguistic information in Chinese text-to-speech system.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Guest Editorial: Advances in Machine Learning for Speech Processing.
J. Signal Process. Syst., 2016

High quality voice conversion using prosodic and high-resolution spectral features.
Multim. Tools Appl., 2016

Transition-based Parsing with Context Enhancement and Future Reward Reranking.
CoRR, 2016

Mandarin Prosodic Phrase Prediction based on Syntactic Trees.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

I2RNTU at SemEval-2016 Task 4: Classifier Fusion for Polarity Classification in Twitter.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion.
Proceedings of the Interspeech 2016, 2016

SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer.
Proceedings of the Interspeech 2016, 2016

SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms.
Proceedings of the Interspeech 2016, 2016

Audio and face video emotion recognition in the wild using deep neural networks and small datasets.
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

A full training framework of cross-stream dependence modelling for HMM-based singing voice synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Exemplar-based sparse representation of timbre and prosody for voice conversion.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Combining multiple kernel models for automatic intelligibility detection of pathological speech.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A Stack LSTM Transition-Based Dependency Parser with Context Enhancement and K-best Decoding.
Proceedings of the Chinese Lexical Semantics - 17th Workshop, 2016

2015
Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation.
Proceedings of the INTERSPEECH 2015, 2015

System fusion for high-performance voice conversion.
Proceedings of the INTERSPEECH 2015, 2015

An alternating optimization approach for phase retrieval.
Proceedings of the INTERSPEECH 2015, 2015

A real-time variable-q non-stationary Gabor transform for pitch shifting.
Proceedings of the INTERSPEECH 2015, 2015

Formant excursion in singing synthesis.
Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

Sparse representation for frequency warping based voice conversion.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Mandarin prosodic word prediction using dependency relationships.
Proceedings of the 2015 International Conference on Asian Language Processing, 2015

Performance scoring of singing voice.
Proceedings of the 2015 International Conference on Asian Language Processing, 2015

The expression of singing emotion - contradicting the constraints of song.
Proceedings of the 2015 International Conference on Asian Language Processing, 2015

Non-negative matrix factorization using stable alternating direction method of multipliers for source separation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A waveform representation framework for high-quality statistical parametric speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Mapping frames with DNN-HMM recognizer for non-parallel voice conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Fundamental frequency modeling using wavelets for emotional voice conversion.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
Soft constrained leading voice separation with music score guidance.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

The power of special characters in prosodic word prediction for Chinese TTS.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Acoustic emotion recognition based on fusion of multiple feature-dependent deep Boltzmann machines.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A comparative study of spectral transformation techniques for singing voice synthesis.
Proceedings of the INTERSPEECH 2014, 2014

I²R speech2singing perfects everyone's singing.
Proceedings of the INTERSPEECH 2014, 2014

Intelligibility detection of pathological speech using asymmetric sparse kernel partial least squares classifier.
Proceedings of the IEEE International Conference on Acoustics, 2014

Emotion analysis of children's stories with context information.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Emotional facial expression transfer based on temporal restricted Boltzmann machines.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Ensemble Nyström method for predicting conflict level from speech.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
A dynamic Gaussian process for voice conversion.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

2012
A study of F0 modelling and generation with lyrics and shape characterization for singing voice synthesis.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Generalized F0 modelling with absolute and relative pitch features for singing voice synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Template-based personalized singing voice synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Singing Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral Envelope.
Proceedings of the INTERSPEECH 2011, 2011

Solo to a capella conversion - Synthesizing vocal harmony from lead vocals.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Analyzing the Relationship between Formants and Pitch for Singing Voice.
Proceedings of the International Conference on Asian Language Processing, 2011

Linear Regression for Prosody Prediction via Convex Optimization.
Proceedings of the International Conference on Asian Language Processing, 2011

Speech Emotion Recognition System Based on L1 Regularized Linear Regression and Decision Fusion.
Proceedings of the Affective Computing and Intelligent Interaction, 2011

2010
Feature Integration and Dimension Reduction in Unit Selection TTS.
Int. J. Asian Lang. Process., 2010

Considering readability in text-to-speech recording script design.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Aligning singing voice with MIDI melody using synthesized audio signal.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

The psychoacoustic approach towards enhancing speech intelligibility in noise.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Generating emotional speech from neutral speech.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Phonetic segmentation of singing voice using MIDI and parallel speech.
Proceedings of the INTERSPEECH 2010, 2010

Voice conversion: From spoken vowels to singing vowels.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

2009
Readability Consideration in Speech Synthesis Recording Script Selection.
Int. J. Asian Lang. Process., 2009

Unit selection based speech synthesis for poor channel condition.
Proceedings of the INTERSPEECH 2009, 2009

Refining Unit Boundaries for Mandarin Text-to-Speech Database.
Proceedings of the 2009 International Conference on Asian Language Processing, 2009

2008
Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Multi-speaker meeting audio segmentation.
Proceedings of the INTERSPEECH 2008, 2008

2007
Evaluating Prosody of Mandarin Speech for Language Learning.
J. Chin. Lang. Comput., 2007

Semantic Transliteration of Personal Names.
Proceedings of the ACL 2007, 2007

2006
A Unit Selection-based Speech Synthesis Approach for Mandarin Chinese.
J. Chin. Lang. Comput., 2006

Fusion of Acoustic and Tokenization Features for Speaker Recognition.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

The IIR Submission to CSLP 2006 Speaker Recognition Evaluation.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Analysis and detection of speech under sleep deprivation.
Proceedings of the INTERSPEECH 2006, 2006

2005
A probabilistic approach to prosodic word prediction for Mandarin Chinese TTS.
Proceedings of the INTERSPEECH 2005, 2005

2004
Selecting Prosody Parameters for Unit Selection Based Chinese TTS.
Proceedings of the Natural Language Processing, 2004

2003
On unit analysis for Cantonese corpus-based TTS.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Automatic prosodic break labeling for Mandarin Chinese speech data.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Pitch contour model for Chinese text-to-speech using CART and statistical model.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2000
Using prosody database in Chinese speech synthesis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

