Zheng-Hua Tan

According to our database, Zheng-Hua Tan authored at least 158 papers between 2002 and 2019.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2019
Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2019

On the Relationship Between Short-Time Objective Intelligibility and Short-Time Spectral-Amplitude Mean-Square Error for Speech Enhancement.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2019

Adversarial Example Detection by Classification for Deep Speech Recognition.
CoRR, 2019

Deep Joint Embeddings of Context and Content for Recommendation.
CoRR, 2019

On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement.
CoRR, 2019

Keyword Spotting for Hearing Assistive Devices Robust to External Speakers.
CoRR, 2019

rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
CoRR, 2019

Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect.
CoRR, 2019

Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification.
CoRR, 2019

SketchSegNet+: An End-to-End Learning of RNN for Multi-Class Sketch Semantic Segmentation.
IEEE Access, 2019

Subjective Annotations for Vision-based Attention Level Estimation.
Proceedings of the 14th International Joint Conference on Computer Vision, 2019

On Training Targets and Objective Functions for Deep-learning-based Audio-visual Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2019

Effects of Lombard Reflex on the Performance of Deep-learning-based Audio-visual Speech Enhancement Systems.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Wireless Personal Communications: Machine Learning for Big Data Processing in Mobile Internet.
Wireless Personal Communications, 2018

Spoofing Detection in Automatic Speaker Verification Systems Using DNN Classifiers and Dynamic Acoustic Features.
IEEE Trans. Neural Netw. Learning Syst., 2018

Decorrelation of Neutral Vector Variables: Theory and Applications.
IEEE Trans. Neural Netw. Learning Syst., 2018

Using Closed-Set Speaker Identification Score Confidence to Enhance Audio-Based Collaborative Filtering for Multiple Users.
IEEE Trans. Consumer Electronics, 2018

Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2018

Bias-Compensated Informed Sound Source Localization Using Relative Transfer Functions.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2018

Nonintrusive Speech Intelligibility Prediction Using Convolutional Neural Networks.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2018

Audio-Based Granularity-Adapted Emotion Classification.
IEEE Trans. Affective Computing, 2018

A perceptually motivated LP residual estimator in noisy and reverberant environments.
Speech Communication, 2018

Refinement and validation of the binaural short time objective intelligibility measure for spatially diverse conditions.
Speech Communication, 2018

A spatial self-similarity based feature learning method for face recognition under varying poses.
Pattern Recognition Letters, 2018

iSocioBot: A Multimodal Interactive Social Robot.
I. J. Social Robotics, 2018

Recent advances in machine learning for non-Gaussian data processing.
Neurocomputing, 2018

Latent Dirichlet mixture model.
Neurocomputing, 2018

Incorporating pass-phrase dependent background models for text-dependent speaker verification.
Computer Speech & Language, 2018

Subjective Annotations for Vision-Based Attention Level Estimation.
CoRR, 2018

Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems.
CoRR, 2018

On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement.
CoRR, 2018

The Importance of Context When Recommending TV Content: Dataset and Algorithms.
CoRR, 2018

On the Equivalence between Objective Intelligibility and Mean-Squared Error for Deep Neural Network based Speech Enhancement.
CoRR, 2018

A Parallel/Distributed Algorithmic Framework for Mining All Quantitative Association Rules.
CoRR, 2018

Monaural Speech Enhancement using Deep Neural Networks by Maximizing a Short-Time Objective Intelligibility Measure.
CoRR, 2018

A Dataset for Inferring Contextual Preferences of Users Watching TV.
Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization, 2018

The Sound or Silence: Investigating the Influence of Robot Noise on Proxemics.
Proceedings of the 27th IEEE International Symposium on Robot and Human Interactive Communication, 2018

Public perception of android robots: Indications from an analysis of YouTube comments.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Effectiveness of Single-Channel BLSTM Enhancement for Language Identification.
Proceedings of the Interspeech 2018, 2018

Monaural Speech Enhancement Using Deep Neural Networks by Maximizing a Short-Time Objective Intelligibility Measure.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Frame Selection for Robust Speaker Identification: A Hybrid Approach.
Wireless Personal Communications, 2017

Visual Detection of Events of Interest from Urban Activity.
Wireless Personal Communications, 2017

Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2017

Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2017

Informed Sound Source Localization Using Relative Transfer Functions for Hearing Aid Applications.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2017

Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification.
CoRR, 2017

Joint Separation and Denoising of Noisy Multi-talker Speech using Recurrent Neural Networks and Permutation Invariant Training.
CoRR, 2017

Adversarial Network Bottleneck Features for Noise Robust Speaker Verification.
CoRR, 2017

DNN Filter Bank Cepstral Coefficients for Spoofing Detection.
CoRR, 2017

Time-Contrastive Learning Based Unsupervised DNN Feature Extraction for Speaker Verification.
CoRR, 2017

Decorrelation of Neutral Vector Variables: Theory and Applications.
CoRR, 2017

Multi-talker Speech Separation and Tracing with Permutation Invariant Training of Deep Recurrent Neural Networks.
CoRR, 2017

DNN Filter Bank Cepstral Coefficients for Spoofing Detection.
IEEE Access, 2017

Joint separation and denoising of noisy multi-talker speech using recurrent neural networks and permutation invariant training.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Adversarial Network Bottleneck Features for Noise Robust Speaker Verification.
Proceedings of the Interspeech 2017, 2017

Improving Speaker Verification Performance in Presence of Spoofing Attacks Using Out-of-Domain Spoofed Data.
Proceedings of the Interspeech 2017, 2017

Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification.
Proceedings of the Interspeech 2017, 2017

On the Use of Band Importance Weighting in the Short-Time Objective Intelligibility Measure.
Proceedings of the Interspeech 2017, 2017

Weighted Score Based Fast Converging CO-training with Application to Audio-Visual Person Identification.
Proceedings of the 29th IEEE International Conference on Tools with Artificial Intelligence, 2017

Permutation invariant training of deep models for speaker-independent multi-talker speech separation.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

RedDots replayed: A new replay spoofing attack corpus for text-dependent speaker verification research.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A non-intrusive Short-Time Objective Intelligibility measure.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Improved Gaussian Mixture Models for Adaptive Foreground Segmentation.
Wireless Personal Communications, 2016

Total Variability Modeling Using Source-Specific Priors.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2016

Predicting the Intelligibility of Noisy and Nonlinearly Processed Binaural Speech.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2016

AMORE: design and implementation of a commercial-strength parallel hybrid movie recommendation engine.
Knowl. Inf. Syst., 2016

Using Theatre to Study Interaction with Care Robots.
I. J. Social Robotics, 2016

Feature selection for neutral vector in EEG signal classification.
Neurocomputing, 2016

Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation.
CoRR, 2016

Incorporating Pass-Phrase Dependent Background Models for Text Dependent Speaker Verification.
CoRR, 2016

Text-Independent Speaker Identification Using the Histogram Transform Model.
IEEE Access, 2016

Effect of multi-condition training and speech enhancement methods on spoofing detection.
Proceedings of the First International Workshop on Sensing, 2016

Improving the convergence of co-training for audio-visual person identification.
Proceedings of the First International Workshop on Sensing, 2016

Background subtraction for patterns of activities in cities.
Proceedings of the First International Workshop on Sensing, 2016

Projecting emotional speech into arousal-valence space using pairwise preference learning.
Proceedings of the First International Workshop on Sensing, 2016

Speech enhancement using Long Short-Term Memory based recurrent Neural Networks for noise robust Speaker Verification.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Further optimisations of constant Q cepstral processing for integrated utterance and text-dependent speaker verification.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Dirichlet mixture allocation.
Proceedings of the 26th IEEE International Workshop on Machine Learning for Signal Processing, 2016

Privacy protection performance of De-identified face images with and without background.
Proceedings of the 39th International Convention on Information and Communication Technology, 2016

Speaker-Dependent Dictionary-Based Speech Enhancement for Text-Dependent Speaker Verification.
Proceedings of the Interspeech 2016, 2016

Text Dependent Speaker Verification Using Un-Supervised HMM-UBM and Temporal GMM-UBM.
Proceedings of the Interspeech 2016, 2016

Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech.
Proceedings of the Interspeech 2016, 2016

Integrated Spoofing Countermeasures and Automatic Speaker Verification: An Evaluation on ASVspoof 2015.
Proceedings of the Interspeech 2016, 2016

HAPPY Team Entry to NIST OpenSAD Challenge: A Fusion of Short-Term Unsupervised and Segment i-Vector Based Speech Activity Detectors.
Proceedings of the Interspeech 2016, 2016

Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus.
Proceedings of the Interspeech 2016, 2016

Adaptive overcurrent protection for microgrids in extensive distribution systems.
Proceedings of the IECON 2016, 2016

Informed Direction of Arrival estimation using a spherical-head model for Hearing Aid applications.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A method for predicting the intelligibility of noisy and non-linearly enhanced binaural speech.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Concurrent localization of sound sources and dual-microphone sub-arrays using TOFs.
Proceedings of the 19th International Conference on Information Fusion, 2016

2015
Minimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features - A Theoretically Consistent Approach.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

Im2Sketch: Sketch generation by unconflicted perceptual grouping.
Neurocomputing, 2015

Binary pattern flavored feature extractors for Facial Expression Recognition: An overview.
Proceedings of the 38th International Convention on Information and Communication Technology, 2015

Assessing the Potential Use of Eye-Tracking Triangulation for Evaluating the Usability of an Online Diabetes Exercise System.
Proceedings of the MEDINFO 2015: eHealth-enabled Health, 2015

Neighbors Based Discriminative Feature Difference Learning for Kinship Verification.
Proceedings of the Advances in Visual Computing - 11th International Symposium, 2015

A heuristic approach for a social robot to navigate to a person based on audio and range information.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Comparison of forced-alignment speech recognition and humans for generating reference VAD.
Proceedings of the INTERSPEECH 2015, 2015

A binaural short time objective intelligibility measure for noisy and enhanced speech.
Proceedings of the INTERSPEECH 2015, 2015

Local feature learning for face recognition under varying poses.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

A feature subtraction method for image based kinship verification under uncontrolled environments.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Source-specific informative prior for i-vector extraction.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

On the influence of microphone array geometry on HRTF-based Sound Source Localization.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Maximum likelihood approach to "informed" Sound Source Localization for Hearing Aid applications.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Informed TDoA-based direction of arrival estimation for hearing aid applications.
Proceedings of the 2015 IEEE Global Conference on Signal and Information Processing, 2015

A discriminative approach for speaker selection in speaker de-identification systems.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
Combination of Multiple Measurement Cues for Visual Face Tracking.
Wireless Personal Communications, 2014

Predictive Distribution of the Dirichlet Mixture Model by Local Variational Inference.
Signal Processing Systems, 2014

Using Audio-Derived Affective Offset to Enhance TV Recommendation.
IEEE Trans. Multimedia, 2014

Implementing a Commercial-Strength Parallel Hybrid Movie Recommendation Engine.
IEEE Intelligent Systems, 2014

Joint variable frame rate and length analysis for speech recognition under adverse conditions.
Computers & Electrical Engineering, 2014

Improving Robustness Against Environmental Sounds for Directing Attention of Social Robots.
Proceedings of the Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction, 2014

Utilising Tree-Based Ensemble Learning for Speaker Segmentation.
Proceedings of the Artificial Intelligence Applications and Innovations, 2014

Cluster-based adaptation using density forest for HMM phone recognition.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Audio-based age and gender identification to enhance the recommendation of TV content.
IEEE Trans. Consumer Electronics, 2013

A heuristic hierarchical scheme for academic search and retrieval.
Inf. Process. Manage., 2013

Multi-frame rate based multiple-model training for robust speaker identification of disguised voice.
Proceedings of the 16th International Symposium on Wireless Personal Multimedia Communications, 2013

Perceptual grouping via untangling Gestalt principles.
Proceedings of the 2013 Visual Communications and Image Processing, 2013

Fusing eye-gaze and speech recognition for tracking in an automatic reading tutor - a step in the right direction?
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2013

Demographic recommendation by means of group profile elicitation using speaker age and gender recognition.
Proceedings of the INTERSPEECH 2013, 2013

Developing a speaker identification system for the DARPA RATS project.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
A Joint Approach for Single-Channel Speaker Identification and Speech Separation.
IEEE Trans. Audio, Speech & Language Processing, 2012

Guest Editors' Introduction to the Special Issue on "New Trends in Signal Processing and Biomedical Engineering".
Computers & Electrical Engineering, 2012

EEG signal classification with super-Dirichlet mixture model.
Proceedings of the IEEE Statistical Signal Processing Workshop, 2012

PubSearch - A Hierarchical Heuristic Scheme for Ranking Academic Search Results.
Proceedings of the ICPRAM 2012, 2012

2011
Convex Combination of Multiple Statistical Models With Application to VAD.
IEEE Trans. Audio, Speech & Language Processing, 2011

Technology-enabled social learning: a review.
IJKL, 2011

Feature selection strategy for classification of single-trial EEG elicited by motor imagery.
Proceedings of the 14th International Symposium on Wireless Personal Multimedia Communications, 2011

Evaluating tracking accuracy of an automatic reading tutor.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2011

Combining acoustic and language model miscue detection methods for adult dyslexic read speech.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2011

Mobile video annotation for enhanced rich media communication during emergency handling.
Proceedings of the 4th International Symposium on Applied Sciences in Biomedical and Communication Technologies, 2011

Multi-Sensor Voice Activity Detection Based on Multiple Observation Hypothesis Testing.
Proceedings of the INTERSPEECH 2011, 2011

Sinusoidal Approach for the Single-Channel Speech Separation and Recognition Challenge.
Proceedings of the INTERSPEECH 2011, 2011

2010
Low-Complexity Variable Frame Rate Analysis for Speech Recognition and Voice Activity Detection.
J. Sel. Topics Signal Processing, 2010

Introduction to the Issue on Speech Processing for Natural Interaction With Intelligent Environments.
J. Sel. Topics Signal Processing, 2010

Improving monaural speaker identification by double-talk detection.
Proceedings of the INTERSPEECH 2010, 2010

Signal-to-Signal Ratio Independent Speaker Identification for Co-channel Speech Signals.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Joint single-channel speech separation and speaker identification.
Proceedings of the IEEE International Conference on Acoustics, 2010

Crowd analysis by using optical flow and density based clustering.
Proceedings of the 18th European Signal Processing Conference, 2010

Three-dimensional adaptive sensing of people in a multi-camera setup.
Proceedings of the 18th European Signal Processing Conference, 2010

2009
Audio and Speech Processing for Data Mining.
Proceedings of the Encyclopedia of Data Warehousing and Mining, Second Edition (4 Volumes), 2009

High-accuracy, low-complexity voice activity detection based on a posteriori SNR weighted energy.
Proceedings of the INTERSPEECH 2009, 2009

A system for detecting miscues in dyslexic read speech.
Proceedings of the INTERSPEECH 2009, 2009

2008
Robust Speech Recognition by Nonlocal Means Denoising Processing.
IEEE Signal Process. Lett., 2008

A posteriori SNR weighted energy based variable frame rate analysis for speech recognition.
Proceedings of the INTERSPEECH 2008, 2008

Speech Recognition on Mobile Devices.
Proceedings of the Mobile Multimedia Processing: Fundamentals, 2008

2007
Noise Condition-Dependent Training Based on Noise Classification and SNR Estimation.
IEEE Trans. Audio, Speech & Language Processing, 2007

Exploiting Temporal Correlation of Speech for Error Robust and Bandwidth Flexible Distributed Speech Recognition.
IEEE Trans. Audio, Speech & Language Processing, 2007

2006
Fuzzy Metagraph and Its Combination with the Indexing Approach in Rule-Based Systems.
IEEE Trans. Knowl. Data Eng., 2006

Robust speech recognition over mobile networks using combined weighted viterbi decoding and subvector based error concealment.
Proceedings of the INTERSPEECH 2006, 2006

Robust Speech Recognition From Noise-Type Based Feature Compensation and Model Interpolation in a Multiple Model Framework.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Automatic speech recognition over error-prone wireless networks.
Speech Communication, 2005

Adaptive Multi-Frame-Rate Scheme for Distributed Speech Recognition Based on a Half Frame-Rate Front-End.
Proceedings of the IEEE 7th Workshop on Multimedia Signal Processing, 2005

Robust speech recognition based on noise and SNR classification - a multiple-model framework.
Proceedings of the INTERSPEECH 2005, 2005

Robust speech recognition in ubiquitous networking and context-aware computing.
Proceedings of the INTERSPEECH 2005, 2005

2004
Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognition.
Proceedings of the INTERSPEECH 2004, 2004

On the integration of speech recognition into personal networks.
Proceedings of the INTERSPEECH 2004, 2004

A subvector-based error concealment algorithm for speech recognition over mobile networks.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
OOV-detection and channel error protection for distributed speech recognition over wireless networks.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Channel error protection scheme for distributed speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

