Jiqing Han

Expert Syst. Appl., December, 2023

A Glance is Enough: Extract Target Sentence By Looking at A keyword.

CoRR, 2023

Spot keywords from very noisy and mixed speech.

CoRR, 2023

Patch-level contrastive embedding learning for respiratory sound classification.

Wenjie Song

Biomed. Signal Process. Control., 2023

Using Auxiliary Tasks In Multimodal Fusion of Wav2vec 2.0 And Bert for Multimodal Emotion Recognition.

Dekai Sun

Yancheng He

Proceedings of the IEEE International Conference on Acoustics, 2023

Subband Dependency Modeling for Sound Event Detection.

Proceedings of the IEEE International Conference on Acoustics, 2023

Time-Weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection.

Proceedings of the IEEE International Conference on Acoustics, 2023

Graph-Based Spectro-Temporal Dependency Modeling for Anti-Spoofing.

Proceedings of the IEEE International Conference on Acoustics, 2023

Sentiment Knowledge Enhanced Self-supervised Learning for Multimodal Sentiment Analysis.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022

Exploring Inter-Node Relations in CNNs for Environmental Sound Classification.

IEEE Signal Process. Lett., 2022

Contrastive Regularization for Multimodal Emotion Recognition Using Audio and Text.

Fan Qian

CoRR, 2022

Word-wise Sparse Attention for Multimodal Sentiment Analysis.

Fan Qian

Proceedings of the Interspeech 2022, 2022

Exploring Transformer's Potential on Automatic Piano Transcription.

Proceedings of the IEEE International Conference on Acoustics, 2022

CDMA: Cross-Domain Distance Metric Adaptation for Speaker Verification.

Jianchen Li

Proceedings of the IEEE International Conference on Acoustics, 2022

Sparse Self-Attention for Semi-Supervised Sound Event Detection.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Exploring attention mechanisms based on summary information for end-to-end automatic speech recognition.

Neurocomputing, 2021

Semantic feature extraction based on subspace learning with temporal constraints for acoustic event recognition.

Qiuying Shi

Digit. Signal Process., 2021

Can We Trust Deep Speech Prior?

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Model-Agnostic Fast Adaptive Multi-Objective Balancing Algorithm for Multilingual Automatic Speech Recognition Model Training.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multimodal Sentiment Analysis with Temporal Modality Attention.

Fan Qian

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Gradient Regularization for Noise-Robust Speaker Verification.

Jianchen Li

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers.

Proceedings of the IEEE International Conference on Acoustics, 2021

Contrastive Embeddind Learning Method for Respiratory Sound Classification.

Wenjie Song

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Pyramidal Temporal Pooling With Discriminative Mapping for Audio Classification.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Nonnegative Matrix Factorization Based Transfer Subspace Learning for Cross-Corpus Speech Emotion Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

A Joint Framework of Denoising Autoencoder and Generative Vocoder for Monaural Speech Enhancement.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Learning Temporal Relations from Semantic Neighbors for Acoustic Scene Classification.

IEEE Signal Process. Lett., 2020

Task-Driven Variability Model for Speaker Verification.

Circuits Syst. Signal Process., 2020

Toward the pre-cocktail party problem with TasTas+.

Anyan Shi

CoRR, 2020

La Furca: Iterative Context-Aware End-to-End Monaural Speech Separation Based on Dual-Path Deep Parallel Inter-Intra Bi-LSTM with Attention.

Rujie Liu

CoRR, 2020

FurcaNeXt: End-to-End Monaural Speech Separation with Dynamic Gated Dilated Temporal Convolutional Networks.

Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

ATReSN-Net: Capturing Attentive Temporal Relations in Semantic Neighborhood for Acoustic Scene Classification.

Proceedings of the Interspeech 2020, 2020

Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss.

Rujie Liu

Proceedings of the Interspeech 2020, 2020

Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement.

Proceedings of the Interspeech 2020, 2020

Double Adversarial Network Based Monaural Speech Enhancement for Robust Speech Recognition.

Proceedings of the Interspeech 2020, 2020

Structured Sparse Attention for end-to-end Automatic Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Pan: Phoneme-Aware Network for Monaural Speech Enhancement.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

TDMF: Task-Driven Multilevel Framework for End-to-End Speaker Verification.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

A bilevel framework for joint optimization of session compensation and classification for speaker identification.

Digit. Signal Process., 2019

A Multi-Task Learning Framework for Overcoming the Catastrophic Forgetting in Automatic Speech Recognition.

CoRR, 2019

Hard Sample Mining for the Improved Retraining of Automatic Speech Recognition.

CoRR, 2019

FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks.

CoRR, 2019

FurcaNet: An end-to-end deep gated convolutional, long short-term memory, deep neural networks for single channel speech separation.

CoRR, 2019

Is CQT more suitable for monaural speech separation than STFT? an empirical study.

CoRR, 2019

Abnormal heart sound detection using temporal quasi-periodic features and long short-term memory without segmentation.

Biomed. Signal Process. Control., 2019

Trace Ratio Criterion Based Large Margin Subspace Learning for Feature Selection.

IEEE Access, 2019

Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events.

Proceedings of the Interspeech 2019, 2019

Deep Attention Gated Dilated Temporal Convolutional Networks with Intra-Parallel Convolutional Modules for End-to-End Monaural Speech Separation.

Proceedings of the Interspeech 2019, 2019

End-to-End Monaural Speech Separation with Multi-Scale Dynamic Weighted Gated Dilated Convolutional Pyramid Network.

Proceedings of the Interspeech 2019, 2019

Subspace Pooling Based Temporal Features Extraction for Audio Event Recognition.

Qiuying Shi

Proceedings of the Interspeech 2019, 2019

Cross-Corpus Speech Emotion Recognition Using Semi-Supervised Transfer Non-Negative Matrix Factorization with Adaptation Regularization.

Proceedings of the Interspeech 2019, 2019

Convolutional Grid Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition.

Proceedings of the Neural Information Processing - 26th International Conference, 2019

Furcax: End-to-end Monaural Speech Separation Based on Deep Gated (De)convolutional Neural Networks with Adversarial Example Training.

Proceedings of the IEEE International Conference on Acoustics, 2019

Investigation of Monaural Front-End Processing for Robust Speech Recognition Without Retraining or Joint-Training.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Efficient general sparse denoising with non-convex sparse constraint and total variation regularization.

Digit. Signal Process., 2018

Investigation of Monaural Front-End Processing for Robust ASR without Retraining or Joint-Training.

CoRR, 2018

Adaptive overlapping-group sparse denoising for heart sound signals.

Biomed. Signal Process. Control., 2018

Unsupervised Temporal Feature Learning Based on Sparse Coding Embedded BoAW for Acoustic Event Recognition.

Proceedings of the Interspeech 2018, 2018

A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification.

Proceedings of the Interspeech 2018, 2018

Deep Neural Network Based Discriminative Training for I-Vector/PLDA Speaker Verification.

Guibin Zheng

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Heart sound classification based on scaled spectrogram and tensor decomposition.

Expert Syst. Appl., 2017

Heart sound classification based on scaled spectrogram and partial least squares regression.

Biomed. Signal Process. Control., 2017

Speaker Verification via Estimating Total Variability Space Using Probabilistic Partial Least Squares.

Yilin Pan

Proceedings of the Interspeech 2017, 2017

Learning Deep Neural Network Based Kernel Functions for Small Sample Size Classification.

Guibin Zheng

Proceedings of the Neural Information Processing - 24th International Conference, 2017

Towards Heart Sound Classification Without Segmentation Using Convolutional Neural Network.

Proceedings of the Computing in Cardiology, 2017

2016

Signal Periodic Decomposition With Conjugate Subspaces.

IEEE Trans. Signal Process., 2016

Sparse Decomposition for Signal Periodic Model Over Complex Exponential Dictionary.

IEEE Signal Process. Lett., 2016

Speaker Verification via Modeling Kurtosis Using Sparse Coding.

Int. J. Pattern Recognit. Artif. Intell., 2016

Optimization of learned dictionary for sparse coding in speech processing.

Guanglu Sun

Neurocomputing, 2016

Towards heart sound classification without segmentation via autocorrelation feature and diffusion maps.

Future Gener. Comput. Syst., 2016

Towards optimal vlad for human action recognition from still images.

Lei Zhang

Xiantong Zhen

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Realistic human action recognition: When deep learning meets VLAD.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Abnormal Heart Sounds detection based on the Scaled Time-Frequency Representation and Feature Selection.

Proceedings of the Computing in Cardiology, CinC 2016, Vancouver, 2016

2015

Soft Margin Based Low-Rank Audio Signal Classification.

Neural Process. Lett., 2015

Dictionary evaluation and optimization for sparse coding based speech processing.

Inf. Sci., 2015

Spectrum enhancement with sparse coding for robust speech recognition.

Guanglu Sun

Digit. Signal Process., 2015

Ramanujan subspace pursuit for signal periodic decomposition.

CoRR, 2015

Noise-robust speaker recognition based on morphological component analysis.

Proceedings of the INTERSPEECH 2015, 2015

2014

Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptation for Spoken Term Detection.

IEICE Trans. Inf. Syst., 2014

A new framework for robust speech recognition in complex channel environments.

Digit. Signal Process., 2014

Sparse Representation with Optimized Learned Dictionary for Robust Voice Activity Detection.

Circuits Syst. Signal Process., 2014

Evaluation of dictionary for sparse coding in speech processing.

Proceedings of the INTERSPEECH 2014, 2014

Learning semantic kernels for scene classification.

Proceedings of the IEEE International Conference on Acoustics, 2014

Robust minimum statistics project coefficients feature for acoustic environment recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Audio classification with low-rank matrix representation features.

ACM Trans. Intell. Syst. Technol., 2013

Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models.

IEEE Trans. Speech Audio Process., 2013

Audio Segment Classification Using Online Learning Based Tensor Representation Feature Discrimination.

IEEE Trans. Speech Audio Process., 2013

Statistical voice activity detection based on sparse representation over learned dictionary.

Digit. Signal Process., 2013

Guarantees of Augmented Trace Norm Models in Tensor Recovery.

Proceedings of the IJCAI 2013, 2013

Case based reasoning solution to the problem of sustained learning in keyword spotting.

Proceedings of the IEEE International Conference on Acoustics, 2013

Upper and lower bounds for approximation of the Kullback-Leibler divergence between Hidden Markov models.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Sparse-Based auditory Model for robust speaker Recognition.

Int. J. Pattern Recognit. Artif. Intell., 2012

Likelihood ratio sign test for voice activity detection.

IET Signal Process., 2012

Identifiability of multivariate logistic mixture models

CoRR, 2012

Guarantees of Augmented Trace Norm Models in Tensor Recovery

CoRR, 2012

Low-rank Audio Signal Classification Under Soft Margin and Trace Norm Constraints.

Proceedings of the INTERSPEECH 2012, 2012

A Novel Confidence Measure Based on Context Consistency for Spoken Term Detection.

Proceedings of the INTERSPEECH 2012, 2012

Sparse power spectrum based robust voice activity detector.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

A solution to residual noise in speech denoising with sparse representation.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Gaussian Specific Compensation for Channel Distortion in Speech Recognition.

IEEE Signal Process. Lett., 2011

MAP-based Audio Coding Compensation for Speaker Recognition.

Tao Jiang

J. Signal Inf. Process., 2011

Voice activity detection based on conjugate subspace matching pursuit and likelihood ratio test.

EURASIP J. Audio Speech Music. Process., 2011

Online Learning for Classification of Low-rank Representation Features and Its Applications in Audio Segment Classification

CoRR, 2011

Trace Norm Regularized Tensor Classification and Its Online Learning Approaches

CoRR, 2011

Heterogeneous mixture models using sparse representation features for applause and laugh detection.

Proceedings of the 2011 IEEE International Workshop on Machine Learning for Signal Processing, 2011

Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs.

Proceedings of the INTERSPEECH 2011, 2011

AUC Optimization Based Confidence Measure for Keyword Spotting.

Haiyang Li

Proceedings of the INTERSPEECH 2011, 2011

A Novel Framework Based on Trace Norm Minimization for Audio Event Detection.

Proceedings of the Neural Information Processing - 18th International Conference, 2011

A cochlear neuron based robust feature for speaker recognition.

Proceedings of the IEEE International Conference on Acoustics, 2011

Compensation of partly reliable components for band-limited speech recognition with missing data techniques.

Proceedings of the IEEE International Conference on Acoustics, 2011

A modified MAP criterion based on hidden Markov model for voice activity detecion.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Particle-based realistic simulation of fluid-solid interaction.

Hongquan Sun

Comput. Animat. Virtual Worlds, 2010

Study on the Recognition of Objectionable Audio.

Int. J. Pattern Recognit. Artif. Intell., 2010

Compensation of signal with erasures via sparse representation into its significant subspace.

Proceedings of the 10th International Conference on Information Sciences, 2010

Model synthesis for band-limited speech recognition.

Proceedings of the INTERSPEECH 2010, 2010

Robust statistical voice activity detection using a likelihood ratio sign test.

Proceedings of the INTERSPEECH 2010, 2010

Voice Activity Detection Based on Complex Exponential Atomic Decomposition and Likelihood Ratio Test.

Proceedings of the 20th International Conference on Pattern Recognition, 2010

2009

Speaker identification and verification from audio coded speech in matched and mismatched conditions.

Tao Jiang

Boyang Gao

Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2009

A Fast Audio Retrieval Method Based on Negativity Judgment.

Proceedings of the Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), 2009

2008

Text-independent Speaker Identification Based on MAP Channel Compensation and Pitch-dependent Features.

Rongchun Gao

Proceedings of the 2008 International Conference on Information & Knowledge Engineering, 2008

2007

Automatic conversion from lexical words to prosodic words for mandarin text-to-speech system.

Int. J. Speech Technol., 2007

2006

Automatic Music Transcription Based on Harmonic Structure Information.

Guibin Zheng

J. Comput. Res. Dev., 2006

Improved Mandarin Speech Recognition by Lattice Rescoring with Enhanced Tone Models.

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

A multi-space distribution (MSD) approach to speech recognition of tonal languages.

Proceedings of the INTERSPEECH 2006, 2006

2005

Modifying Spectral Envelope to Synthetically Adjust Voice Quality and Articulation Parameters for Emotional Speech Synthesis.

Proceedings of the Affective Computing and Intelligent Interaction, 2005

2002

Sharpe Ratio-Oriented Active Trading: A Learning Approach.

Yang Liu

Xiaohui Yu

Proceedings of the MICAI 2002: Advances in Artificial Intelligence, 2002

2001

Robust Speech Recognition Method Based on Discriminative Environment Feature Extraction.

Wen Gao

J. Comput. Sci. Technol., 2001

2000

An environment model-based robust speech recognition.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

Robust telephone speech recognition based on channel compensation.
