Ning Cheng

Orcid: 0000-0002-0988-5023

Affiliations:
  • Ping An Technology (Shenzhen) Co., Ltd., China
  • Chinese Academy of Sciences, Institute of Automation, Beijing, China (former)
  • Chinese Academy of Sciences, Shenzhen Institute of Advanced Technology, China (former)
  • University of the Chinese Academy of Sciences (UCAS), Beijing, China (PhD 2009)


According to our database, Ning Cheng authored at least 97 papers between 2008 and 2024.

Bibliography

2024
Medical Speech Symptoms Classification via Disentangled Representation.
CoRR, 2024

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning.
CoRR, 2024

Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning.
CoRR, 2024

ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis.
CoRR, 2024

Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval.
CoRR, 2024

EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model.
CoRR, 2024

2023
DiffTalker: Co-driven audio-image diffusion for talking faces via intermediate landmarks.
CoRR, 2023

Machine Unlearning Methodology base on Stochastic Teacher Network.
CoRR, 2023

Symbolic & Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music.
CoRR, 2023

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning.
CoRR, 2023

Boosting Chinese ASR Error Correction with Dynamic Error Scaling Mechanism.
CoRR, 2023

Prompt Guided Copy Mechanism for Conversational Question Answering.
CoRR, 2023

EmoMix: Emotion Mixing via Diffusion Models for Emotional Speech Synthesis.
CoRR, 2023

PMVC: Data Augmentation-Based Prosody Modeling for Expressive Voice Conversion.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model.
Proceedings of the International Joint Conference on Neural Networks, 2023

FastGraphTTS: An Ultrafast Syntax-Aware Speech Synthesis Framework.
Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, 2023

AOSR-Net: All-in-One Sandstorm Removal Network.
Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, 2023

Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval.
Proceedings of the 35th IEEE International Conference on Tools with Artificial Intelligence, 2023

Improving EEG-based Emotion Recognition by Fusing Time-Frequency and Spatial Representations.
Proceedings of the IEEE International Conference on Acoustics, 2023

Dynamic Alignment Mask CTC: Improved Mask CTC With Aligned Cross Entropy.
Proceedings of the IEEE International Conference on Acoustics, 2023

Efficient Uncertainty Estimation with Gaussian Process for Reliable Dialog Response Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2023

QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2023

VQ-CL: Learning Disentangled Speech Representations with Contrastive Learning and Vector Quantization.
Proceedings of the IEEE International Conference on Acoustics, 2023

Learning Speech Representations with Flexible Hidden Feature Dimensions.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Music Genre Classification from multi-modal Properties of Music and Genre Correlations Perspective.
Proceedings of the IEEE International Conference on Acoustics, 2023

PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

DQR-TTS: Semi-supervised Text-to-speech Synthesis with Dynamic Quantized Representation.
Proceedings of the IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2023

CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding.
Proceedings of the IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2023

CLN-VC: Text-Free Voice Conversion Based on Fine-Grained Style Control and Contrastive Learning with Negative Samples Augmentation.
Proceedings of the IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2023

Symbolic and Acoustic: Multi-domain Music Emotion Modeling for Instrumental Music.
Proceedings of the Advanced Data Mining and Applications - 19th International Conference, 2023

Voice Conversion with Denoising Diffusion Probabilistic GAN Models.
Proceedings of the Advanced Data Mining and Applications - 19th International Conference, 2023

Machine Unlearning Methodology Based on Stochastic Teacher Network.
Proceedings of the Advanced Data Mining and Applications - 19th International Conference, 2023

On the Calibration and Uncertainty with Pólya-Gamma Augmentation for Dialog Retrieval Models.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Boosting Star-GANs for Voice Conversion with Contrastive Discriminator.
CoRR, 2022

Linguistic-Enhanced Transformer with CTC Embedding for Speech Recognition.
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022

Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach.
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022

Improving Imbalanced Text Classification with Dynamic Curriculum Learning.
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022

Semi-Supervised Learning Based on Reference Model for Low-resource TTS.
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022

MetaSpeech: Speech Effects Switch Along with Environment for Metaverse.
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022

Adapitch: Adaption Multi-Speaker Text-to-Speech Conditioned on Pitch Disentangling with Untranscribed Data.
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022

Uncertainty Calibration for Deep Audio Classifiers.
Proceedings of the Interspeech 2022, 2022

Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion.
Proceedings of the Interspeech 2022, 2022

Tiny-Sepformer: A Tiny Time-Domain Transformer Network For Speech Separation.
Proceedings of the Interspeech 2022, 2022

MetaSID: Singer Identification with Domain Adaptation for Metaverse.
Proceedings of the International Joint Conference on Neural Networks, 2022

Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features.
Proceedings of the International Joint Conference on Neural Networks, 2022

TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS.
Proceedings of the International Joint Conference on Neural Networks, 2022

MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification.
Proceedings of the International Joint Conference on Neural Networks, 2022

SUSing: SU-net for Singing Voice Synthesis.
Proceedings of the International Joint Conference on Neural Networks, 2022

Adaptive Activation Network for Low Resource Multilingual Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2022

Speech Augmentation Based Unsupervised Learning for Keyword Spotting.
Proceedings of the International Joint Conference on Neural Networks, 2022

Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar.
Proceedings of the 34th IEEE International Conference on Tools with Artificial Intelligence, 2022

Blur the Linguistic Boundary: Interpreting Chinese Buddhist Sutra in English via Neural Machine Translation.
Proceedings of the 34th IEEE International Conference on Tools with Artificial Intelligence, 2022

Boosting StarGANs for Voice Conversion with Contrastive Discriminator.
Proceedings of the Neural Information Processing - 29th International Conference, 2022

nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker text-to-speech.
Proceedings of the IEEE International Conference on Acoustics, 2022

Self-Attention for Incomplete Utterance Rewriting.
Proceedings of the IEEE International Conference on Acoustics, 2022

VU-BERT: A Unified Framework for Visual Dialog.
Proceedings of the IEEE International Conference on Acoustics, 2022

DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

Avqvc: One-Shot Voice Conversion By Vector Quantization With Applying Contrastive Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

Supervised Contrastive Meta-learning for Few-Shot Classification.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

Shallow Diffusion Motion Model for Talking Face Generation from Speech.
Proceedings of the Web and Big Data - 6th International Joint Conference, 2022

2021
MelGlow: Efficient Waveform Generative Network Based On Location-Variable Convolution.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

End-To-End Silent Speech Recognition with Acoustic Sensing.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Multi-Quartznet: Multi-Resolution Convolution for Speech Recognition with Multi-Layer Feature Fusion.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Semantic Embedding Graph Convolutional Networks for Multi-label Video Segment Classification.
Proceedings of the 12th International Symposium on Parallel Architectures, 2021

Variational Information Bottleneck for Effective Low-Resource Audio Classification.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Speech2Video: Cross-Modal Distillation for Speech to Video Generation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

CACnet: Cube Attentional CNN for Automatic Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2021

Semantic Extraction for Sentence Representation via Reinforcement Learning.
Proceedings of the International Joint Conference on Neural Networks, 2021

A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2021

Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2021

Loss Prediction: End-to-End Active Learning Approach For Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2021

Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation.
Proceedings of the IEEE International Conference on Acoustics, 2021

Unidirectional Memory-Self-Attention Transducer for Online Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Joint Intent Detection and Slot Filling Based on Continual Learning Model.
Proceedings of the IEEE International Conference on Acoustics, 2021

Cyclegean: Cycle Generative Enhanced Adversarial Network for Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

TGAVC: Improving Autoencoder Voice Conversion with Text-Guided and Adversarial Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Reconstructing Dual Learning for Neural Voice Conversion Using Relatively Few Samples.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Self-supervised Learning for Semantic Sentence Matching with Dense Transformer Inference Network.
Proceedings of the Web and Big Data - 5th International Joint Conference, 2021

A Novel Capsule Aggregation Framework for Natural Language Inference.
Proceedings of the Web and Big Data - 5th International Joint Conference, 2021

2020
Applying wav2vec2.0 to Speech Recognition in various low-resource languages.
CoRR, 2020

MLNET: An Adaptive Multiple Receptive-Field Attention Neural Network for Voice Activity Detection.
Proceedings of the Interspeech 2020, 2020

Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit.
Proceedings of the Interspeech 2020, 2020

A Real-Time Robot-Based Auxiliary System for Risk Evaluation of COVID-19 Infection.
Proceedings of the Interspeech 2020, 2020

Large-Scale Transfer Learning for Low-Resource Spoken Language Understanding.
Proceedings of the Interspeech 2020, 2020

Aligntts: Efficient Feed-Forward Text-to-Speech System Without Explicit Alignment.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

GraphTTS: Graph-to-Sequence Modelling in Neural Text-to-Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Chinese Punctuation Prediction with Adaptive Attention and Dependency Tree.
Proceedings of the Knowledge Graph and Semantic Computing: Knowledge Graph and Cognitive Intelligence, 2020

Epidemic Guard: A COVID-19 Detection System for Elderly People.
Proceedings of the Web and Big Data - 4th International Joint Conference, 2020

2011
A flexible framework for HMM based noise robust speech recognition using generalized parametric space polynomial regression.
Sci. China Inf. Sci., 2011

Generalized Variable Parameter HMMs for Noise Robust Speech Recognition.
Proceedings of the INTERSPEECH 2011, 2011

2010
A novel subspace speech enhancement approach based on test of hypothesis and masking properties.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Masking property based microphone array post-filter design.
Proceedings of the INTERSPEECH 2010, 2010

2008
Microphone Array Post-Filter Based on Auditory Filtering.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

An effective microphone array post-filter in arbitrary environments.
Proceedings of the INTERSPEECH 2008, 2008

