Jinsong Zhang

Orcid: 0000-0002-1603-3136

Affiliations:
  • Beijing Language and Culture University, Beijing, China
  • TU Dresden, Institute of Acoustics and Speech Communication, Dresden, Germany (former)


According to our database, Jinsong Zhang authored at least 127 papers between 1996 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
LIMI-VC: A Light Weight Voice Conversion Model with Mutual Information Disentanglement.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Text-Aware End-to-end Mispronunciation Detection and Diagnosis.
CoRR, 2022

AdaptiveFormer: A Few-shot Speaker Adaptive Speech Synthesis Model based on FastSpeech2.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

An Entropy-based Study on the Acquisition of Mandarin Initial Consonants by Korean Learners.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

The Disyllabic Tone Production and Tone Context Effect in Mandarin-speaking Children with Cochlear Implants.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

An Exploratory Study for Quantifying the Contextual Information for Successful Chinese L2 Speech Comprehension.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

The Contribution of Phonological and Fluency Factors to Chinese L2 Comprehensibility Ratings: A Case Study of Urdu-speaking Learners.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Self-Supervised Learning with Multi-Target Contrastive Coding for Non-Native Acoustic Modeling of Mispronunciation Verification.
Proceedings of the Interspeech 2022, 2022

A VR Interactive 3D Mandarin Pronunciation Teaching Model.
Proceedings of the Interspeech 2022, 2022

A study of production error analysis for Mandarin-speaking Children with Hearing Impairment.
Proceedings of the Interspeech 2022, 2022

The Contributions of Initials and Finals to L2 Chinese Comprehensibility Based on Functional Load Principle.
Proceedings of the International Conference on Asian Language Processing, 2022

The Importance of Lexical Tone for Sentence Understanding: Utilizing Functional Load Principle to Simulate Comprehension Process.
Proceedings of the International Conference on Asian Language Processing, 2022

Solving Size and Performance Dilemma by Reversible and Invertible Recurrent Network for Speech Enhancement.
Proceedings of the 5th International Conference on Artificial Intelligence and Pattern Recognition, 2022

Voicifier-LN: An Novel Approach to Elevate the Speaker Similarity for General Zero-shot Multi-Speaker TTS.
Proceedings of the 5th International Conference on Artificial Intelligence and Pattern Recognition, 2022

2021
Tackling Perception Bias in Unsupervised Phoneme Discovery Using DPGMM-RNN Hybrid Model and Functional Load.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Non-native acoustic modeling for mispronunciation verification based on language adversarial representation learning.
Neural Networks, 2021

A Study of the Vowel Context Effect on Initial Stops of Mandarin Produced by Native and Nonnative Speakers.
Int. J. Asian Lang. Process., 2021

Speech Enhancement using Separable Polling Attention and Global Layer Normalization followed with PReLU.
CoRR, 2021

A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augmentation Techniques.
CoRR, 2021

A Comparison Study on the Alignment of Prosodic and Semantic Units and Its Effects on F0 Shifting in L1 and L2 English Spontaneous Speech.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Effects of Mandarin Tones on Acoustic Cue Weighting Patterns for Prominence.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

A Practical Way to Improve Automatic Phonetic Segmentation Performance.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Multi-Scale Model for Mandarin Tone Recognition.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Relationships Between Perceptual Distinctiveness, Articulatory Complexity and Functional Load in Speech Communication.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Preliminary Study on Discourse Prosody Encoding in L1 and L2 English Spontaneous Narratives.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Study on Fine-Tuning wav2vec2.0 Model for the Task of Mispronunciation Detection and Diagnosis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Towards the Use of Pretrained Language Model GPT-2 for Testing the Hypothesis of Communicative Efficiency in the Lexicon.
Proceedings of the International Conference on Asian Language Processing, 2021

A Preliminary Study on the Gender Differences of Mandarin Focal Accent.
Proceedings of the International Conference on Asian Language Processing, 2021

2020
Improving Pronunciation Erroneous Tendency Detection with Multi-Model Soft Targets.
J. Signal Process. Syst., 2020

A Study on the Robustness of Pitch-Range Estimation from Brief Speech Segments.
Int. J. Asian Lang. Process., 2020

Pronunciation Erroneous Tendency Detection with Language Adversarial Represent Learning.
Proceedings of the Interspeech 2020, 2020

A Mandarin L2 Learning APP with Mispronunciation Detection and Feedback.
Proceedings of the Interspeech 2020, 2020

Joint Detection of Sentence Stress and Phrase Boundary for Prosody.
Proceedings of the Interspeech 2020, 2020

Automatic Scoring at Multi-Granularity for L2 Pronunciation.
Proceedings of the Interspeech 2020, 2020

An Investigation of the Target Approximation Model for Tone Modeling and Recognition in Continuous Mandarin Speech.
Proceedings of the Interspeech 2020, 2020

Perception and Production of Mandarin Initial Stops by Native Urdu Speakers.
Proceedings of the Interspeech 2020, 2020

Formant Tracking Using Dilated Convolutional Networks Through Dense Connection with Gating Mechanism.
Proceedings of the Interspeech 2020, 2020

Production of L2 Mandarin Mono- and Di-syllabic Tones by Kazakh Learners of Different Levels.
Proceedings of the International Conference on Asian Language Processing, 2020

The Effect of Vowel Contexts on Voice Onset Time of Mandarin Word-Initial Stops.
Proceedings of the International Conference on Asian Language Processing, 2020

Gated Bilinear Networks for Vowel Formant Estimation.
Proceedings of the International Conference on Asian Language Processing, 2020

2019
Capturing L1 Influence on L2 Pronunciation by Simulating Perceptual Space Using Acoustic Features.
Proceedings of the Interspeech 2019, 2019

The Production of Chinese Affricates /ts/ and /tsʰ/ by Native Urdu Speakers.
Proceedings of the Interspeech 2019, 2019

Articulatory Features Based TDNN Model for Spoken Language Recognition.
Proceedings of the International Conference on Asian Language Processing, 2019

Improving Mandarin Prosody Boundary Detection by Using Phonetic Information and Deep LSTM Model.
Proceedings of the International Conference on Asian Language Processing, 2019

Are Scoring Feedback of CAPT Systems Helpful for Pronunciation Correction? -An Exception of Mandarin Nasal Finals.
Proceedings of the International Conference on Asian Language Processing, 2019

Oral Motor Exercises For CSL Learners to Master Productions of Retroflex And Non-Retroflex Consonants.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Multi-Task Based Mispronunciation Detection of Children Speech Using Multi-Lingual Information.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

A Study on Mispronunciation Detection Based on Fine-grained Speech Attribute.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Acquisition of L2 Mandarin Rhythm By Russian and Japanese Learners.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks.
J. Signal Process. Syst., 2018

Optimizing DPGMM Clustering in Zero Resource Setting Based on Functional Load.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Acoustic Comparison of Vowel Articulation When Combined with Different Tone Categories in Mandarin.
Proceedings of the 2018 Oriental COCOSDA, 2018

LSTM-Based Pitch Range Estimation from Spectral Information of Brief Speech Input.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

A Preliminary Study on Quantitative Calculation of Prosodic Strength in Mandarin Speech.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

A Study on Landmark Verification of Mandarin Alveolar-palatal Consonants.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

L2 Mispronunciation Verification Based on Acoustic Phone Embedding and Siamese Networks.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Improve the Accuracy of Non-native Speech Annotation with a Semi-automatic Approach.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

A Multi-modal Soft Targets Approach for Pronunciation Erroneous Tendency Detection.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Emotional Prosody Perception in Mandarin-speaking Congenital Amusics.
Proceedings of the Interspeech 2018, 2018

Improving Mandarin Tone Recognition Using Convolutional Bidirectional Long Short-Term Memory with Attention.
Proceedings of the Interspeech 2018, 2018

Analysis of L2 Learners' Progress of Distinguishing Mandarin Tone 2 and Tone 3.
Proceedings of the Interspeech 2018, 2018

A Preliminary Study on Tonal Coarticulation in Continuous Speech.
Proceedings of the Interspeech 2018, 2018

Interactions between Vowels and Nasal Codas in Mandarin Speakers' Perception of Nasal Finals.
Proceedings of the Interspeech 2018, 2018

2017
Articulatory Modeling for Pronunciation Error Detection without Non-Native Training Data Based on DNN Transfer Learning.
IEICE Trans. Inf. Syst., 2017

Reanalyze Fundamental Frequency Peak Delay in Mandarin.
Proceedings of the Interspeech 2017, 2017

The Influence on Realization and Perception of Lexical Tones from Affricate's Aspiration.
Proceedings of the Interspeech 2017, 2017

Effective articulatory modeling for pronunciation error detection of L2 learner without non-native training data.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Improving pronunciation erroneous tendency detection with convolutional long short-term memory.
Proceedings of the 2017 International Conference on Asian Language Processing, 2017

A study on quantitative computation for prosodic strength of Mandarin speech.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A study of automatic annotation of PETs with articulatory features.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A study on landmark detection based on CTC and its application to pronunciation error detection.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A comparison study of information contributions of phonemic contrasts in Mandarin.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Automatic detection of rhythmic patterns in native and L2 speech: Chinese, Japanese, and Japanese L2 Chinese.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Acoustic correlates and gender effects in production and perception of Japanese polite speech.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Senone log-likelihood ratios based articulatory features in pronunciation erroneous tendency detecting.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Automatic Mandarin prosody boundary detecting based on tone nucleus features and DNN model.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Improving Mandarin tone recognition based on DNN by combining acoustic and articulatory features.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A study on perceptual training of Japanese CSL learners to discriminate Mandarin lexical tones.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

The perceptual cues for nasal finals in standard Chinese.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Pronunciation error detection using DNN articulatory model based on multi-lingual and multi-task learning.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A study on functional load of Chinese prosodic boundaries under reduction of syllable information.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

The preliminary study of influence on tone perception from segments.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Analysis of Chinese Syllable Durations in Running Speech of Japanese L2 Learners.
Proceedings of the Interspeech 2016, 2016

Automatic Pronunciation Evaluation of Non-Native Mandarin Tone by Using Multi-Level Confidence Measures.
Proceedings of the Interspeech 2016, 2016

Landmark of Mandarin nasal codas and its application in pronunciation error detection.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

DNN based detection of pronunciation erroneous tendency in data sparse condition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Multi-lingual and multi-task DNN learning for articulatory error detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Phoneme Set Design for Speech Recognition of English by Japanese.
IEICE Trans. Inf. Syst., 2015

A comparison study on contextual modeling for estimating functional loads of phonological contrasts.
Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015

Coda's duration on perception of mandarin syllables with alveolar/velar nasal endings by Japanese CSL learners.
Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015

A study on robust detection of pronunciation erroneous tendency based on deep neural network.
Proceedings of the INTERSPEECH 2015, 2015

2014
Phoneme Set Design Using English Speech Database by Japanese for Dialogue-Based English CALL Systems.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

The training of the tone of Mandarin two-syllable words based on pitch projection synthesis speech.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Influences of vowels on perception of nasal codas in Mandarin for Japanese learners and Chinese.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Cross-language comparison of F0 range in speakers of native Chinese, native Japanese and Chinese L2 of Japanese: Preliminary results of a corpus-based analysis.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A Study on the long-term retention effects of Japanese C2L learners to distinguish Mandarin Chinese Tone 2 and Tone 3 after perceptual training.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Automatic mispronunciation detection for Mandarin Chinese based on articulation place and articulation manner.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A preliminary study on acoustic correlates of tone2+tone2 disyllabic word stress in Mandarin.
Proceedings of the INTERSPEECH 2014, 2014

A preliminary study on ASR-based detection of Chinese mispronunciation by Japanese learners.
Proceedings of the INTERSPEECH 2014, 2014

A clustering analysis of Chinese consonants based on functional load.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
Improve Japanese C2L learners' capability to distinguish Chinese tone 2 and tone 3 through perceptual training.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

Using Mutual Information Criterion to Design an Effective Lexicon for Chinese Pinyin-to-Character Conversion.
Proceedings of the 2013 International Conference on Asian Language Processing, 2013

2012
A comparative study of perception of tone 2 and tone 3 in Mandarin by native speakers and Japanese learners.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

2010
A distinctive feature based method for evaluating the phonetic transcription of a non-native speech database.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

A study on Functional Loads of phonetic contrasts under context based on Mutual Information of Chinese text and phonemes.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training.
Proceedings of the INTERSPEECH 2010, 2010

2008
An Improved Greedy Search Algorithm for the Development of a Phonetically Rich Speech Corpus.
IEICE Trans. Inf. Syst., 2008

Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition.
IEICE Trans. Inf. Syst., 2008

Tone Recognition of Continuous Mandarin Speech Based on Tone Nucleus Model and Neural Network.
IEICE Trans. Inf. Syst., 2008

Utilization of Huge Written Text Corpora for Conversational Speech Recognition.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

2006
The ATR multilingual speech-to-speech translation system.
IEEE Trans. Speech Audio Process., 2006

Automatic Derivation of a Phoneme Set with Tone Information for Chinese Speech Recognition Based on Mutual Information Criterion.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Tone nucleus-based multi-level robust acoustic tonal modeling of sentential F0 variations for Chinese continuous speech tone recognition.
Speech Commun., 2005

2004
Tone nucleus modeling for Chinese lexical tone recognition.
Speech Commun., 2004

Multi-lingual speech recognition system for speech-to-speech translation.
Proceedings of the 2004 International Workshop on Spoken Language Translation, 2004

Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating features.
Proceedings of the INTERSPEECH 2004, 2004

A study on robust segmentation and location of tone nuclei in Chinese continuous speech.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
A multilevel framework to model the inherently confounding nature of sentential F0 contours for recognizing Chinese lexical tones.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Modeling varying pauses to develop robust acoustic models for recognizing noisy conversational speech.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Weighted graph based decision tree optimization for high accuracy acoustic modeling.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001
A hybrid approach to enhance task portability of acoustic models in Chinese speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
Discriminating Chinese lexical tones by anchoring F0 features.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Anchoring hypothesis and its application to tone recognition of Chinese continuous speech.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Tone recognition of Chinese continuous speech using tone critical segments.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
A robust tone recognition method of Chinese based on sub-syllabic F0 contours.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1996
Adaptive recognition method based on posterior use of distribution pattern of output probabilities.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

