Ming Tu

This is a disambiguation page: it lists papers by several people who share the same or a similar name.

Bibliography

2026

FlowPortrait: Reinforcement Learning for Audio-Driven Portrait Video Generation.

CoRR, March, 2026

Identity-Preserving Covert Communication With Generative Perturbation.

IEEE Trans. Netw. Sci. Eng., 2026

2025

Process-Supervised Reinforcement Learning for Interactive Multimodal Tool-Use Agents.

CoRR, September, 2025

Concealing Radio Frequency Fingerprints via Active Adversarial Perturbation.

IEEE Trans. Netw. Sci. Eng., 2025

2024

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition.

CoRR, 2024

Erasing Radio Frequency Fingerprints via Active Adversarial Perturbation.

CoRR, 2024

VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing.

CoRR, 2024

Physical Layer Overshadowing Attack on Semantic Communication System.

Proceedings of the IEEE International Conference on Communications, 2024

The Hybrid Diagnosability of Hypercube Under the HMM* (Hybrid MM*) Model.

Proceedings of the Computing and Combinatorics - 30th International Conference, 2024

2023

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition.

CoRR, 2023

Efficient Neural Music Generation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Language-universal Phonetic Encoder for Low-resource Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Memory Augmented Lookup Dictionary Based Language Modeling for Automatic Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Streaming Voice Conversion via Intermediate Bottleneck Features and Non-Streaming Teacher Guidance.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Cloning One's Voice Using Very Limited Data in the Wild.

Proceedings of the IEEE International Conference on Acoustics, 2022

2020

Graph Sequential Network for Reasoning over Sequences.

CoRR, 2020

Linear-Quadratic Tracking Control of a Commercial Vehicle Air Brake System.

IEEE Access, 2020

Speaker-Invariant Affective Representation Learning via Adversarial Training.

Panayiotis G. Georgiou

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Self-Supervised Audio-Visual Representation Learning for in-the-wild Videos.

Ashok K. Krishnamurthy

Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

Select, Answer and Explain: Interpretable Multi-Hop Reading Comprehension over Multiple Documents.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Articulation constrained learning with application to speech emotion recognition.

EURASIP J. Audio Speech Music. Process., 2019

Multiple instance learning with graph neural networks.

CoRR, 2019

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences.

CoRR, 2019

Towards adversarial learning of speaker-invariant representation for speech emotion recognition.

CoRR, 2019

Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs.

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018

A Discriminative Acoustic-Prosodic Approach for Measuring Local Entrainment.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigating the Role of L1 in Automatic Pronunciation Evaluation of L2 Speech.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Simulating Dysarthric Speech for Training Data Augmentation in Clinical Speech Applications.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Improving efficiency in sparse learning with the feedforward inhibitory motif.

Neurocomputing, 2017

Interpretable Objective Assessment of Dysarthric Speech Based on Deep Neural Networks.

Ming Tu, Visar Berisha, Julie Liss

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Speech enhancement based on Deep Neural Networks with skip connections.

Ming Tu, Xianxian Zhang

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Objective assessment of pathological speech using distribution regression.

Ming Tu, Visar Berisha, Julie Liss

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

Reducing the Model Order of Deep Neural Networks Using Information Theory.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Ranking the parameters of deep neural networks using the fisher information.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Online speaking rate estimation using recurrent neural networks.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Models for objective evaluation of dysarthric speech from data annotated by multiple listeners.

Proceedings of the 50th Asilomar Conference on Signals, Systems and Computers, 2016

2015

Convex Weighting Criteria for Speaking Rate Estimation.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Estimating speaking rate in spontaneous discourse.

Proceedings of the 49th Asilomar Conference on Signals, Systems and Computers, 2015

2014

Towards improving statistical model based voice activity detection.

Ming Tu, Xiang Xie, Yishan Jiao

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Computational Auditory Scene Analysis Based Voice Activity Detection.

Ming Tu, Xiang Xie, Xingyu Na

Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Improving voice quality of HMM-based speech synthesis using voice conversion method.

Proceedings of the IEEE International Conference on Acoustics, 2014

2012

OpenCDS ePHR: an Open-Source, Standards-Based Decision Support Platform for Electronic Public Health Reporting.

Proceedings of the AMIA 2012, 2012
