Shansong Liu

Orcid: 0000-0001-6202-5615

According to our database, Shansong Liu authored at least 37 papers between 2017 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Unison: Harmonizing Motion, Speech, and Sound for Human-Centric Audio-Video Generation.
CoRR, May, 2026

High-Fidelity Generative Audio Compression at 0.275kbps.
CoRR, February, 2026

MuMu-LLaMA: Multi-modal music understanding and generation via large language models.
Expert Syst. Appl., 2026

2025
Rare Word Recognition and Translation Without Fine-Tuning via Task Vector in Speech Models.
CoRR, December, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation.
CoRR, March, 2025

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024
Unified Pretraining Target Based Video-Music Retrieval with Music Rhythm and Video Optical Flow Information.
Proceedings of the IEEE International Conference on Acoustics, 2024

HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond.
Proceedings of the IEEE International Conference on Acoustics, 2024

Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
M²UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models.
CoRR, 2023

Prosody Modeling with 3D Visual Information for Expressive Video Dubbing.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Recent Progress in the CUHK Dysarthric Speech Recognition System.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Adversarial Data Augmentation for Disordered Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Development of the CUHK Elderly Speech Recognition System for Neurocognitive Disorder Detection Using the DementiaBank Corpus.
Proceedings of the IEEE International Conference on Acoustics, 2021

Bayesian Transformer Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Neural Architecture Search for Speech Recognition.
CoRR, 2020

Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Investigation of Data Augmentation Techniques for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On the Use of Pitch Features for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The CUHK Dysarthric Speech Recognition Systems for English and Cantonese.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Comprehensive simulation of metagenomic sequencing data with non-uniform sampling distribution.
Quant. Biol., 2018

Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Gaussian Process Neural Networks for Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Reading the Underlying Information From Massive Metagenomic Sequencing Data.
Proc. IEEE, 2017

