Alexander H. Liu

ORCID: 0000-0003-1628-0855

Affiliations:
  • Massachusetts Institute of Technology (MIT), Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
  • National Taiwan University, Taipei, Taiwan (former)


According to our database, Alexander H. Liu authored at least 30 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Towards audio language modeling - an overview.
CoRR, 2024

Codec-SUPERB: An In-Depth Analysis of Sound Codec Models.
CoRR, 2024

Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective.
CoRR, 2024

2023
Generative Pre-training for Speech with Flow Matching.
CoRR, 2023

Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering.
CoRR, 2023

Listen, Think, and Understand.
CoRR, 2023

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Contrastive Audio-Visual Masked Autoencoder.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Joint Audio and Speech Understanding.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
A Fully Integrated 1.7mW Attention-Based Automatic Speech Recognition Processor.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

UAVM: Towards Unifying Audio and Visual Models.
IEEE Signal Process. Lett., 2022

UAVM: A Unified Model for Audio-Visual Learning.
CoRR, 2022

Towards End-to-End Unsupervised Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Simple and Effective Unsupervised Speech Synthesis.
Proceedings of the Interspeech 2022, 2022

On the Interplay between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022

Cross-Modal Discrete Representation Learning.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Improving Automatic Speech Recognition and Speech Translation via Word Embedding Prediction.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Routing with Self-Attention for Multimodal Capsule Networks.
CoRR, 2021

End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning.
CoRR, 2020

Semi-Supervised Learning for Multi-Speaker Text-to-Speech Synthesis Using Discrete Speech Representation.
Proceedings of the Interspeech 2020, 2020

Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Sequence-to-Sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

