Haohe Liu

According to our database, Haohe Liu authored at least 45 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift.
CoRR, 2024

Learning Temporal Resolution in Spectrogram for Audio Classification.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Learning to detect an animal sound from five examples.
Ecol. Informatics, November, 2023

Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation.
CoRR, 2023

First-Shot Unsupervised Anomalous Sound Detection With Unknown Anomalies Estimated by Metadata-Assisted Audio Generation.
CoRR, 2023

Synth-AC: Enhancing Audio Captioning with Synthetic Supervision.
CoRR, 2023

Retrieval-Augmented Text-to-Audio Generation.
CoRR, 2023

AudioSR: Versatile Audio Super-resolution at Scale.
CoRR, 2023

Multimodal Fish Feeding Intensity Assessment in Aquaculture.
CoRR, 2023

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining.
CoRR, 2023

Separate Anything You Describe.
CoRR, 2023

MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies.
CoRR, 2023

WavJourney: Compositional Audio Creation with Large Language Models.
CoRR, 2023

Text-Driven Foley Sound Generation With Latent Diffusion Model.
CoRR, 2023

E-PANNs: Sound Recognition Using Efficient Pre-trained Audio Neural Networks.
CoRR, 2023

Adapting Language-Audio Models as Few-Shot Audio Learners.
CoRR, 2023

Latent Diffusion Model Based Foley Sound Generation System For DCASE Challenge 2023 Task 7.
CoRR, 2023

Learning to detect an animal sound from five examples.
CoRR, 2023

Universal Source Separation with Weakly Labelled Data.
CoRR, 2023

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research.
CoRR, 2023

Leveraging Pre-trained AudioLDM for Text to Sound Generation: A Benchmark Study.
CoRR, 2023

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models.
Proceedings of the International Conference on Machine Learning, 2023

Simple Pooling Front-Ends for Efficient Audio Classification.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Pre-Trained AudioLDM for Sound Generation: A Benchmark Study.
Proceedings of the 31st European Signal Processing Conference, 2023

2022
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech.
CoRR, 2022

Ontology-aware Learning and Evaluation for Audio Tagging.
CoRR, 2022

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention.
CoRR, 2022

Learning the Spectrogram Temporal Resolution for Audio Classification.
CoRR, 2022

Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning.
CoRR, 2022

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis.
CoRR, 2022

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality.
CoRR, 2022

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Audio Visual Multi-Speaker Tracking with Improved GCF and PMBM Filter.
Proceedings of the Interspeech 2022, 2022

VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration.
Proceedings of the Interspeech 2022, 2022

Separate What You Describe: Language-Queried Audio Source Separation.
Proceedings of the Interspeech 2022, 2022

Neural Vocoder is All You Need for Speech Super-resolution.
Proceedings of the Interspeech 2022, 2022

Leveraging Pre-trained BERT for Audio Captioning.
Proceedings of the 30th European Signal Processing Conference, 2022

Segment-Level Metric Learning for Few-Shot Bioacoustic Event Detection.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021
CWS-PResUNet: Music Source Separation with Channel-wise Subband Phase-aware ResUNet.
CoRR, 2021

VoiceFixer: Toward General Speech Restoration With Neural Vocoder.
CoRR, 2021

Joint Echo Cancellation and Noise Suppression based on Cascaded Magnitude and Complex Mask Estimation.
CoRR, 2021

Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation.
Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021

Speech Enhancement with Weakly Labelled Data from AudioSet.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

2020
Channel-Wise Subband Input for Better Voice and Accompaniment Separation on High Resolution Music.
Proceedings of the Interspeech 2020, 2020

2019
Design and Visualization of Guided GAN on MNIST dataset.
Proceedings of the 3rd International Conference on Graphics and Signal Processing, 2019

