Bilei Zhu

According to our database, Bilei Zhu authored at least 26 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
ByteComposer: a Human-like Melody Composition Method based on Language Model Agent.
CoRR, 2024

MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning.
CoRR, 2024

2023
Graph contrastive learning with implicit augmentations.
Neural Networks, 2023

Joint Music and Language Attention Models for Zero-shot Music Tagging.
CoRR, 2023

Bytecover3: Accurate Cover Song Identification On Short Queries.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
GIO: A Timbre-informed Approach for Pitch Tracking in Highly Noisy Environments.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Latent feature augmentation for chorus detection.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

S3T: Self-Supervised Pre-Training with Swin Transformer For Music Classification.
Proceedings of the IEEE International Conference on Acoustics, 2022

Bytecover2: Towards Dimensionality Reduction of Latent Embedding for Efficient Cover Song Identification.
Proceedings of the IEEE International Conference on Acoustics, 2022

HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection.
Proceedings of the IEEE International Conference on Acoustics, 2022

Zero-Shot Audio Source Separation through Query-Based Learning from Weakly-Labeled Data.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Attention-Based Cross-Modal Fusion for Audio-Visual Voice Activity Detection in Musical Video Streams.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Rule-Embedded Network for Audio-Visual Voice Activity Detection in Live Musical Video Streams.
Proceedings of the IEEE International Conference on Acoustics, 2021

An HRNet-BLSTM Model With Two-Stage Training For Singing Melody Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2021

Singing Melody Extraction from Polyphonic Music based on Spectral Correlation Modeling.
Proceedings of the IEEE International Conference on Acoustics, 2021

Bytecover: Cover Song Identification Via Multi-Loss Training.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Contrastive Unsupervised Learning for Audio Fingerprinting.
CoRR, 2020

2019
Vocal Melody Extraction via DNN-based Pitch Estimation and Salience-based Pitch Refinement.
Proceedings of the IEEE International Conference on Acoustics, 2019

2017
Fusing transcription results from polyphonic and monophonic audio for singing melody transcription in polyphonic music.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2015
SIFT-based local spectrogram image descriptor: a novel feature for robust music identification.
EURASIP J. Audio Speech Music. Process., 2015

Towards Solving the Bottleneck of Pitch-based Singing Voice Separation.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Latent time-frequency component analysis: A novel pitch-based approach for singing voice separation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2013
Multi-Stage Non-Negative Matrix Factorization for Monaural Singing Voice Separation.
IEEE Trans. Speech Audio Process., 2013

2012
On the music content authentication.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

2010
A novel audio fingerprinting method robust to time scale modification and pitch shifting.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Robust hashing for music copyright protection by combining beat segmentation and chroma.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

