Wei Li

Min Sheng

CoRR, May, 2026

On Capacity and Delay of Wireless Networks with Node Failures.

CoRR, May, 2026

Teaching Audio-Language Models to Reason over Time.

Proceedings of the 2026 International Conference on Multimedia Retrieval, 2026

Efficient Music Denoising with Channel Attention and Multi-Scale Sequence Encoding.

Seungmin Ha

Yulun Wu

Proceedings of the 2026 International Conference on Multimedia Retrieval, 2026

2025

ABC-Eval: Benchmarking Large Language Models on Symbolic Music Understanding and Instruction Following.

CoRR, September, 2025

Robust Throughput Capacity of Multi-Connectivity Wireless Networks.

IEEE Trans. Commun., June, 2025

CCMusic: An Open and Diverse Database for Chinese Music Information Retrieval Research.

Trans. Int. Soc. Music. Inf. Retr., 2025

Adversarial Contrastive Autoencoder With Shared Attention for Audio-Visual Correlation Learning.

IEEE Access, 2025

Dialogue-to-Video Retrieval via Multi-Grained Attention Network.

IEEE Access, 2025

EMelodyGen: Emotion-Conditioned Melody Generation in ABC Notation with the Musical Feature Template.

Proceedings of the IEEE International Conference on Multimedia and Expo, ICME 2025 - Workshops, Nantes, France, June 30, 2025

BeatFM: Improving Beat Tracking with Pre-trained Music Foundation Model.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

HingeNet: A Harmonic-Aware Fine-Tuning Approach for Beat Tracking.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

KCE-Unet: A novel music denoising method with KANConv ECA Unet.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Ultra Lightweight Singing Melody Extraction via Combination of Convolution and MLP.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

Technical Report for ActivityNet Challenge 2022 - Temporal Action Localization.

CoRR, 2024

Technical Report for Soccernet 2023 - Dense Video Captioning.

CoRR, 2024

Technical Report for SoccerNet Challenge 2022 - Replay Grounding Task.

CoRR, 2024

Semi-Supervised Self-Learning Enhanced Music Emotion Recognition.

CoRR, 2024

Harmonic Frequency-Separable Transformer for Instrument-Agnostic Music Transcription.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Improving Drum Source Separation with Temporal-Frequency Statistical Descriptors.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

A Scalable Sparse Transformer Model for Singing Melody Extraction.

Proceedings of the IEEE International Conference on Acoustics, 2024

Mertech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model with Multi-Task Finetuning.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Stripe-Transformer: deep stripe feature learning for music source separation.

EURASIP J. Audio Speech Music. Process., December, 2023

The capacity of k-connectivity d-dimensional wireless networks with node failure.

Sci. China Inf. Sci., October, 2023

Multi-scale network with shared cross-attention for audio-visual correlation learning.

Neural Comput. Appl., September, 2023

Melody Generation from Lyrics with Local Interpretability.

ACM Trans. Multim. Comput. Commun. Appl., 2023

Variational Autoencoder with CCA for Audio-Visual Cross-modal Retrieval.

ACM Trans. Multim. Comput. Commun. Appl., 2023

Intelligent Channel Prediction and Power Adaptation in LEO Constellation for 6G.

IEEE Netw., 2023

A neural harmonic-aware network with gated attentive fusion for singing melody extraction.

Neurocomputing, 2023

A Holistic Evaluation of Piano Sound Quality.

CoRR, 2023

WikiMT++ Dataset Card.

CoRR, 2023

MFAE: Masked frame-level autoencoder with hybrid-supervision for low-resource music transcription.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

LC-Beating: An Online System for Beat and Downbeat Tracking using Latency-Controlled Mechanism.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Frame-Level Multi-Label Playing Technique Detection Using Multi-Scale Network and Self-Attention Mechanism.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

A personality-guided affective brain-computer interface for implementation of emotional intelligence in machines.

Frontiers Inf. Technol. Electron. Eng., 2022

SEAL: A Large-scale Video Dataset of Multi-grained Spatio-temporally Action Localization.

CoRR, 2022

Faster-TAD: Towards Temporal Action Detection with Proposal Generation and Classification in a Unified Network.

CoRR, 2022

Melody Generation from Lyrics Using Three Branch Conditional LSTM-GAN.

Proceedings of the MultiMedia Modeling - 28th International Conference, 2022

Singing Voice Detection via Similarity-Based Semi-Supervised Learning.

Xi Chen

Yongwei Gao

Christophe De Vleeschouwer

Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

SoccerNet 2022 Challenges Results.

Alexandre Alahi

Bernard Ghanem

Marc Van Droogenbroeck

Miguel Santos Marques

Proceedings of the MMSports@MM 2022: Proceedings of the 5th International ACM Workshop on Multimedia Content Analysis in Sports, 2022

HPPNet: Modeling the Harmonic Structure and Pitch Invariance in Piano Transcription.

Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Automatic Chinese National Pentatonic Modes Recognition Using Convolutional Neural Network.

Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Playing Technique Detection by Fusing Note Onset Information in Guzheng Performance.

Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Multimodal Music Emotion Recognition with Hierarchical Cross-Modal Attention Network.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

HarmoF0: Logarithmic Scale Dilated Convolution for Pitch Estimation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

A Glance-and-Gaze Network for Respiratory Sound Classification.

Proceedings of the IEEE International Conference on Acoustics, 2022

Hierarchical Graph-Based Neural Network for Singing Melody Extraction.

Shuai Yu

Xi Chen

Proceedings of the IEEE International Conference on Acoustics, 2022

Deepchorus: A Hybrid Model of Multi-Scale Convolution And Self-Attention for Chorus Detection.

Proceedings of the IEEE International Conference on Acoustics, 2022

Tonet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music.

Taylor Berg-Kirkpatrick

Shlomo Dubnov

Proceedings of the IEEE International Conference on Acoustics, 2022

Robust Capacity of Wireless Networks Under Cascading Failures.

Proceedings of the IEEE Global Communications Conference, 2022

MV-TAL: Mulit-view Temporal Action Localization in Naturalistic Driving.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021

HANME: Hierarchical Attention Network for Singing Melody Extraction.

IEEE Signal Process. Lett., 2021

Musical Tempo Estimation Using a Multi-scale Network.

Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021

Singer Identification Using Deep Timbre Feature Learning with KNN-NET.

Proceedings of the IEEE International Conference on Acoustics, 2021

Frequency-Temporal Attention Network for Singing Melody Extraction.

Proceedings of the IEEE International Conference on Acoustics, 2021

An Hrnet-Blstm Model With Two-Stage Training For Singing Melody Extraction.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Computer Audition for Healthcare: Opportunities and Challenges.

Frontiers Digit. Health, 2020

Music Artist Classification with WaveNet Classifier for Raw Waveform Audio Data.

CoRR, 2020

Comparison for Improvements of Singing Voice Detection System Based on Vocal Separation.

CoRR, 2020

Residual Attention Based Network for Automatic Classification of Phonation Modes.

Xiaoheng Sun

Yiliang Jiang

Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

2019

Automatic Audio Chord Recognition With MIDI-Trained Deep Feature and BLSTM-CRF Sequence Decoding Model.

Yiming Wu

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Large-vocabulary Chord Transcription Via Chord Structure Decomposition.

Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019

Vocal Melody Extraction via DNN-based Pitch Estimation and Salience-based Pitch Refinement.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Music Chord Recognition Based on Midi-Trained Deep Feature and BLSTM-CRF Hybird Decoding.

Yiming Wu

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

流行音乐主旋律提取技术综述 (Review on Main Melody Extraction from Pop Music).

计算机科学 (Computer Science), 2017

2015

SIFT-based local spectrogram image descriptor: a novel feature for robust music identification.

EURASIP J. Audio Speech Music. Process., 2015

Towards Solving the Bottleneck of Pitch-based Singing Voice Separation.

Bilei Zhu

Linwei Li

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Latent time-frequency component analysis: A novel pitch-based approach for singing voice separation.

Xiu Zhang

Bilei Zhu

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2013

Multi-Stage Non-Negative Matrix Factorization for Monaural Singing Voice Separation.

IEEE Trans. Speech Audio Process., 2013

Low-order auditory Zernike moment: a novel approach for robust music identification in the compressed domain.

Chuan Xiao

Yaduo Liu

EURASIP J. Adv. Signal Process., 2013

Music content authentication based on beat segmentation and fuzzy classification.

Xiu Zhang

Zhurong Wang

EURASIP J. Audio Speech Music. Process., 2013

2012

A Double-Ranking Strategy for Long-Tail Product Recommendation.

Proceedings of the 2012 IEEE/WIC/ACM International Conferences on Web Intelligence, 2012

On the music content authentication.

Bilei Zhu

Zhurong Wang

Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

2011

Towards content-based audio fragment authentication.

Yue Yin

Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

2010

Robust music identification based on low-order zernike moment in the compressed domain.

Yaduo Liu

Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Robust audio identification for MP3 popular music.

Yaduo Liu

Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

A novel audio fingerprinting method robust to time scale modification and pitch shifting.

Proceedings of the 18th International Conference on Multimedia 2010, 2010

Robust hashing for music copyright protection by combining beat segmentation and chroma.

Proceedings of the 18th International Conference on Multimedia 2010, 2010

2009

A Robust Mesh Watermarking Scheme Based on PCA.

Bin Yang

Proceedings of the Fifth International Conference on Image and Graphics, 2009

2008

Audio Quality-Based Authentication Using Wavelet Packet Decomposition and Best Tree Selection.

Fang Chen

Proceedings of the 4th International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2008), 2008

2006

Localized audio watermarking technique robust against time-scale modification.

Peizhong Lu

IEEE Trans. Multim., 2006

2004

Multilingual Collection Retrieving Via Ontology Alignment.

Proceedings of the Digital Libraries: International Collaboration and Cross-Fertilization, 2004

2003

An Audio Watermarking Technique That Is Robust Against Random Cropping.

Comput. Music. J., 2003

Audio Watermarking Based on Statistical Feature in Wavelet Domain.

Proceedings of the Twelfth International World Wide Web Conference - Posters, 2003

Content Based Localized Robust Audio Watermarking.

Proceedings of the Interactive Multimedia on Next Generation Networks, 2003

Audio Watermarking Based on Music Content Analysis: Robust against Time Scale Modification.

Proceedings of the Digital Watermarking, Second International Workshop, 2003

A Novel Feature-Based Robust Audio Watermarking for Copyright Protection.

Proceedings of the 2003 International Symposium on Information Technology (ITCC 2003), 2003

Multi-channel Data Hiding Scheme for Color Images.

Proceedings of the 2003 International Symposium on Information Technology (ITCC 2003), 2003

An Optimized Multi-bits Blind Watermarking Scheme.

Proceedings of the Information and Communications Security, 5th International Conference, 2003

Robust Spatial Data Hiding for Color Images.

Proceedings of the Communications and Multimedia Security, 2003

2000

New approaches without postprocessing to FIR system identification using selected order cumulants.

Wan-Chi Siu

IEEE Trans. Signal Process., 2000

Speech enhancement using the constrained-optimization technique.

Wan-Chi Siu

IEEE Signal Process. Lett., 2000

1999

Recovery of single source signal from noisy and reverberant environments using second-order statistics.
