Haonan Cheng

Orcid: 0000-0003-3407-4318

According to our database, Haonan Cheng authored at least 53 papers between 2017 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

EnvTriCascade: An Environment-Aware Tri-Stage Cascaded Framework for ESDD2 2026 Challenge.

CoRR, May, 2026

AT-ADD: All-Type Audio Deepfake Detection Challenge Evaluation Plan.

CoRR, April, 2026

Implement Referring Expression Comprehension by Extending Auto-focus Lens to Locked Vision Model.

ACM Trans. Multim. Comput. Commun. Appl., February, 2026

Towards Explicit Acoustic Evidence Perception in Audio LLMs for Speech Deepfake Detection.

CoRR, January, 2026

Interpretable All-Type Audio Deepfake Detection with Audio LLMs via Frequency-Time Reinforcement Learning.

CoRR, January, 2026

Anchor-Based Multimodal Verification: A Dynamic Query Framework for Fake News Forensics in Short Videos.

IEEE Trans. Inf. Forensics Secur., 2026

Lightweight music recommendation via multi-physiological feature fusion.

Inf. Fusion, 2026

Knowledge-enhanced Chinese multimodal hate speech detection.

Expert Syst. Appl., 2026

Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

EnvSSLAM-FFN: Lightweight Layer-Fused System for ESDD 2026 Challenge.

CoRR, December, 2025

Realistic garment texture transfer via 3D reconstruction and rendering.

Keizo Shinomori, Tanner DeLawyer

Signal Image Video Process., November, 2025

Neural Codec Source Tracing: Toward Comprehensive Attribution in Open-Set Condition.

CoRR, January, 2025

CLFormer: a cross-lingual transformer framework for temporal forgery localization.

Vis. Intell., 2025

Noise-Informed Diffusion-Generated Image Detection With Anomaly Attention.

IEEE Trans. Inf. Forensics Secur., 2025

DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based image captioning.

Pattern Recognit., 2025

Visual primitives as words: Alignment and interaction for compositional zero-shot learning.

Pattern Recognit., 2025

Generalization enhancement strategy based on ensemble learning for open domain image manipulation detection.

J. Vis. Commun. Image Represent., 2025

Exploring news intent and its application: A theory-driven approach.

Inf. Process. Manag., 2025

FG-Midiformer: A Symbolic Music Understanding Model towards Fine-Grained Learning of Multi-Attributes.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Pop-Diffuseq: Controllable Symbolic Music Multi-Instrument Infilling and Accompaniment Generation with Long-Axis Attention.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

FSD 2.0: Improved Fake Song Dataset for Unknown-domain Deepfake Detection.

Proceedings of the IEEE International Conference on Multimedia and Expo, ICME 2025 - Workshops, Nantes, France, June 30, 2025

Look Around Before Locating: Considering Content and Structure Information for Visual Grounding.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

DiffuseRoll: multi-track multi-attribute music generation based on diffusion model.

Multim. Syst., February, 2024

Domain Generalization via Aggregation and Separation for Audio Deepfake Detection.

IEEE Trans. Inf. Forensics Secur., 2024

MusicECAN: An Automatic Denoising Network for Music Recordings With Efficient Channel Attention.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Mapping Human Pressure for Nature Conservation: A Review.

Remote. Sens., 2024

Visual-guided scene-aware audio generation method based on hierarchical feature codec and rendering decision.

Displays, 2024

Artifact feature purification for cross-domain detection of AI-generated images.

Comput. Vis. Image Underst., 2024

Temporal Variability and Multi-Viewed Self-Supervised Representations to Tackle the ASVspoof5 Deepfake Challenge.

CoRR, 2024

The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio.

CoRR, 2024

EnvFake: An Initial Environmental-Fake Audio Dataset for Scene-Consistency Detection.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Generalized Source Tracing: Detecting Novel Audio Deepfake Algorithm with Real Emphasis and Fake Dispersion Strategy.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

FSD: An Initial Chinese Dataset for Fake Song Detection.

Proceedings of the IEEE International Conference on Acoustics, 2024

An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection.

Proceedings of the IEEE International Conference on Acoustics, 2024

Binauralmusic: A Diverse Dataset for Improving Cross-Modal Binaural Audio Generation.

Proceedings of the IEEE International Conference on Acoustics, 2024

DNIT: Enhancing Day-Night Image-to-Image Translation through Fine-Grained Feature Handling (Student Abstract).

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

SemDM: Task-oriented masking strategy for self-supervised visual learning.

Displays, September, 2023

PQG-A2SA: Performance Quantification Guided Audio-to-Score Alignment for Orchestral Music.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Behaviourally-based Synthesis of Scene-aware Footstep Sound.

Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, 2023

Lightweight Scene-aware Rain Sound Simulation for Interactive Virtual Environments.

Proceedings of the IEEE Conference Virtual Reality and 3D User Interfaces, 2023

RD-FGFS: A Rule-Data Hybrid Framework for Fine-Grained Footstep Sound Synthesis from Visual Guidance.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Learning A Self-Supervised Domain-Invariant Feature Representation for Generalized Audio Deepfake Detection.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MABC-Net: Multimodal Mixed Attentional Network with Balanced Class for Temporal Forgery Localization.

Proceedings of the Digital Multimedia Communications, 2023

CACEE: Computational Aesthetic Classification of Expressive Effects Based on Emotional Consistency.

Proceedings of the 4th International Workshop on Human-centric Multimedia Analysis, 2023

Single Domain Generalization for Audio Deepfake Detection.

Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

2022

Towards an End-to-End Visual-to-Raw-Audio Generation With GAN.

IEEE Trans. Circuits Syst. Video Technol., 2022

Emotional Acceptance Measure (EAM): An Objective Evaluation Method Towards Information Communication Effect.

Proceedings of the IEEE International Conference on Multimedia and Expo Workshops, 2022

Global-Local Similarity Function for Automatic Playlist Generation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Persong: Multi-Modality Driven Music Recommendation System.

Proceedings of the IEEE International Conference on Multimedia and Expo Workshops, 2022

2019

Physically-based statistical simulation of rain sound.

ACM Trans. Graph., 2019

Liquid-solid interaction sound synthesis.

Graph. Model., 2019

Haptic Force Guided Sound Synthesis in Multisensory Virtual Reality (VR) Simulation for Rigid-Fluid Interaction.

Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces, 2019

2017

Efficient sound synthesis for natural scenes.

Proceedings of the 2017 IEEE Virtual Reality, 2017
