Zikai Song

ORCID: 0009-0006-6651-2027

According to our database, Zikai Song authored at least 52 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
CurEvo: Curriculum-Guided Self-Evolution for Video Understanding.
CoRR, April, 2026

GateMOT: Q-Gated Attention for Dense Object Tracking.
CoRR, April, 2026

OmniTrend: Content-Context Modeling for Scalable Social Popularity Prediction.
CoRR, April, 2026

HotComment: A Benchmark for Evaluating Popularity of Online Comments.
CoRR, April, 2026

Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction.
CoRR, April, 2026

Hypergraph-State Collaborative Reasoning for Multi-Object Tracking.
CoRR, April, 2026

IntervenSim: Intervention-Aware Social Network Simulation for Opinion Dynamics.
CoRR, April, 2026

Coupling Macro Dynamics and Micro States for Long-Horizon Social Simulation.
CoRR, April, 2026

Large Language Model as Token Compressor and Decompressor.
CoRR, March, 2026

Logical Phase Transitions: Understanding Collapse in LLM Logical Reasoning.
CoRR, January, 2026

ELAI-SGCN: An explainable lightweight adaptive information-perceiving spiking graph convolutional network for EEG-based emotion recognition.
Neural Networks, 2026

2025
Efficient Cipher-Image Coding via Compressive Sensing and Auxiliary-Information-Guided Mapping for Secure Cloud Storage in Consumer Electronics.
IEEE Trans. Consumer Electron., November, 2025

TimeJudge: empowering video-LLMs as zero-shot judges for temporal consistency in video captions.
Frontiers Inf. Technol. Electron. Eng., November, 2025

From Ambiguity to Verdict: A Semiotic-Grounded Multi-Perspective Agent for LLM Logical Reasoning.
CoRR, September, 2025

HyperFusion: Hierarchical Multimodal Ensemble Learning for Social Media Popularity Prediction.
CoRR, July, 2025

LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing.
CoRR, July, 2025

GA-S³: Comprehensive Social Network Simulation with Group Agents.
CoRR, June, 2025

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps.
CoRR, May, 2025

EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation.
IEEE Multim., 2025

Exploiting Appearance Re-Emergence for Robust Visual Tracking.
Proceedings of the MMAsia '25 Workshops: Proceedings of the 7th ACM International Conference on Multimedia in Asia, 2025

MVP: Winning Solution to SMP Challenge 2025 Video Track.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

MCA-RG: Enhancing LLMs with Medical Concept Alignment for Radiology Report Generation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2025, 2025

Cross-Modality Masked Learning for Survival Prediction in ICI Treated NSCLC Patients.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2025, 2025

Optimized View and Geometry Distillation from Multi-view Diffuser.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

CA-Diff: Collaborative Anatomy Diffusion for Brain Tissue Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

DEEM: Diffusion models serve as the eyes of large language models for image perception.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Ref-GS: Directional Factorization for 2D Gaussian Splatting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

GA-S³: Comprehensive Social Network Simulation with Group Agents.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Temporal Coherent Object Flow for Multi-Object Tracking.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking.
CoRR, 2024

Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model.
CoRR, 2024

DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception.
CoRR, 2024

EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation.
CoRR, 2024

Coupled Mamba: Enhanced Multimodal Fusion with Coupled State Space Model.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Autogenic Language Embedding for Coherent Point Tracking.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Study Selectively: An Adaptive Knowledge Distillation based on a Voting Network for Heart Sound Classification.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Agnostic Feature Compression with Semantic Guided Channel Importance Analysis.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Weight Light, Hear Right: Heart Sound Classification with a Low-Complexity Model.
Proceedings of the 32nd European Signal Processing Conference, 2024

Progressive Text-to-Image Diffusion with Soft Latent Direction.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

DiffusionTrack: Diffusion Model for Multi-Object Tracking.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Optimized View and Geometry Distillation from Multi-view Diffuser.
CoRR, 2023

Fine-grained Appearance Transfer with Diffusion Models.
CoRR, 2023

Cutting Weights of Deep Learning Models for Heart Sound Classification: Introducing a Knowledge Distillation Approach.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

Compact Transformer Tracker with Correlative Masked Modeling.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Transformer Tracking with Cyclic Shifting Window Attention.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Distractor-Aware Tracker with a Domain-Special Optimized Benchmark for Soccer Player Tracking.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

2020
SSET: a dataset for shot segmentation, event detection, player tracking in soccer videos.
Multim. Tools Appl., 2020

Fine-Grain Level Sports Video Search Engine.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

2018
Comprehensive Dataset of Broadcast Soccer Videos.
Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018
