Xiaokun Feng

According to our database, Xiaokun Feng authored at least 22 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation.
CoRR, April, 2026

Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models.
CoRR, March, 2026

Latent Temporal Discrepancy as Motion Prior: A Loss-Weighting Strategy for Dynamic Fidelity in T2V.
CoRR, January, 2026

ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
S²-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models.
CoRR, August, 2025

NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation Models.
CoRR, July, 2025

Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

VMBench: A Benchmark for Perception-Aligned Video Motion Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024
How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking.
CoRR, 2024

DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM.
CoRR, 2024

Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark.
CoRR, 2024

Beyond Accuracy: Tracking more like Human via Visual Search.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
See Your Heart: Psychological states Interpretation through Visual Creations.
CoRR, 2023

A Hierarchical Theme Recognition Model for Sandplay Therapy.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

A Multi-modal Global Instance Tracking Benchmark (MGIT): Better Locating Target in Complex Spatio-temporal and Causal Relationship.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

