Kun Yan

Orcid: 0000-0001-8290-5169

Affiliations:
  • StepFun, Shanghai, China


According to our database, Kun Yan authored at least 18 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Dynamic Sparsity in Large-Scale Video DiT Training.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition.
CoRR, December, 2025

Resolving Ambiguity in Gaze-Facilitated Visual Assistant Interaction Paradigm.
CoRR, September, 2025

Qwen-Image Technical Report.
CoRR, August, 2025

DSV: Exploiting Dynamic Sparsity to Accelerate Large-Scale Video DiT Training.
CoRR, February, 2025

Taming Teacher Forcing for Masked Autoregressive Video Generation.
CoRR, January, 2025

Taming Teacher Forcing for Masked Autoregressive Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., May, 2024

Voila-A: Aligning Vision-Language Models with User's Gaze Attention.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

HORIZON: High-Resolution Semantically Controlled Panorama Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
GroundNLQ @ Ego4D Natural Language Queries Challenge 2023.
CoRR, 2023

KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization.
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, 2023

CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022.
CoRR, 2022

HORIZON: A High-Resolution Panorama Synthesis Framework.
CoRR, 2022

Learning Temporal Video Procedure Segmentation from an Automatically Collected Large Dataset.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Trace Controlled Text to Image Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Control Image Captioning Spatially and Temporally.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

