Kun Yuan

ORCID: 0000-0002-6030-8862

Affiliations:
  • Technical University of Munich, Center for Machine Learning, Munich, Germany
  • University of Strasbourg, CNRS, INSERM, ICube, UMR7357, Strasbourg, France


According to our database, Kun Yuan authored at least 22 papers between 2020 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
EndoChat: Grounded multimodal large language model for endoscopic surgery.
Medical Image Anal., 2026

2025
Multi-modal Representations for Fine-grained Multi-label Critical View of Safety Recognition.
CoRR, July, 2025

SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting.
CoRR, June, 2025

Recognizing Surgical Phases Anywhere: Few-Shot Test-time Adaptation and Task-graph Guided Refinement.
CoRR, June, 2025

SurgVidLM: Towards Multi-grained Surgical Video Understanding with Large Language Model.
CoRR, June, 2025

Text-driven adaptation of foundation models for few-shot surgical workflow analysis.
Int. J. Comput. Assist. Radiol. Surg., June, 2025

EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy.
CoRR, May, 2025

ORQA: A Benchmark and Foundation Model for Holistic Operating Room Modeling.
CoRR, May, 2025

Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery.
CoRR, March, 2025

Rethinking data imbalance in class incremental surgical instrument segmentation.
Medical Image Anal., 2025

Learning multi-modal representations by watching hundreds of surgical video lectures.
Medical Image Anal., 2025

MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Advancing surgical VQA with scene graph knowledge.
Int. J. Comput. Assist. Radiol. Surg., July, 2024

OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining.
CoRR, 2024

Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

HecVL: Hierarchical Video-Language Pretraining for Zero-Shot Surgical Phase Recognition.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

2023
CholecTriplet2022: Show me a tool and tell me the triplet - An endoscopic vision challenge for surgical action triplet detection.
Medical Image Anal., October, 2023

Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures.
CoRR, 2023

2020
An Efficient Hybrid Model for Kidney Tumor Segmentation in CT Images.
Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020

Towards Content-Independent Multi-Reference Super-Resolution: Adaptive Pattern Matching and Feature Aggregation.
Proceedings of the Computer Vision - ECCV 2020, 2020

