Yutong Ban

Orcid: 0000-0001-5396-9251

According to our database, Yutong Ban authored at least 55 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

DSSP: Diffusion State Space Policy with Full-History Encoding.

,

,

,

,

,

,

,

CoRR, May, 2026

Fine-Grained Action Segmentation for Renorrhaphy in Robot-Assisted Partial Nephrectomy.

,

,

,

,

,

,

,

,

,

CoRR, April, 2026

SutureAgent: Learning Surgical Trajectories via Goal-conditioned Offline RL in Pixel Space.

,

,

,

,

,

,

,

,

,

,

CoRR, March, 2026

SCISSR: Scribble-Conditioned Interactive Surgical Segmentation and Refinement.

,

,

,

,

,

CoRR, March, 2026

SurgΣ: A Spectrum of Large-Scale Multimodal Data and Foundation Models for Surgical Intelligence.

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

CoRR, March, 2026

Generalized Recognition of Basic Surgical Actions Enables Skill Assessment and Vision-Language-Model-based Surgical Planning.

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

Russell H. Taylor

,

,

,

CoRR, March, 2026

Surg-R1: A Hierarchical Reasoning Foundation Model for Scalable and Interpretable Surgical Decision Support with Multi-Center Clinical Validation.

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

Ozanan R. Meireles

,

,

,

,

,

,

CoRR, March, 2026

SSP: Safety-guaranteed Surgical Policy via Joint Optimization of Behavioral and Spatial Constraints.

,

,

,

Kantaphat Leelakunwet

,

,

,

,

CoRR, March, 2026

2025

OMP: One-step Meanflow Policy with Directional Alignment.

,

,

,

,

,

CoRR, December, 2025

See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors.

,

,

,

CoRR, December, 2025

OSGym: Super-Scalable Distributed Data Engine for Generalizable Computer Agents.

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

,

CoRR, November, 2025

Generalizable Coarse-to-Fine Robot Manipulation via Language-Aligned 3D Keypoints.

,

,

,

,

,

,

CoRR, September, 2025

The SAGES Critical View of Safety Challenge: A Global Benchmark for AI-Assisted Surgical Quality Assessment.

,

Jennifer A. Eckhoff

,

,

,

Jean-Paul Mazellier

,

,

,

2024 CVS Challenge Consortium

,

,

Filippo Filicori

,

,

Pietro Mascagni

,

Daniel A. Hashimoto

,

,

Ozanan R. Meireles

,

CoRR, September, 2025

Holistic Surgical Phase Recognition with Hierarchical Input Dependent State Space Models.

,

Tsun-Hsuan Wang

,

Mathias Lechner

,

Ramin M. Hasani

,

Jennifer A. Eckhoff

,

,

Ozanan R. Meireles

,

,

,

CoRR, June, 2025

GenHOI: Generalizing Text-driven 4D Human-Object Interaction Synthesis for Unseen Objects.

,

,

,

,

CoRR, June, 2025

SurgVLM: A Large Vision-Language Model and Systematic Evaluation Benchmark for Surgical Intelligence.

,

,

,

,

,

,

,

,

,

,

,

,

,

,

CoRR, June, 2025

Diffusion Stabilizer Policy for Automated Surgical Robot Manipulations.

,

,

,

,

CoRR, March, 2025

State-novelty guided action persistence in deep reinforcement learning.

,

,

Mach. Learn., January, 2025

ASAP: Learning Generalizable Online Bin Packing via Adaptive Selection After Pruning.

,

,

CoRR, January, 2025

Is Segment Anything Model 2 All You Need for Surgery Video Segmentation? A Systematic Evaluation.

,

,

,

,

,

,

,

,

,

,

,

,

CoRR, January, 2025

Understanding and Reducing the Class-Dependent Effects of Data Augmentation with A Two-Player Game Approach.

,

,

Trans. Mach. Learn. Res., 2025

Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning.

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Temporal Propagation of Asymmetric Feature Pyramid for Surgical Scene Segmentation.

,

Proceedings of the Collaborative Intelligence and Autonomy in Image-Guided Surgery, 2025

Surgical Scene Segmentation by Transformer with Asymmetric Feature Enhancement.

,

Proceedings of the 22nd IEEE International Symposium on Biomedical Imaging, 2025

Tracking-Aware Deformation Field Estimation for Non-rigid 3D Reconstruction in Robotic Surgeries.

,

,

,

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

Hypergraph-Transformer (HGT) for Interaction Event Prediction in Laparoscopic and Robotic Surgery.

,

,

Jennifer A. Eckhoff

,

Ozanan R. Meireles

,

,

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

2024

Concept Graph Neural Networks for Surgical Video Understanding.

,

Jennifer A. Eckhoff

,

,

Daniel A. Hashimoto

,

Ozanan R. Meireles

,

,

IEEE Trans. Medical Imaging, January, 2024

Enhancing Class Fairness in Classification with A Two-Player Game Approach.

,

,

CoRR, 2024

Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery.

,

,

Jennifer A. Eckhoff

,

Ozanan R. Meireles

,

,

CoRR, 2024

Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models.

Tsun-Hsuan Wang

,

,

,

,

Alexander Amini

,

,

,

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer.

,

,

,

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

TransCenter: Transformers With Dense Representations for Multiple-Object Tracking.

,

,

Guillaume Delorme

,

,

,

Xavier Alameda-Pineda

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Infrastructure-based End-to-End Learning and Prevention of Driver Failure.

,

,

Mathias Lechner

,

,

Ramin M. Hasani

,

,

Proceedings of the IEEE International Conference on Robotics and Automation, 2023

On the Forward Invariance of Neural ODEs.

,

Tsun-Hsuan Wang

,

Ramin M. Hasani

,

Mathias Lechner

,

,

,

Proceedings of the International Conference on Machine Learning, 2023

2022

SUPR-GAN: SUrgical PRediction GAN for Event Anticipation in Laparoscopic and Robotic Surgery.

,

,

Jennifer A. Eckhoff

,

,

Daniel A. Hashimoto

,

,

,

Ozanan R. Meireles

,

IEEE Robotics Autom. Lett., 2022

Enhancing direct-path relative transfer function using deep neural network for robust sound source localization.

,

,

,

,

CAAI Trans. Intell. Technol., 2022

A Deep Concept Graph Network for Interaction-Aware Trajectory Prediction.

,

,

,

Igor Gilitschenski

,

Ozanan R. Meireles

,

,

Proceedings of the 2022 International Conference on Robotics and Automation, 2022

2021

Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers.

,

Xavier Alameda-Pineda

,

,

IEEE Trans. Pattern Anal. Mach. Intell., 2021

SUrgical PRediction GAN for Events Anticipation.

,

,

,

Daniel A. Hashimoto

,

,

,

Ozanan R. Meireles

,

CoRR, 2021

TransCenter: Transformers with Dense Queries for Multiple-Object Tracking.

,

,

Guillaume Delorme

,

,

,

Xavier Alameda-Pineda

CoRR, 2021

Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows.

,

,

,

Daniel A. Hashimoto

,

,

,

Ozanan R. Meireles

,

Proceedings of the IEEE International Conference on Robotics and Automation, 2021

2020

Aggregating Long-Term Context for Learning Surgical Workflows.

,

,

,

Daniel A. Hashimoto

,

,

,

Ozanan R. Meireles

,

CoRR, 2020

ODANet: Online Deep Appearance Network for Identity-Consistent Multi-person Tracking.

Guillaume Delorme

,

,

Guillaume Sarrazin

,

Xavier Alameda-Pineda

Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020

How to Train Your Deep Multi-Object Tracker.

,

,

,

,

Laura Leal-Taixé

,

Xavier Alameda-Pineda

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Audio-Visual Multiple-Speaker Tracking for Robot Perception. (Suivi multi-locuteurs avec des informations audio-visuelles pour la perception des robots).

PhD thesis, 2019

Tracking Multiple Audio Sources With the von Mises Distribution and Variational EM.

,

Xavier Alameda-Pineda

,

Christine Evers

,

IEEE Signal Process. Lett., 2019

Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments.

,

,

,

Xavier Alameda-Pineda

,

IEEE J. Sel. Top. Signal Process., 2019

DeepMOT: A Differentiable Framework for Training Multiple Object Trackers.

,

,

Xavier Alameda-Pineda

,

CoRR, 2019

Audio-Visual Variational Fusion for Multi-Person Tracking with Robots.

Xavier Alameda-Pineda

,

,

,

Guillaume Delorme

,

,

,

,

Bastien Mourgue

,

Guillaume Sarrazin

Proceedings of the 27th ACM International Conference on Multimedia, 2019

2018

A cascaded multiple-speaker localization and tracking system.

,

,

,

Xavier Alameda-Pineda

,

CoRR, 2018

A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues.

,

,

,

,

CoRR, 2018

Accounting for Room Acoustics in Audio-Visual Multi-Speaker Tracking.

,

,

Xavier Alameda-Pineda

,

,

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Tracking a varying number of people with a visually-controlled robotic head.

,

Xavier Alameda-Pineda

,

,

,

Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017

Exploiting the Complementarity of Audio and Visual Data in Multi-speaker Tracking.

,

,

Xavier Alameda-Pineda

,

Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

2016

Tracking Multiple Persons Based on a Variational Bayesian Model.

,

,

Xavier Alameda-Pineda

,

Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016
