Shiyi Lan

According to our database, Shiyi Lan authored at least 41 papers between 2016 and 2025.

Bibliography

2025
Play to Generalize: Learning to Reason Through Game Play.
CoRR, June, 2025

Generalized Trajectory Scoring for End-to-end Multimodal Planning.
CoRR, June, 2025

DriveSuprim: Towards Precise Trajectory Selection for End-to-End Planning.
CoRR, June, 2025

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control.
CoRR, March, 2025

Hydra-MDP++: Advancing End-to-End Driving via Expert-Guided Hydra-Distillation.
CoRR, March, 2025

Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training.
CoRR, March, 2025

Centaur: Robust End-to-End Autonomous Driving with Test-Time Training.
CoRR, March, 2025

Enhancing Autonomous Driving Safety with Collision Scenario Integration.
CoRR, March, 2025

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models.
CoRR, January, 2025

Cosmos World Foundation Model Platform for Physical AI.
CoRR, January, 2025

OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

MDP: Multidimensional Vision Model Pruning with Latency Constraint.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
StreamChat: Chatting with Streaming Video.
CoRR, 2024

Exploring Camera Encoder Designs for Autonomous Driving Perception.
CoRR, 2024

Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with Latency Constraint.
CoRR, 2024

Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation.
CoRR, 2024

OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning.
CoRR, 2024

EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks.
CoRR, 2024

SF3D: SlowFast Temporal 3D Object Detection.
Proceedings of the IEEE Intelligent Vehicles Symposium, 2024

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties.
Proceedings of the Computer Vision - ECCV 2024, 2024

SegIC: Unleashing the Emergent Correspondence for In-Context Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

What is Point Supervision Worth in Video Instance Segmentation?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties.
CoRR, 2023

FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation.
CoRR, 2023

Fully Attentional Networks with Self-emerging Token Labeling.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Vision Transformers are Good Mask Auto-Labelers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Object Detection and Instance Segmentation for Real-world Applications.
PhD thesis, 2022

1st Place Solution of The Robust Vision Challenge (RVC) 2022 Semantic Segmentation Track.
CoRR, 2022

M3DETR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

AdaViT: Adaptive Vision Transformers for Efficient Image Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020
InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling.
Proceedings of the Computer Vision - ECCV 2020, 2020

SaccadeNet: A Fast and Accurate Object Detector.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2017
FastMask: Segment Multi-scale Object Candidates in One Shot.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
FastMask: Segment Object Multi-scale Candidates in One Shot.
CoRR, 2016

