Yunze Man

ORCID: 0000-0002-2357-2883

According to our database, Yunze Man authored at least 25 papers between 2018 and 2026.

Bibliography

2026
Capturing Visual Environment Structure Correlates with Control Performance.
CoRR, February, 2026

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning.
CoRR, January, 2026

2025
PPTArena: A Benchmark for Agentic PowerPoint Editing.
CoRR, December, 2025

LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight.
CoRR, November, 2025

OSGym: Super-Scalable Distributed Data Engine for Generalizable Computer Agents.
CoRR, November, 2025

AgMMU: A Comprehensive Agricultural Multimodal Understanding and Reasoning Benchmark.
CoRR, April, 2025

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Floating No More: Object-Ground Reconstruction from a Single Image.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
PaintScene4D: Consistent 4D Scene Generation from Text Prompts.
CoRR, 2024

SceneCraft: Layout-Guided 3D Scene Generation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Frozen Transformers in Language Models Are Effective Visual Encoder Layers.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Situational Awareness Matters in 3D Vision Language Reasoning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception.
IROS, 2023

BEV-Guided Multi-Modality Fusion for Driving Perception.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Fast Graph Neural Tangent Kernel via Kronecker Sketching.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Multi-Echo LiDAR for 3D Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Multi-Modality Task Cascade for 3D Object Detection.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Graph Neural Networks for 3D Multi-Object Tracking.
CoRR, 2020

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with Multi-Feature Learning.
CoRR, 2020

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking With 2D-3D Multi-Feature Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Deep Q Learning Driven CT Pancreas Segmentation With Geometry-Aware U-Net.
IEEE Trans. Medical Imaging, 2019

GroundNet: Monocular Ground Plane Normal Estimation with Geometric Consistency.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

2018
GroundNet: Segmentation-Aware Monocular Ground Plane Estimation with Geometric Consistency.
CoRR, 2018

