Yichen Zhu

Orcid: 0000-0001-5126-838X

Affiliations:

Midea Group, AI Lab, Shanghai, Guangdong, China
University of Toronto, Department of Statistical Sciences, Canada

According to our database, Yichen Zhu authored at least 54 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training.

CoRR, April, 2026

PointVLA: Injecting the 3D World Into Vision-Language-Action Models.

IEEE Robotics Autom. Lett., March, 2026

2025

HumanoidExo: Scalable Whole-Body Humanoid Manipulation via Wearable Exoskeleton.

CoRR, October, 2025

ActiveUMI: Robotic Manipulation with Active Perception from Robot-Free Human Demonstrations.

CoRR, October, 2025

dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought.

CoRR, September, 2025

ChatVLA-2: Vision-Language-Action Model with Open-World Embodied Reasoning from Pretrained Knowledge.

CoRR, May, 2025

WorldEval: World Model as Real-World Robot Policies Evaluator.

CoRR, May, 2025

TinyVLA: Toward Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation.

IEEE Robotics Autom. Lett., April, 2025

PointVLA: Injecting the 3D World into Vision-Language-Action Models.

CoRR, March, 2025

ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration.

CoRR, February, 2025

ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model.

CoRR, February, 2025

DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control.

CoRR, February, 2025

LaTP: LiDAR-aided multimodal token pruning for efficient trajectory prediction of autonomous driving.

Neural Networks, 2025

Let Me Show You: Learning by Retrieving from Egocentric Video for Robotic Manipulation.

Yichen Zhu

Feifei Feng

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation.

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation.

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

DiffusionVLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

CoA-VLA: Improving Vision-Language-Action Models via Visual-Textual Chain-of-Affordance.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

A Comprehensive Overhaul of Multimodal Assistant with Small Language Models.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Improving Vision-Language-Action Models via Chain-of-Affordance.

CoRR, 2024

Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression.

CoRR, 2024

Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation.

CoRR, 2024

TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation.

CoRR, 2024

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

CoRR, 2024

Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models.

CoRR, 2024

Visual Robotic Manipulation with Depth-Aware Pretraining.

CoRR, 2024

LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model.

CoRR, 2024

Visual Robotic Manipulation with Depth-Aware Pretraining.

Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2024

Any2Policy: Learning Visuomotor Policy with Any-Modality.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Safety of Multimodal Large Language Models on Images and Text.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Language-Conditioned Robotic Manipulation with Fast and Slow Thinking.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Object-Centric Instruction Augmentation for Robotic Manipulation.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Retrieval-Augmented Embodied Agents.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Exploring Gradient Explosion in Generative Adversarial Imitation Learning: A Probabilistic Perspective.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

EPSD: Early Pruning with Self-Distillation for Efficient Model Compression.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

LogSummary: Unstructured Log Summarization for Software Systems.

IEEE Trans. Netw. Serv. Manag., September, 2023

Query-Relevant Images Jailbreak Large Multi-Modal Models.

CoRR, 2023

Biglog: Unsupervised Large-scale Pre-training for a Unified Log Representation.

Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

ScaleKD: Distilling Scale-Aware Knowledge in Small Object Detector.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

LogStamp: Automatic Online Log Parsing Based on Sequence Labelling.

SIGMETRICS Perform. Evaluation Rev., 2022

BNNAS++: Towards Unbiased Neural Architecture Search With Batch Normalization.

Yichen Zhu

Xiaowei Fu

IEEE Access, 2022

Teach Less, Learn More: On the Undistillable Classes in Knowledge Distillation.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Label-Guided Auxiliary Training Improves 3D Object Detector.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

UniLog: Deploy One Model and Specialize it for All Log Analysis Tasks.

CoRR, 2021

Make A Long Image Short: Adaptive Token Length for Vision Transformers.

CoRR, 2021

Training BatchNorm Only in Neural Architecture Search and Beyond.

CoRR, 2021

Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher.

Yichen Zhu

Yi Wang

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

Summarizing Unstructured Logs in Online Services.

CoRR, 2020

LogParse: Making Log Parsing Adaptive through Word Classification.

Proceedings of the 29th International Conference on Computer Communications and Networks, 2020

2019

LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

2018

A Multi-scale Pyramid of Fully Convolutional Networks for Automatic Cell Detection.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2018
