Zhenfei Yin

Orcid: 0000-0002-8666-1103

According to our database, Zhenfei Yin authored at least 49 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy.
Int. J. Comput. Vis., August, 2025

aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists.
CoRR, August, 2025

VeriGUI: Verifiable Long-Chain GUI Dataset.
CoRR, August, 2025

When Autonomy Goes Rogue: Preparing for Risks of Multi-Agent Collusion in Social Systems.
CoRR, July, 2025

BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset.
CoRR, July, 2025

Position: Intelligent Science Laboratory Requires the Integration of Cognitive and Embodied AI.
CoRR, June, 2025

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning.
CoRR, June, 2025

A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis.
CoRR, May, 2025

LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents.
CoRR, May, 2025

X-MAS: Towards Building Multi-Agent Systems with Heterogeneous LLMs.
CoRR, May, 2025

MASLab: A Unified and Comprehensive Codebase for LLM-based Multi-Agent Systems.
CoRR, May, 2025

CompBench: Benchmarking Complex Instruction-guided Image Editing.
CoRR, May, 2025

AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research.
CoRR, May, 2025

Towards Physically Plausible Video Generation via VLM Planning.
CoRR, March, 2025

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints.
CoRR, March, 2025

MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems.
CoRR, March, 2025

ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks.
CoRR, March, 2025

Robust face anti-spoofing with Dual Probabilistic Modeling.
Pattern Recognit., 2025

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens.
CoRR, 2024

Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review.
CoRR, 2024

OASIS: Open Agent Social Interaction Simulations with One Million Agents.
CoRR, 2024

WorldSimBench: Towards Video Generation Models as World Simulators.
CoRR, 2024

Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation.
CoRR, 2024

GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing.
CoRR, 2024

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model.
CoRR, 2024

RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents.
CoRR, 2024

Assessment of Multimodal Large Language Models in Alignment with Human Values.
CoRR, 2024

MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control.
CoRR, 2024

Uni3D-LLM: Unifying Point Cloud Perception, Generation and Editing with Large Language Models.
CoRR, 2024

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities.
CoRR, 2024

3D Point Cloud Pre-Training with Knowledge Distilled from 2D Images.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Depicting Beyond Scores: Advancing Image Quality Assessment Through Multi-modal Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
ChEF: A Comprehensive Evaluation Framework for Standardized Assessment of Multimodal Large Language Models.
CoRR, 2023

Octavius: Mitigating Task Interference in MLLMs via MoE.
CoRR, 2023

Latent Distribution Adjusting for Face Anti-Spoofing.
CoRR, 2023

LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
3D Point Cloud Pre-training with Knowledge Distillation from 2D Images.
CoRR, 2022

Benchmarking Omni-Vision Representation Through the Lens of Visual Realms.
Proceedings of the Computer Vision - ECCV 2022, 2022

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
One to Transfer All: A Universal Transfer Framework for Vision Foundation Model with Few Data.
CoRR, 2021

INTERN: A New Learning Paradigm Towards General Vision.
CoRR, 2021

Few-Shot Domain Expansion for Face Anti-Spoofing.
CoRR, 2021

CelebA-Spoof Challenge 2020 on Face Anti-Spoofing: Methods and Results.
CoRR, 2021

2020
CelebA-Spoof: Large-Scale Face Anti-spoofing Dataset with Rich Annotations.
Proceedings of the Computer Vision - ECCV 2020, 2020

