Hongyi Cai

Orcid: 0009-0003-6243-8243

According to our database, Hongyi Cai authored at least 26 papers between 2011 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Once-For-All: A Train-Once and Select-Anytime Framework for Multimodal Instruction Tuning.
CoRR, May, 2026

Evo-Depth: A Lightweight Depth-Enhanced Vision-Language-Action Model.
CoRR, May, 2026

VisNec: Measuring and Leveraging Visual Necessity for Multimodal Instruction Tuning.
CoRR, March, 2026

When Vision Meets Texts in Listwise Reranking.
CoRR, January, 2026

Cross-modal local and global alignment for Chinese Character Recognition.
Pattern Recognit., 2026

2025
Pistachio: Towards Synthetic, Balanced, and Long-Form Video Anomaly Benchmarks.
CoRR, November, 2025

VLA-Pruner: Temporal-Aware Dual-Level Visual Token Pruning for Efficient Vision-Language-Action Inference.
CoRR, November, 2025

Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment.
CoRR, November, 2025

A Vision for Access Control in LLM-based Agent Systems.
CoRR, October, 2025

AutoDebias: Automated Framework for Debiasing Text-to-Image Models.
CoRR, August, 2025

MergeIT: From Selection to Merging for Efficient Instruction Tuning.
CoRR, March, 2025

CFPFormer: Cross Feature-Pyramid Transformer Decoder for Medical Image Segmentation.
Proceedings of the International Joint Conference on Neural Networks, 2025

To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Agent Behavior: The Regulatory Object of the Agent-Centric Online Ecosystem in Digital Age.
Proceedings of the Engineering of Complex Computer Systems - 29th International Conference, 2025

A Vision for Access Control in LLM Agent Systems.
Proceedings of the Engineering of Complex Computer Systems - 29th International Conference, 2025

How Effective is In-Context Learning with Large Language Models for Rare Cell Identification in Single-Cell Expression Data?
Proceedings of the IEEE International Conference on Data Mining, 2025

AgileIR: Memory-Efficient Group Shifted Windows Attention for Lightweight Image Restoration.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2025, 2025

Low-Confidence Gold: Refining Low-Confidence Samples for Efficient Instruction Tuning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024
Enhancing Large Language Models' Situated Faithfulness to External Contexts.
CoRR, 2024

AgileIR: Memory-Efficient Group Shifted Windows Attention for Agile Image Restoration.
CoRR, 2024

CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection.
CoRR, 2024

AccidentBlip2: Accident Detection With Multi-View MotionBlip2.
CoRR, 2024

Cross-Modal Alignment of Local and Global Features for Zero-Shot Chinese Character Recognition.
Proceedings of the IEEE International Conference on Image Processing, 2024

2023
An Adaptive Gradient Privacy-Preserving Algorithm for Federated XGBoost.
Proceedings of the 2023 2nd Asia Conference on Algorithms, Computing and Machine Learning, 2023

2020
Improving High Dynamic Range Image Based Light Measurement.
Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics, 2020

2011
A Camera-Aided Legibility Assessment Protocol of Displays for Enhanced Human-Computer Interaction.
Proceedings of the Design, User Experience, and Usability. Theory, Methods, Tools and Practice, 2011

