Yuanhuiyi Lyu

ORCID: 0009-0004-1450-811X

According to our database, Yuanhuiyi Lyu authored at least 44 papers between 2023 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Perceptual Flow Network for Visually Grounded Reasoning.

CoRR, May, 2026

TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders.

CoRR, April, 2026

SAP: Segment Any 4K Panorama.

CoRR, March, 2026

EgoIntent: An Egocentric Step-level Benchmark for Understanding What, Why, and Next.

CoRR, March, 2026

StruVis: Enhancing Reasoning-based Text-to-Image Generation via Thinking with Structured Vision.

CoRR, March, 2026

Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval.

CoRR, February, 2026

BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis.

Trans. Mach. Learn. Res., 2026

T-Rex-Omni: Integrating Negative Visual Prompt in Generic Object Detection.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation.

CoRR, November, 2025

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks.

CoRR, October, 2025

AI for Service: Proactive Assistance with AI Glasses.

CoRR, October, 2025

PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs.

CoRR, October, 2025

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods.

CoRR, October, 2025

Don't Just Chase "Highlighted Tokens" in MLLMs: Revisiting Visual Holistic Context Retention.

CoRR, October, 2025

Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation.

CoRR, September, 2025

PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era.

CoRR, September, 2025

MLLMs are Deeply Affected by Modality Bias.

CoRR, May, 2025

Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning?

CoRR, May, 2025

DiMeR: Disentangled Mesh Reconstruction Model.

CoRR, April, 2025

Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook.

CoRR, March, 2025

MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation.

CoRR, March, 2025

SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

OmniSAM: Omnidirectional Segment Anything Model for UDA in Panoramic Semantic Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Reducing Unimodal Bias in Multi-Modal Semantic Segmentation With Multi-Scale Functional Entropy Regularization.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

From Reusing to Forecasting: Accelerating Diffusion Models With TaylorSeers.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

MAGIC++: Efficient and Resilient Modality-Agnostic Semantic Segmentation via Hierarchical Modality Selection.

CoRR, 2024

A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges.

CoRR, 2024

Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation.

CoRR, 2024

EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More.

CoRR, 2024

OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All.

CoRR, 2024

ExACT: Language-guided Conceptual Reasoning and Uncertainty Estimation for Event-based Action Recognition and More.

CoRR, 2024

UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All.

CoRR, 2024

Image Anything: Towards Reasoning-coherent and Training-free Multi-modal Image Generation.

Yuanhuiyi Lyu, Xu Zheng, Lin Wang

CoRR, 2024

Chasing Day and Night: Towards Robust and Efficient All-Day Object Detection Guided by an Event Camera.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

EventBind: Learning a Unified Representation to Bind Them All for Event-Based Open-World Understanding.

Proceedings of the Computer Vision - ECCV 2024, 2024

Centering the Value of Every Modality: Towards Efficient and Resilient Modality-Agnostic Semantic Segmentation.

Proceedings of the Computer Vision - ECCV 2024, 2024

Learning Modality-Agnostic Representation for Semantic Segmentation from Any Modalities.

Xu Zheng, Yuanhuiyi Lyu, Lin Wang

Proceedings of the Computer Vision - ECCV 2024, 2024

ExACT: Language-Guided Conceptual Reasoning and Uncertainty Estimation for Event-Based Action Recognition and More.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

E-CLIP: Towards Label-efficient Event-based Open-world Understanding by CLIP.

CoRR, 2023
