Yibo Yan

This is a disambiguation page: it lists papers by multiple people with the same or a similar name.

Bibliography

2026
Decoding by Perturbation: Mitigating MLLM Hallucinations via Dynamic Textual Perturbation.
CoRR, April, 2026

Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval.
CoRR, April, 2026

Unveiling Language Routing Isolation in Multilingual MoE Models for Interpretable Subnetwork Adaptation.
CoRR, April, 2026

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents.
CoRR, March, 2026

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models.
CoRR, March, 2026

Beyond the Grid: Layout-Informed Multi-Vector Retrieval with Parsed Visual Document Representations.
CoRR, March, 2026

Transfer learning for high-dimensional data with heavy-tailed noise: A sparse convoluted rank regression method.
Stat. Comput., February, 2026

Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval.
CoRR, February, 2026

Sculpting the Vector Space: Towards Efficient Multi-Vector Visual Document Retrieval via Prune-then-Merge Framework.
CoRR, February, 2026

CausalEmbed: Auto-Regressive Multi-Vector Generation in Latent Space for Visual Document Embedding.
CoRR, January, 2026

A Visual Semantic Adaptive Watermark grounded by Prefix-Tuning for Large Vision-Language Model.
CoRR, January, 2026

Vision-Language Introspection: Mitigating Overconfident Hallucinations in MLLMs via Interpretable Bi-Causal Steering.
CoRR, January, 2026

Distilling the Thought, Watermarking the Answer: A Principle Semantic Guided Watermark for Large Reasoning Models.
CoRR, January, 2026

Sharp Eyes and Memory for VideoLLMs: Information-Aware Visual Token Pruning for Efficient and Reliable VideoLLM Reasoning.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
EffiReason-Bench: A Unified Benchmark for Evaluating and Advancing Efficient Reasoning in Large Language Models.
CoRR, November, 2025

Don't Just Chase "Highlighted Tokens" in MLLMs: Revisiting Visual Holistic Context Retention.
CoRR, October, 2025

DocPruner: A Storage-Efficient Framework for Multi-Vector Visual Document Retrieval via Adaptive Patch-Level Embedding Pruning.
CoRR, September, 2025

MOSS-ChatV: Reinforcement Learning with Process Reasoning Reward for Video Temporal Reasoning.
CoRR, September, 2025

PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints.
CoRR, September, 2025

High-Dimensional Differentially Private Quantile Regression: Distributed Estimation and Statistical Inference.
CoRR, August, 2025

GM-PRM: A Generative Multimodal Process Reward Model for Multimodal Mathematical Reasoning.
CoRR, August, 2025

A knowledge distillation-based network compression framework for lifecycle management of lithium-ion batteries.
J. Supercomput., July, 2025

VLA-Mark: A cross modal watermark for large vision-language alignment model.
CoRR, July, 2025

A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models.
CoRR, July, 2025

Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities.
CoRR, May, 2025

AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models.
CoRR, May, 2025

CAFES: A Collaborative Multi-Agent Framework for Multi-Granular Multimodal Essay Scoring.
CoRR, May, 2025

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video.
CoRR, May, 2025

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment.
CoRR, April, 2025

Aligning Multimodal LLM with Human Preference: A Survey.
CoRR, March, 2025

LLM Agents for Education: Advances and Applications.
CoRR, March, 2025

EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models.
CoRR, February, 2025

Position: LLMs Can be Good Tutors in Foreign Language Education.
CoRR, February, 2025

Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning.
CoRR, February, 2025

OneForecast: A Universal Framework for Global and Regional Weather Forecasting.
CoRR, February, 2025

Deep learning for cross-domain data fusion in urban computing: Taxonomy, advances, and outlook.
Inf. Fusion, 2025

RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

OneForecast: A Universal Framework for Global and Regional Weather Forecasting.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Space-aware Socioeconomic Indicator Inference with Heterogeneous Graphs.
Proceedings of the 33rd ACM International Conference on Advances in Geographic Information Systems, 2025

Position: LLMs Can be Good Tutors in English Education.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

VLA-Mark: A cross modal watermark for large vision-language alignment models.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

LLM Agents for Education: Advances and Applications.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Towards Microsecond-Scale VM Core Provisioning Agility on Serverless Platforms.
Proceedings of the 16th ACM SIGOPS Asia-Pacific Workshop on Systems, 2025

Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), 2025

EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Unlocking Speech Instruction Data Potential with Query Rewriting.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Socioeconomic Indicator Prediction.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Improving Needle Tip Tracking and Detection in Ultrasound-Based Navigation System Using Deep Learning-Enabled Approach.
IEEE J. Biomed. Health Informatics, May, 2024

A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges.
CoRR, 2024

Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey.
CoRR, 2024

Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks.
CoRR, 2024

Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation.
CoRR, 2024

Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios.
CoRR, 2024

MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models.
CoRR, 2024

Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality.
CoRR, 2024

ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection.
CoRR, 2024

Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models.
CoRR, 2024

Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models.
CoRR, 2024

Learning Geospatial Region Embedding with Heterogeneous Graph.
CoRR, 2024

UrbanVLP: A Multi-Granularity Vision-Language Pre-Trained Foundation Model for Urban Indicator Prediction.
CoRR, 2024

Deep Learning for Cross-Domain Data Fusion in Urban Computing: Taxonomy, Advances, and Outlook.
CoRR, 2024

UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web.
Proceedings of the ACM on Web Conference 2024, 2024

UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

2023
Adversarial Multi-task Learning for Efficient Chinese Named Entity Recognition.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2023

Confidence Intervals and Hypothesis Testing for High-dimensional Quantile Regression: Convolution Smoothing and Debiasing.
J. Mach. Learn. Res., 2023

When Urban Region Profiling Meets Large Language Models.
CoRR, 2023

GitHub OSS Governance File Dataset.
Proceedings of the 20th IEEE/ACM International Conference on Mining Software Repositories, 2023

2022
Open Source Software Sustainability: Combining Institutional Analysis and Socio-Technical Networks.
Proc. ACM Hum. Comput. Interact., 2022

2020
SF-Sketch: A Two-Stage Sketch for Data Streams.
IEEE Trans. Parallel Distributed Syst., 2020

Monitoring the Spatial and Temporal Variations in The Water Surface and Floating Algal Bloom Areas in Dongting Lake Using a Long-Term MODIS Image Time Series.
Remote. Sens., 2020

Priority-Aware Per-flow Measurement using Cuckoo Sketch.
Proceedings of the 2020 IFIP Networking Conference, 2020

2019
Longest Prefix Matching with Pruning.
Proceedings of the 20th IEEE International Conference on High Performance Switching and Routing, 2019

Shifting Hash Table: An Efficient Hash Table with Delicate Summary.
Proceedings of the 2019 IEEE Globecom Workshops, Waikoloa, HI, USA, December 9-13, 2019, 2019

2018
Single Hash: Use One Hash Function to Build Faster Hash Based Data Structures.
Proceedings of the 2018 IEEE International Conference on Big Data and Smart Computing, 2018

2017
SF-sketch: slim-fat-sketch with GPU assistance.
CoRR, 2017

SF-sketch: A Fast, Accurate, and Memory Efficient Data Structure to Store Frequencies of Data Items.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

