Haoyu Lu

ORCID: 0009-0004-3541-557X

According to our database, Haoyu Lu authored at least 51 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

AITQE: An Adaptive Image-Text Quality Enhancer for Scalable MLLM Pretraining.

IEEE Trans. Circuits Syst. Video Technol., June, 2026

GUI-CEval: A Hierarchical and Comprehensive Chinese Benchmark for Mobile GUI Agents.

CoRR, March, 2026

WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models.

CoRR, February, 2026

Towards Pixel-Level VLM Perception via Simple Points Prediction.

CoRR, January, 2026

BabyVision: Visual Reasoning Beyond Language.

CoRR, January, 2026

2025

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices.

CoRR, December, 2025

Physics-Constrained Diffusion Reconstruction with Posterior Correction for Quantitative and Fast PET Imaging.

CoRR, August, 2025

Sequential Monte Carlo with Gaussian Mixture Approximation for Infinite-Dimensional Statistical Inverse Problems.

Haoyu Lu

Junxiong Jia

Deyu Meng

CoRR, March, 2025

RIS-Assisted Spectral Efficiency Enhancement for Vertical Sectorization Network.

Haoyu Lu

Sihan Zhuang

Hongcheng Zhuang

IEEE Wirel. Commun. Lett., February, 2025

Adaptive Multi-Beamforming for Integrated Sensing and Communication System.

Proceedings of the 101st IEEE Vehicular Technology Conference, 2025

Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Exploring the Design Space of Visual Context Representation in Video MLLMs.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

R1-Onevision: Advancing Generalized Multimodal Reasoning Through Cross-Modal Formalization.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Efficient Motion-Aware Video MLLM.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Enhancing Sound-Based Sleep Quality Assessment by Multi-modal Knowledge Distillation.

Haoyu Lu

Takafumi Kato

Ken-ichi Fukui

Proceedings of the Artificial Intelligence Applications and Innovations, 2025

2024

BotCL: a social bot detection model based on graph contrastive learning.

Knowl. Inf. Syst., September, 2024

Functional normalizing flow for statistical inverse problems of partial differential equations.

CoRR, 2024

Beyond Filtering: Adaptive Image-Text Quality Enhancement for MLLM Pretraining.

CoRR, 2024

Towards Event-oriented Long Video Understanding.

CoRR, 2024

Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs.

CoRR, 2024

DeepSeek-VL: Towards Real-World Vision-Language Understanding.

CoRR, 2024

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism.

CoRR, 2024

Gridding Based Reconfigurable Intelligent Surface-aided Wireless Network Optimization.

Proceedings of the 35th IEEE International Symposium on Personal, 2024

VDT: General-purpose Video Diffusion Transformers via Mask Modeling.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Multi-Level Contrastive Learning For Hybrid Cross-Modal Retrieval.

Proceedings of the IEEE International Conference on Acoustics, 2024

Progressive Image Synthesis from Semantics to Details with Denoising Diffusion GAN.

Proceedings of the IEEE International Conference on Acoustics, 2024

VEMO: A Versatile Elastic Multi-modal Model for Search-Oriented Multi-task Learning.

Proceedings of the Advances in Information Retrieval, 2024

2023

Cross-modal Contrastive Learning for Generalizable and Efficient Image-text Retrieval.

Mach. Intell. Res., August, 2023

VDT: An Empirical Study on Video Diffusion with Transformers.

CoRR, 2023

Shot Retrieval and Assembly with Text Script for Video Montage Generation.

Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

A Topology Based Denoising Approach for 2D Scalar Fields.

Proceedings of the IEEE International Conference on Image Processing, 2023

BotCS: A Lightweight Model for Large-Scale Twitter Bot Detection Comparable to GNN-Based Models.

Proceedings of the IEEE International Conference on Communications, 2023

Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

PSMiner: A Pattern-Aware Accelerator for High-Performance Streaming Graph Pattern Mining.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022

Image fragile watermarking algorithm based on deneighbourhood mapping.

IET Image Process., 2022

Monolingual Recognizers Fusion for Code-switching Speech Recognition.

CoRR, 2022

Multimodal foundation models are better simulators of the human brain.

CoRR, 2022

Image Fragile Watermarking Algorithm Based on Deneighborhood Mapping.

CoRR, 2022

MHCRoBERTa: pan-specific peptide-MHC class I binding prediction through transfer learning with label-agnostic protein sequences.

Briefings Bioinform., 2022

LGDN: Language-Guided Denoising Network for Video-Language Modeling.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

BMU-MoCo: Bidirectional Momentum Update for Continual Video-Language Modeling.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning Versatile Neural Architectures by Propagating Network Codes.

Proceedings of the Tenth International Conference on Learning Representations, 2022

COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model.

CoRR, 2021

WenLan: Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training.

CoRR, 2021

Compressed Video Contrastive Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

2020

PremPS: Predicting the impact of missense mutations on protein stability.

PLoS Comput. Biol., 2020

2019

Affine invariant image watermarking scheme based on ASIFT and Delaunay tessellation.

Multim. Tools Appl., 2019

Deep neural network-based image copyright protection scheme.

J. Electronic Imaging, 2019
