Bozhou Li

Orcid: 0009-0001-7519-5733

According to our database, Bozhou Li authored at least 21 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV.

CoRR, May, 2026

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning.

CoRR, May, 2026

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos.

CoRR, May, 2026

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models.

CoRR, April, 2026

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models.

CoRR, February, 2026

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers.

CoRR, February, 2026

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks.

CoRR, February, 2026

DiaDem: Advancing Dialogue Descriptions in Audiovisual Video Captioning for Multimodal Large Language Models.

CoRR, January, 2026

2025

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models.

CoRR, December, 2025

The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss.

CoRR, December, 2025

AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration.

CoRR, October, 2025

RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark.

CoRR, September, 2025

Text2VectorSQL: Bridging Text-to-SQL and Vector Search for Unified Natural Language Queries.

CoRR, June, 2025

ID-Align: RoPE-Conscious Position Remapping for Dynamic High-Resolution Adaptation in Vision-Language Models.

Bozhou Li

Wentao Zhang

CoRR, May, 2025

The First Prompt Counts the Most! An Evaluation of Large Language Models on Iterative Example-Based Code Generation.

Proc. ACM Softw. Eng., 2025

SynthVLM: Towards High-Quality and Efficient Synthesis of Image-Caption Datasets for Vision-Language Models.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

An Adaptive Attention-Aware Method for Occluded Multi-Pedestrian Tracking.

Proceedings of the 28th International Conference on Computer Supported Cooperative Work in Design, 2025

2024

Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models.

CoRR, 2024

Are Bigger Encoders Always Better in Vision Large Models?

CoRR, 2024

A Survey of Multimodal Large Language Model from A Data-centric Perspective.

CoRR, 2024

2021

Cluster-Based Distribution Alignment For Generalizable Person Re-Identification.

Proceedings of the 2021 IEEE International Conference on Multimedia & Expo Workshops, 2021
