Zhengyang Liang

According to our database, Zhengyang Liang authored at least 15 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

TimeScope: Towards Task-Oriented Temporal Grounding In Long Videos.

CoRR, September, 2025

Video-XL-2: Towards Very Long-Video Understanding Through Task-Aware KV Sparsification.

CoRR, June, 2025

MomentSeeker: A Comprehensive Benchmark and A Strong Baseline For Moment Retrieval Within Long Videos.

CoRR, February, 2025

Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval.

CoRR, February, 2025

Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

MLVU: Benchmarking Multi-task Long Video Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

SwinGNN: Rethinking Permutation Invariance in Diffusion Models for Graph Generation.

Trans. Mach. Learn. Res., 2024

Scaling Laws For Diffusion Transformers.

CoRR, 2024

Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions.

CoRR, 2024

Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning.

CoRR, 2024

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2021

A Hypothesis for the Aesthetic Appreciation in Neural Networks.

CoRR, 2021
