Zhengyang Liang

Orcid: 0009-0008-0205-0163

According to our database, Zhengyang Liang authored at least 21 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Explicit Critic Guidance for Aligning Diffusion Models.

Zhengyang Liang, Qihang Zhang, Ceyuan Yang

CoRR, May, 2026

DeepXiv-SDK: An Agentic Data Interface for Scientific Literature.

CoRR, March, 2026

VideoCreator: An Agentic System for Multi-turn Video Production.

Proceedings of the 2026 International Conference on Multimedia Retrieval, 2026

2025

Video-BrowseComp: Benchmarking Agentic Video Research on Open Web.

CoRR, December, 2025

UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist.

CoRR, November, 2025

TimeScope: Towards Task-Oriented Temporal Grounding In Long Videos.

CoRR, September, 2025

Video-XL-2: Towards Very Long-Video Understanding Through Task-Aware KV Sparsification.

CoRR, June, 2025

MomentSeeker: A Comprehensive Benchmark and A Strong Baseline For Moment Retrieval Within Long Videos.

CoRR, February, 2025

Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval.

CoRR, February, 2025

Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Retrieval.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

MLVU: Benchmarking Multi-task Long Video Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

SwinGNN: Rethinking Permutation Invariance in Diffusion Models for Graph Generation.

Trans. Mach. Learn. Res., 2024

Scaling Laws For Diffusion Transformers.

CoRR, 2024

Seeing Clearly, Answering Incorrectly: A Multimodal Robustness Benchmark for Evaluating MLLMs on Leading Questions.

CoRR, 2024

Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning.

CoRR, 2024

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2021

A Hypothesis for the Aesthetic Appreciation in Neural Networks.

CoRR, 2021
