Zhuoyi Yang

Orcid: 0000-0003-2620-7790

According to our database, Zhuoyi Yang authored at least 34 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
InstrAct: Towards Action-Centric Understanding in Instructional Videos.
CoRR, April, 2026

Fine-Tuning vs. RAG for Multi-Hop Question Answering with Novel Knowledge.
CoRR, January, 2026

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations.
CoRR, December, 2025

Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability.
CoRR, December, 2025

Test-time Scaling of LLMs: A Survey from A Subproblem Structure Perspective.
CoRR, November, 2025

Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model.
CoRR, October, 2025

AMAQ: Adaptive Mixed-bit Activation Quantization for Collaborative Parameter Efficient Fine-tuning.
CoRR, October, 2025

Stabilizing Information Flow Entropy: Regularization for Safe and Interpretable Autonomous Driving Perception.
CoRR, September, 2025

Empirical Investigation of Digital Collectibles Purchase Intention: The Roles of Value, Risks, Identification, and Scarcity.
Int. J. Hum. Comput. Interact., August, 2025

GeoFAN: Point Pattern Recognition in Spatial Vector Data.
ISPRS Int. J. Geo Inf., 2025

TempA-VLP: Temporal-Aware Vision-Language Pretraining for Longitudinal Exploration in Chest X-Ray Image.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

LogLLaMA: Transformer-based log anomaly detection with LLaMA.
Proceedings of the International Joint Conference on Neural Networks, 2025

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Concat-ID: Towards Universal Identity-Preserving Video Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation.
CoRR, 2024

Entropy Loss: An Interpretability Amplifier of 3D Object Detection Network for Intelligent Driving.
CoRR, 2024

CogVLM2: Visual Language Models for Image and Video Understanding.
CoRR, 2024

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer.
CoRR, 2024

CogVLM: Visual Expert for Pretrained Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Relay Diffusion: Unifying diffusion process across resolutions for image synthesis.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion.
Proceedings of the Computer Vision - ECCV 2024, 2024

Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
CogVLM: Visual Expert for Pretrained Language Models.
CoRR, 2023

Eloss in the way: A Sensitive Input Quality Metrics for Intelligent Driving.
CoRR, 2023

GLM-130B: An Open Bilingual Pre-trained Model.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
GLM-130B: An Open Bilingual Pre-trained Model.
CoRR, 2022

The Behavior and Impact of Heterogeneous Investors in China's Stock Index Futures Market: An Agent-Based Model on Cross-Market Trades.
Complex., 2022

Parameter-Efficient Tuning Makes a Good Classification Head.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
CogView: Mastering Text-to-Image Generation via Transformers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
Distributed High-dimensional Regression Under a Quantile Loss Function.
J. Mach. Learn. Res., 2020

2019
Distributed Inference for Linear Support Vector Machine.
J. Mach. Learn. Res., 2019

