Zhiyang Chen

Orcid: 0000-0001-9006-9180

Affiliations:
  • Westlake University, Machine Perception & Learning (MAPLE) Lab, Hangzhou, China
  • Chinese Academy of Sciences, Institute of Automation, National Laboratory of Pattern Recognition, Beijing, China (PhD 2024)


According to our database, Zhiyang Chen authored at least 23 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Self-Guidance: Boosting Flow and Diffusion Generation on Their Own.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2026

2025
When Images Speak Louder: Mitigating Language Bias-induced Hallucinations in VLMs through Cross-Modal Guidance.
CoRR, October, 2025

From Seeing to Predicting: A Vision-Language Framework for Trajectory Forecasting and Controlled Video Generation.
CoRR, October, 2025

Don't Settle Too Early: Self-Reflective Remasking for Diffusion Language Models.
CoRR, September, 2025

C-Evolve: Consensus-based Evolution for Prompt Groups.
CoRR, September, 2025

InfLVG: Reinforce Inference-Time Consistent Long Video Generation with GRPO.
CoRR, May, 2025

SLOT: Sample-specific Language Model Optimization at Test-time.
CoRR, May, 2025

Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Efficient Masked Autoencoders With Self-Consistency.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Relation-Associated Instructions & Hallucination Benchmark.
Dataset, July, 2024

EFCPose: End-to-End Multi-Person Pose Estimation With Fully Convolutional Heads.
IEEE Trans. Circuits Syst. Video Technol., 2024

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling.
CoRR, 2024

The Devil is in Details: Delving Into Lite FFN Design for Vision Transformers.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Griffon: Spelling Out All Object Locations at Any Granularity with Large Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

OpenStory: A Large-Scale Open-Domain Dataset for Subject-Driven Visual Storytelling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Self-Supervised Representation Learning from Arbitrary Scenarios.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Mitigating Hallucination in Visual Language Models with Visual Supervision.
CoRR, 2023

Efficient Masked Autoencoders with Self-Consistency.
CoRR, 2023

2022
Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
MST: Masked Self-Supervised Transformer for Visual Representation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

DPT: Deformable Patch-based Transformer for Visual Recognition.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

