Zhuoyang Zhang

Orcid: 0000-0002-3312-6246

According to our database, Zhuoyang Zhang authored at least 16 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer.
CoRR, July, 2025

Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation.
CoRR, July, 2025

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

NVILA: Efficient Frontier Visual Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Gaze-assisted visual grounding via knowledge distillation for referred object grasping with under-specified object referring.
Eng. Appl. Artif. Intell., 2024

NVILA: Efficient Frontier Visual Language Models.
CoRR, 2024

Condition-Aware Neural Network for Controlled Image Generation.
CoRR, 2024

Sparse Refinement for Efficient High-Resolution Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
GVGNet: Gaze-Directed Visual Grounding for Learning Under-Specified Object Referring Intention.
IEEE Robotics Autom. Lett., September, 2023

Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
CoRR, 2023

NSM4D: Neural Scene Model Based Online 4D Point Cloud Sequence Understanding.
CoRR, 2023

Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
