Zhaoyang Zhang

ORCID: 0009-0003-5583-6454

Affiliations:

Wuhan University, China
SenseTime Research, Wuhan, China

According to our database, Zhaoyang Zhang authored at least 35 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

iScript: A Domain-Adapted Large Language Model and Benchmark for Physical Design Tcl Script Generation.

CoRR, March, 2026

2025

IC-Custom: Diverse Image Customization via In-Context Learning.

CoRR, July, 2025

HaploOmni: Unified Single Transformer for Multimodal Video Understanding and Generation.

CoRR, June, 2025

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing.

CoRR, March, 2025

BlobCtrl: Taming Controllable Blob for Element-level Image Editing.

Proceedings of the SIGGRAPH Asia 2025 Conference Papers, 2025

Cobra: Efficient Line Art COlorization with BRoAder References.

Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2025

FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios.

Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2025

VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control.

Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2025

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Image Conductor: Precision Control for Interactive Video Synthesis.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

FAT: Frequency-Aware Transformation for Bridging Full-Precision and Low-Precision Deep Representations.

IEEE Trans. Neural Networks Learn. Syst., February, 2024

Consistent Human Image and Video Generation with Spatially Conditioned Diffusion.

CoRR, 2024

ColorFlow: Retrieval-Augmented Image Sequence Colorization.

CoRR, 2024

BrushEdit: All-In-One Image Inpainting and Editing.

CoRR, 2024

Adding Multi-modal Controls to Whole-body Human Motion Generation.

CoRR, 2024

Image Inpainting Models are Effective Tools for Instruction-guided Image Editing.

CoRR, 2024

ReVideo: Remake a Video with Motion and Content Control.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Cached Transformers: Improving Transformers with Differentiable Memory Cache.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Cached Transformers: Improving Transformers with Differentiable Memory Cache.

CoRR, 2023

Real-Time Controllable Denoising for Image and Video.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Dynamic Token Normalization improves Vision Transformers.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Dynamic Token Normalization Improves Vision Transformer.

CoRR, 2021

BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening.

CoRR, 2021

FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation.

CoRR, 2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution.

Proceedings of the 38th International Conference on Machine Learning, 2021

STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

AdaX: Adaptive Gradient Descent with Exponential Long Term Memory.

CoRR, 2020

2019

Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018

Temporal Sequence Distillation: Towards Few-Frame Action Recognition in Videos.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Boosting up Scene Text Detectors with Guided CNN.

Proceedings of the British Machine Vision Conference 2018, 2018
