Junhao Zhang

Affiliations:

National University of Singapore, Show Lab, Department of Electrical and Computer Engineering, Singapore

According to our database, Junhao Zhang authored at least 35 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of three.
Erdős number of three.

Bibliography

2026

MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines.

CoRR, March, 2026

2025

Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing.

CoRR, September, 2025

MoonShot: Towards Controllable Video Generation and Editing with Motion-Aware Multimodal Conditions.

Int. J. Comput. Vis., June, 2025

Follow-Your-Creation: Empowering 4D Creation through Video Inpainting.

CoRR, June, 2025

DD-Ranking: Rethinking the Evaluation of Dataset Distillation.

Baharan Mirzasoleiman

Manolis Kellis

Konstantinos N. Plataniotis

CoRR, May, 2025

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation.

Int. J. Comput. Vis., April, 2025

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MixEval-X: Any-to-any Evaluations from Real-world Data Mixture.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures.

CoRR, 2024

Towards A Better Metric for Text-to-Video Generation.

CoRR, 2024

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions.

CoRR, 2024

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MotionDirector: Motion Customization of Text-to-Video Diffusion Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images.

Proceedings of the Computer Vision - ECCV 2024, 2024

DragAnything: Motion Control for Anything Using Entity Representation.

Proceedings of the Computer Vision - ECCV 2024, 2024

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

UniFormer: Unifying Convolution and Self-Attention for Visual Recognition.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Magi-Net: Meta Negative Network for Early Activity Prediction.

IEEE Trans. Image Process., 2023

MotionDirector: Motion Customization of Text-to-Video Diffusion Models.

CoRR, 2023

Dataset Condensation via Generative Model.

CoRR, 2023

Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks.

CoRR, 2023

Label-Efficient Online Continual Object Detection in Streaming Video.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Too Large; Data Reduction for Vision-Language Pre-Training.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Making Vision Transformers Efficient from A Token Sparsification View.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Pose-guided Generative Adversarial Net for Novel View Action Synthesis.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Look Less Think More: Rethinking Compositional Action Recognition.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning.

Proceedings of the Computer Vision - ECCV 2022, 2022

Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Learning Dynamical Human-Joint Affinity for 3D Pose Estimation in Videos.

IEEE Trans. Image Process., 2021

MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video.

CoRR, 2021

Learning Geometry-Disentangled Representation for Complementary Understanding of 3D Object Point Cloud.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Investigate Indistinguishable Points in Semantic Segmentation of 3D Point Cloud.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
