Junhao Zhang

Affiliations:
  • National University of Singapore, Show Lab, Department of Electrical and Computer Engineering, Singapore


According to our database, Junhao Zhang authored at least 30 papers between 2021 and 2026.

Bibliography

2026
MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines.
CoRR, March, 2026

2025
MoonShot: Towards Controllable Video Generation and Editing with Motion-Aware Multimodal Conditions.
Int. J. Comput. Vis., June, 2025

Follow-Your-Creation: Empowering 4D Creation through Video Inpainting.
CoRR, June, 2025

DD-Ranking: Rethinking the Evaluation of Dataset Distillation.
CoRR, May, 2025

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation.
Int. J. Comput. Vis., April, 2025

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MixEval-X: Any-to-any Evaluations from Real-world Data Mixture.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures.
CoRR, 2024

Towards A Better Metric for Text-to-Video Generation.
CoRR, 2024

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions.
CoRR, 2024

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images.
Proceedings of the Computer Vision - ECCV 2024, 2024

DragAnything: Motion Control for Anything Using Entity Representation.
Proceedings of the Computer Vision - ECCV 2024, 2024

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
UniFormer: Unifying Convolution and Self-Attention for Visual Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Magi-Net: Meta Negative Network for Early Activity Prediction.
IEEE Trans. Image Process., 2023

MotionDirector: Motion Customization of Text-to-Video Diffusion Models.
CoRR, 2023

Dataset Condensation via Generative Model.
CoRR, 2023

Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks.
CoRR, 2023

Label-Efficient Online Continual Object Detection in Streaming Video.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Too Large; Data Reduction for Vision-Language Pre-Training.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Making Vision Transformers Efficient from A Token Sparsification View.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Pose-guided Generative Adversarial Net for Novel View Action Synthesis.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Look Less Think More: Rethinking Compositional Action Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Dual-AI: Dual-path Actor Interaction Learning for Group Activity Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video.
CoRR, 2021

