Chongjian Ge

Orcid: 0000-0003-1142-9171

According to our database, Chongjian Ge authored at least 43 papers between 2021 and 2026.

Bibliography

2026
Coarse-to-Real: Generative Rendering for Populated Dynamic Scenes.
CoRR, January, 2026

S2I-DiT: Unlocking the semantic-to-image transferability by fine-tuning large diffusion transformer models.
Pattern Recognit., 2026

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing.
CoRR, December, 2025

DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders.
CoRR, December, 2025

CreativeVR: Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos.
CoRR, December, 2025

Rethinking Training Dynamics in Scale-wise Autoregressive Generation.
CoRR, December, 2025

RELIC: Interactive Video World Model with Long-Horizon Memory.
CoRR, December, 2025

Character Mixing for Video Generation.
CoRR, October, 2025

Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models.
CoRR, September, 2025

A Generative Foundation Model for Chest Radiography.
CoRR, September, 2025

PixelFlow: Pixel-Space Generative Models with Flow.
CoRR, April, 2025

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation.
CoRR, February, 2025

WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Prompt-A-Video: Prompt your Video Diffusion Model via Preference-Aligned LLM.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Goku: Flow Based Video Generative Foundation Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Rethinking Attentive Object Detection via Neural Attention Learning.
IEEE Trans. Image Process., 2024

WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning.
CoRR, 2024

RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Large Language Models as Automated Aligners for benchmarking Vision-Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

InstructDET: Diversifying Referring Object Detection with Generalized Instructions.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PIXART-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
CycleMLP: A MLP-Like Architecture for Dense Visual Predictions.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Advancing Vision Transformers with Group-Mix Attention.
CoRR, 2023

Large Language Models as Automated Aligners for benchmarking Vision-Language Models.
CoRR, 2023

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis.
CoRR, 2023

Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training.
CoRR, 2023

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation.
CoRR, 2023

Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

MetaBEV: Solving Sensor Failures for 3D Detection and Map Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations.
CoRR, 2022

AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EViT: Expediting Vision Transformers via Token Reorganizations.
Proceedings of the Tenth International Conference on Learning Representations, 2022

CycleMLP: A MLP-like Architecture for Dense Prediction.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning.
CoRR, 2021

Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Watch Only Once: An End-to-End Video Action Detection Framework.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Parser-Free Virtual Try-On via Distilling Appearance Flows.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

