Chongjian Ge

ORCID: 0000-0003-1142-9171

According to our database, Chongjian Ge authored at least 43 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Coarse-to-Real: Generative Rendering for Populated Dynamic Scenes.

Gonzalo Gomez-Nogales

CoRR, January, 2026

S2I-DiT: Unlocking the semantic-to-image transferability by fine-tuning large diffusion transformer models.

Pattern Recognit., 2026

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing.

CoRR, December, 2025

DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders.

CoRR, December, 2025

CreativeVR: Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos.

Tejas Panambur

Ishan Rajendrakumar Dave

Chongjian Ge

Ersin Yumer

Xue Bai

CoRR, December, 2025

Rethinking Training Dynamics in Scale-wise Autoregressive Generation.

CoRR, December, 2025

RELIC: Interactive Video World Model with Long-Horizon Memory.

Yannick Hold-Geoffroy

CoRR, December, 2025

Character Mixing for Video Generation.

CoRR, October, 2025

Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models.

CoRR, September, 2025

A Generative Foundation Model for Chest Radiography.

CoRR, September, 2025

PixelFlow: Pixel-Space Generative Models with Flow.

CoRR, April, 2025

FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation.

CoRR, February, 2025

WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Prompt-A-Video: Prompt your Video Diffusion Model via Preference-Aligned LLM.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Goku: Flow Based Video Generative Foundation Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Rethinking Attentive Object Detection via Neural Attention Learning.

IEEE Trans. Image Process., 2024

WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning.

CoRR, 2024

RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Large Language Models as Automated Aligners for benchmarking Vision-Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

InstructDET: Diversifying Referring Object Detection with Generalized Instructions.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

PIXART-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation.

Proceedings of the Computer Vision - ECCV 2024, 2024

DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

CycleMLP: A MLP-Like Architecture for Dense Visual Predictions.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Advancing Vision Transformers with Group-Mix Attention.

CoRR, 2023

Large Language Models as Automated Aligners for benchmarking Vision-Language Models.

CoRR, 2023

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis.

CoRR, 2023

Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training.

CoRR, 2023

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation.

CoRR, 2023

Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

MetaBEV: Solving Sensor Failures for 3D Detection and Map Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations.

CoRR, 2022

AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EViT: Expediting Vision Transformers via Token Reorganizations.

Proceedings of the Tenth International Conference on Learning Representations, 2022

CycleMLP: A MLP-like Architecture for Dense Prediction.

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Revitalizing CNN Attentions via Transformers in Self-Supervised Visual Representation Learning.

CoRR, 2021

Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Watch Only Once: An End-to-End Video Action Detection Framework.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Parser-Free Virtual Try-On via Distilling Appearance Flows.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
