Dongdong Chen

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ProLongVid: A Simple but Strong Baseline for Long-context Video Instruction Tuning.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

UNICL-SAM: Uncertainty-Driven In-Context Segmentation with Part Prototype Discovery.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Olympus: A Universal Task Router for Computer Vision Tasks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SmartEraser: Remove Anything from Images using Masked-Region Guidance.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing.

Zhiyuan Zhang

ACM Trans. Graph., December, 2024

High-Fidelity and Efficient Pluralistic Image Completion With Transformers.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Robust Model Watermarking for Image Processing Networks via Structure Consistency.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2024

Transformer Based Pluralistic Image Completion With Reduced Information Loss.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2024

Learning a Single Network for Robust Medical Image Segmentation With Noisy Labels.

IEEE Trans. Medical Imaging, September, 2024

NeRF-Art: Text-Driven Neural Radiance Fields Stylization.

IEEE Trans. Vis. Comput. Graph., August, 2024

3D Question Answering.

IEEE Trans. Vis. Comput. Graph., March, 2024

Deep Image Matting With Sparse User Interactions.

IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

AnimeDiff: Customized Image Generation of Anime Characters Using Diffusion Model.

IEEE Trans. Multim., 2024

PersonMAE: Person Re-Identification Pre-Training With Masked AutoEncoders.

IEEE Trans. Multim., 2024

PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition.

IEEE Trans. Image Process., 2024

LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation.

CoRR, 2024

ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities.

CoRR, 2024

SynChart: Synthesizing Charts from Language Models.

CoRR, 2024

Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge.

CoRR, 2024

Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search.

CoRR, 2024

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

TrafficMOT: A Challenging Dataset for Multi-Object Tracking in Complex Traffic Scenarios.

Carola-Bibiane Schönlieb

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Attribute-Aware Head Swapping Guided by 3D Modeling.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation.

Proceedings of the Computer Vision - ECCV 2024, 2024

OmniViD: A Generative Framework for Universal Video Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Towards More Unified In-Context Visual Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

MADAv2: Advanced Multi-Anchor Based Active Domain Adaptation Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Robust Point Cloud Segmentation With Noisy Annotations.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Semantic Probability Distribution Modeling for Diverse Semantic Image Synthesis.

IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Cross-Domain and Disentangled Face Manipulation With 3D Guidance.

IEEE Trans. Vis. Comput. Graph., April, 2023

Perceptual Hashing of Deep Convolutional Neural Networks for Model Copy Detection.

ACM Trans. Multim. Comput. Commun. Appl., 2023

Coherent adversarial deepfake video generation.

Signal Process., 2023

Old Photo Restoration via Deep Latent Space Translation.

IEEE Trans. Pattern Anal. Mach. Intell., 2023

Mesh-Guided Neural Implicit Field Editing.

CoRR, 2023

On the Hidden Waves of Image.

CoRR, 2023

HQ-50K: A Large-scale, High-quality Dataset for Image Restoration.

CoRR, 2023

Designing a Better Asymmetric VQGAN for StableDiffusion.

CoRR, 2023

Image is First-order Norm+Linear Autoregressive.

CoRR, 2023

Album Storytelling with Iterative Story-aware Captioning and Large Language Models.

CoRR, 2023

i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.

CoRR, 2023

ChatVideo: A Tracklet-centric Multimodal and Versatile Video Understanding System.

CoRR, 2023

OmniTracker: Unifying Object Tracking by Tracking-with-Detection.

CoRR, 2023

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion.

Proceedings of the International Conference on Machine Learning, 2023

Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Streaming Video Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Look Before You Match: Instance Understanding Matters in Video Object Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Diversity-Aware Meta Visual Prompting.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

i-Code: An Integrative and Composable Multimodal Learning Framework.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Meta-PU: An Arbitrary-Scale Upsampling Network for Point Cloud.

IEEE Trans. Vis. Comput. Graph., 2022

JPEG Robust Invertible Grayscale.

IEEE Trans. Vis. Comput. Graph., 2022

TERA: Screen-to-Camera Image Code With Transparency, Efficiency, Robustness and Adaptability.

IEEE Trans. Multim., 2022

Poison Ink: Robust and Invisible Backdoor Attack.

IEEE Trans. Image Process., 2022

E2Style: Improve the Efficiency and Effectiveness of StyleGAN Inversion.

IEEE Trans. Image Process., 2022

Translation of Aerial Image Into Digital Map via Discriminative Segmentation and Creative Generation.

IEEE Trans. Geosci. Remote. Sens., 2022

Distribution-Preserving Steganography Based on Text-to-Speech Generative Models.

IEEE Trans. Dependable Secur. Comput., 2022

Deep Model Intellectual Property Protection via Deep Watermarking.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Efficient Semantic Image Synthesis via Class-Adaptive Normalization.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Online multi-object tracking with unsupervised re-identification learning and occlusion estimation.

Neurocomputing, 2022

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet.

CoRR, 2022

X-Paste: Revisit Copy-Paste at Scale with CLIP and StableDiffusion.

CoRR, 2022

Self-Supervised Learning based on Heat Equation.

CoRR, 2022

PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition.

CoRR, 2022

Video Mobile-Former: Video Recognition with Efficient Global Spatial-temporal Modeling.

CoRR, 2022

Should All Proposals be Treated Equally in Object Detection?

CoRR, 2022

Semantic Image Synthesis via Diffusion Models.

CoRR, 2022

Residual Mixture of Experts.

CoRR, 2022

Protecting Celebrities with Identity Consistency Transformer.

CoRR, 2022

Self-supervised Transformer for Deepfake Detection.

CoRR, 2022

OmniVL: One Foundation Model for Image-Language and Video-Language Tasks.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Should All Proposals Be Treated Equally in Object Detection?

Proceedings of the Computer Vision - ECCV 2022, 2022

Bootstrapped Masked Autoencoders for Vision BERT Pretraining.

Proceedings of the Computer Vision - ECCV 2022, 2022

General Facial Representation Learning in a Visual-Linguistic Manner.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

HairCLIP: Design Your Hair by Text and Reference Image.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

BEVT: BERT Pretraining of Video Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Bringing Old Films Back to Life.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Reduce Information Loss in Transformers for Pluralistic Image Inpainting.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Shape-invariant 3D Adversarial Point Clouds.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Vector Quantized Diffusion Model for Text-to-Image Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Large-Scale Pre-training for Person Re-identification with Noisy Labels.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Protecting Celebrities from DeepFake with Identity Consistency Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Mobile-Former: Bridging MobileNet and Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Deep Template-Based Watermarking.

IEEE Trans. Circuits Syst. Video Technol., 2021

A General Decoupled Learning Framework for Parameterized Image Operators.

IEEE Trans. Pattern Anal. Mach. Intell., 2021

Explicit Filterbank Learning for Neural Image Style Transfer and Image Processing.

IEEE Trans. Pattern Anal. Mach. Intell., 2021

CDAE: Color decomposition-based adversarial examples for screen devices.

Inf. Sci., 2021

Adversarial defense via self-orthogonal randomization super-network.

Neurocomputing, 2021

Visual Structure Constraint for Transductive Zero-Shot Learning in the Wild.

Ziyu Wan

Int. J. Comput. Vis., 2021

Florence: A New Foundation Model for Computer Vision.

CoRR, 2021

Unsupervised Finetuning.

CoRR, 2021

Poison Ink: Robust and Invisible Backdoor Attack.

CoRR, 2021

Exploring Structure Consistency for Deep Model Watermarking.

CoRR, 2021

A Simple Baseline for StyleGAN Inversion.

CoRR, 2021

Weak NAS Predictors Are All You Need.

CoRR, 2021

Stronger NAS with Weaker Predictors.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Revisiting Dynamic Convolution via Matrix Decomposition.

Proceedings of the 9th International Conference on Learning Representations, 2021

Learning with Noisy Labels for Robust Point Cloud Segmentation.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

High-Fidelity Pluralistic Image Completion with Transformers.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

MicroNet: Improving Image Recognition with Extremely Low FLOPs.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Improve Unsupervised Pretraining for Few-label Transfer.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Multi-Attentional Deepfake Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Improved Image Matting via Real-Time User Clicks and Uncertainty Estimation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Diverse Semantic Image Synthesis via Probability Distribution Modeling.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Unsupervised Pre-Training for Person Re-Identification.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Dynamic Head: Unifying Object Detection Heads With Attentions.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

MichiGAN: multi-input-conditioned hair image generation for portrait editing.

ACM Trans. Graph., 2020

Improving Person Re-Identification With Iterative Impression Aggregation.

IEEE Trans. Image Process., 2020

Controllable Image Processing via Adaptive FilterBank Pyramid.

Qingnan Fan

Lu Yuan

Nenghai Yu

Gang Hua

IEEE Trans. Image Process., 2020

Are Fewer Labels Possible for Few-shot Learning?

CoRR, 2020

Semantic Image Synthesis via Efficient Class-Adaptive Normalization.

CoRR, 2020

Identity-Driven DeepFake Detection.

CoRR, 2020

MicroNet: Towards Image Recognition with Extremely Low FLOPs.

CoRR, 2020

Rethinking Spatially-Adaptive Normalization.

CoRR, 2020

Passport-aware Normalization for Deep Model Protection.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

GreedyFool: Distortion-Aware Sparse Adversarial Attack.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search.

Proceedings of the Computer Vision - ECCV 2020, 2020

Dynamic ReLU.

Proceedings of the Computer Vision - ECCV 2020, 2020

LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud Based Deep Networks.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Bringing Old Photos Back to Life.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Density-Aware Graph for Deep Semi-Supervised Visual Recognition.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Robust Superpixel-Guided Attentional Adversarial Attack.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Self-Robust 3D Point Recognition via Gather-Vector Guidance.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Dynamic Convolution: Attention Over Convolution Kernels.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Model Watermarking for Image Processing Networks.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Progressive Color Transfer With Dense Semantic Correspondences.

ACM Trans. Graph., 2019

Mirror, Mirror, on the Wall, Who's Got the Clearest Image of Them All? - A Tailored Approach to Single Image Reflection Removal.

Daniel Heydecker

Georg Maierhofer

Qingnan Fan

Carola-Bibiane Schönlieb

Sabine Süsstrunk

IEEE Trans. Image Process., 2019

Deep Reflection Prior.
