Qifeng Chen

ORCID: 0000-0003-2199-3948

Affiliations:
  • Hong Kong University of Science and Technology, Hong Kong
  • Stanford University, Department of Computer Science, CA, USA (PhD 2017)


According to our database, Qifeng Chen authored at least 169 papers between 2012 and 2025.

Bibliography

2025
Enhancing HDR Imaging with Joint Denoising and Deblurring.
Int. J. Comput. Vis., November, 2025

Diffusion-Based Visual Art Creation: A Survey and New Perspectives.
ACM Comput. Surv., October, 2025

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset.
CoRR, October, 2025

Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation.
CoRR, September, 2025

SpA2V: Harnessing Spatial Auditory Cues for Audio-driven Spatially-aware Video Generation.
CoRR, August, 2025

Controllable Video Generation: A Survey.
CoRR, July, 2025

UNIC: Unified In-Context Video Editing.
CoRR, June, 2025

FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers.
CoRR, June, 2025

AudioX: Diffusion Transformer for Anything-to-Audio Generation.
CoRR, March, 2025

Generative Artificial Intelligence in Robotic Manipulation: A Survey.
CoRR, March, 2025

Learning Thin Deformable Object Manipulation With a Multisensory Integrated Soft Hand.
IEEE Trans. Robotics, 2025

Robust Portrait Image Matting and Depth-of-field Synthesis via Multiplane Images.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

MagicScroll: Enhancing Immersive Storytelling with Controllable Scroll Image Generation.
Proceedings of the IEEE Conference Virtual Reality and 3D User Interfaces, 2025

Master Rules from Chaos: Learning to Reason, Plan, and Interact from Chaos for Tangram Assembly.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

LDM-ISP: Enhancing Neural ISP for Low Light with Latent Diffusion Models.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SkillMimic: Learning Basketball Interaction Skills from Demonstrations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

AvatarArtist: Open-Domain 4D Avatarization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Follow-Your-Click: Open-domain Regional Image Animation via Motion Prompts.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

DiT4Edit: Diffusion Transformer for Image Editing.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Infinite-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Learning Naturally Aggregated Appearance for Efficient 3D Editing.
Proceedings of the International Conference on 3D Vision, 2025

2024
Open-Vocabulary Category-Level Object Pose and Size Estimation.
IEEE Robotics Autom. Lett., September, 2024

In-Domain GAN Inversion for Faithful Reconstruction and Editability.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

CMDFusion: Bidirectional Fusion Network With Cross-Modality Knowledge Distillation for LiDAR Semantic Segmentation.
IEEE Robotics Autom. Lett., January, 2024

Edicho: Consistent Image Editing in the Wild.
CoRR, 2024

ModelGrow: Continual Text-to-Video Pre-training with Model Expansion and Language Understanding Enhancement.
CoRR, 2024

SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality.
CoRR, 2024

Learning thin deformable object manipulation with a multi-sensory integrated soft hand.
CoRR, 2024

SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality.
CoRR, 2024

HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts.
CoRR, 2024

Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation.
CoRR, 2024

SkillMimic: Learning Reusable Basketball Skills from Demonstrations.
CoRR, 2024

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions.
CoRR, 2024

Gaussian-Informed Continuum for Physical Property Identification and Simulation.
CoRR, 2024

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling.
CoRR, 2024

LLMs Meet Multimodal Generation and Editing: A Survey.
CoRR, 2024

OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation.
CoRR, 2024

Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts.
CoRR, 2024

ENTED: Enhanced Neural Texture Extraction and Distribution for Reference-based Blind Face Restoration.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

HeadArtist: Text-conditioned 3D Head Generation with Self Score Distillation.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

A Humanoid Robot Dialogue System Architecture Targeting Patient Interview Tasks.
Proceedings of the 33rd IEEE International Conference on Robot and Human Interactive Communication, 2024

Adaptive Domain Learning for Cross-domain Image Denoising.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

GIC: Gaussian-Informed Continuum for Physical Property Identification and Simulation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Cross-Cluster Shifting for Efficient and Effective 3D Object Detection in Autonomous Driving.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SPIRE: Semantic Prompt-Driven Image Restoration.
Proceedings of the Computer Vision - ECCV 2024, 2024

ControlLLM: Augment Language Models with Tools by Searching on Graphs.
Proceedings of the Computer Vision - ECCV 2024, 2024

Storytelling Video Generation with Retrieval Augmentation and Character Consistency.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation.
Proceedings of the Computer Vision - ECCV 2024, 2024

PPAD: Iterative Interactions of Prediction and Planning for End-to-End Autonomous Driving.
Proceedings of the Computer Vision - ECCV 2024, 2024

Learning High-Resolution Vector Representation from Multi-camera Images for 3D Object Detection.
Proceedings of the Computer Vision - ECCV 2024, 2024

Real-Time 3D-Aware Portrait Editing from a Single Image.
Proceedings of the Computer Vision - ECCV 2024, 2024

Robust Depth Enhancement via Polarization Prompt Fusion Tuning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Automatic Controllable Colorization via Imagination.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-Driven Holistic 3D Expression and Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Gaussian Shell Maps for Efficient 3D Human Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

A Diffusion Model with State Estimation for Degradation-Blind Inverse Imaging.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Multitarget Device-Free Localization via Cross-Domain Wi-Fi RSS Training Data and Attentional Prior Fusion.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Robust Reflection Removal With Flash-Only Cues in the Wild.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Defending ChatGPT against jailbreak attack via self-reminders.
Nat. Mac. Intell., December, 2023

ERRA: An Embodied Representation and Reasoning Architecture for Long-Horizon Language-Conditioned Manipulation Tasks.
IEEE Robotics Autom. Lett., June, 2023

Learn to Grasp Via Intention Discovery and Its Application to Challenging Clutter.
IEEE Robotics Autom. Lett., 2023

Deep Video Prior for Video Consistency and Propagation.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

TIP: Text-Driven Image Processing with Semantic and Restoration Instructions.
CoRR, 2023

MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising.
CoRR, 2023

MagicStick: Controllable Video Editing via Control Handle Transformations.
CoRR, 2023

Taming Latent Diffusion Models to See in the Dark.
CoRR, 2023

DeepEMplanner: An End-to-End EM Motion Planner with Iterative Interactions.
CoRR, 2023

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation.
CoRR, 2023

ControlLLM: Augment Language Models with Tools by Searching on Graphs.
CoRR, 2023

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation.
CoRR, 2023

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos.
CoRR, 2023

LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis.
CoRR, 2023

AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

Neural Image Popularity Assessment with Retrieval-augmented Transformer.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Flipbot: Learning Continuous Paper Flipping via Coarse-to-Fine Exteroceptive-Proprioceptive Exploration.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Video Waterdrop Removal via Spatio-Temporal Fusion in Driving Scenes.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Improving Video Super-Resolution with Long-Term Self-Exemplars.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Scene-level Point Cloud Colorization with Semantics-and-geometry-aware Networks.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Bootstrap Motion Forecasting With Self-Consistent Constraints.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning 3D-Aware Image Synthesis with Unknown Pose Distribution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DYNAFED: Tackling Client Data Heterogeneity with Global Dynamics.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Blind Video Deflickering by Neural Filtering with a Flawed Atlas.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Randomized Quantization for Data Agnostic Representation Learning.
CoRR, 2022

Latent Video Diffusion Models for High-Fidelity Video Generation with Arbitrary Lengths.
CoRR, 2022

Robust Federated Learning against both Data Heterogeneity and Poisoning Attack via Aggregation Optimization.
CoRR, 2022

Pretraining is All You Need for Image-to-Image Translation.
CoRR, 2022

Interpreting Class Conditional GANs with Channel Awareness.
CoRR, 2022

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition.
CoRR, 2022

A Well-aligned Dataset for Learning Image Signal Processing on Smartphones from a High-end Camera.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Posters, Vancouver BC Canada, August 7, 2022

Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Composite Photograph Harmonization with Complete Background Cues.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Region-Based Semantic Factorization in GANs.
Proceedings of the International Conference on Machine Learning, 2022

Efficient Point Cloud Segmentation with Geometry-Aware Sparse Networks.
Proceedings of the Computer Vision - ECCV 2022, 2022

3D-Aware Indoor Scene Synthesis with Depth Priors.
Proceedings of the Computer Vision - ECCV 2022, 2022

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images.
Proceedings of the Computer Vision - ECCV 2022, 2022

Optimizing Image Compression via Joint Learning with Denoising.
Proceedings of the Computer Vision - ECCV 2022, 2022

Point Cloud Compression with Sibling Context and Surface Priors.
Proceedings of the Computer Vision - ECCV 2022, 2022

High-Fidelity GAN Inversion for Image Attribute Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Shape from Polarization for Complex Scenes in the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A Categorized Reflection Removal Dataset with Diverse Real-world Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Optimizing Video Prediction via Video Frame Interpolation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Volumetric-based Contact Point Detection for 7-DoF Grasping.
Proceedings of the Conference on Robot Learning, 2022

Restorable Image Operators with Quasi-Invertible Networks.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
DRINet++: Efficient Voxel-as-point Point Cloud Segmentation.
CoRR, 2021

Towards Photorealistic Colorization by Imagination.
CoRR, 2021

Video Super-Resolution with Long-Term Self-Exemplars.
CoRR, 2021

Low-Rank Subspaces in GANs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Enhanced Invertible Encoding for Learned Image Compression.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Stereo Matching by Self-supervision of Multiscopic Vision.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Stereo Waterdrop Removal with Row-wise Dilated Attention.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Joint Depth and Normal Estimation from Real-world Time-of-flight Raw Data.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Internal Video Inpainting by Implicit Long-range Propagation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

IICNet: A Generic Framework for Reversible Image Conversion.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Embedding Novel Views in a Single JPEG Image.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Dual-Camera Super-Resolution with Aligned Attention Modules.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Neural Camera Simulators.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust Reflection Removal With Reflection-Free Flash-Only Cues.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Image Inpainting With External-Internal Learning and Monochromic Bottleneck.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Learning to Predict Vehicle Trajectories with Model-based Planning.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

2020
MFuseNet: Robust Depth Estimation With Learned Multiscopic Fusion.
IEEE Robotics Autom. Lett., 2020

Active Perception with A Monocular Camera for Multiscopic Vision.
CoRR, 2020

Blind Video Temporal Consistency via Deep Video Prior.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Self-supervised Object Tracking with Cycle-consistent Siamese Networks.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Video Depth Estimation by Fusing Flow-to-Depth Proposals.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

PiP: Planning-Informed Trajectory Prediction for Autonomous Driving.
Proceedings of the Computer Vision - ECCV 2020, 2020

Future Video Synthesis With Object Motion Prediction.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Polarized Reflection Removal With Perfect Alignment in the Wild.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Speech Denoising with Deep Feature Losses.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Hiding Video in Audio via Reversible Generative Models.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Seeing Motion in the Dark.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Zoom to Learn, Learn to Zoom.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Fully Automatic Video Colorization With Self-Regularization and Diversity.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Single Image Reflection Separation With Perceptual Losses.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Semi-Parametric Image Synthesis.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Interactive Image Segmentation With Latent Diversity.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Learning to See in the Dark.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Fast Image Processing with Fully-Convolutional Networks.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Photographic Image Synthesis with Cascaded Refinement Networks.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
Dense Monocular Depth Estimation in Complex Dynamic Scenes.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Full Flow: Optical Flow Estimation By Global Optimization over Regular Grids.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Robust Nonrigid Registration by Convex Optimization.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

2014
1-HKUST: Object Detection in ILSVRC 2014.
CoRR, 2014

Fast MRF Optimization with Application to Depth Reconstruction.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Motion-Aware KNN Laplacian for Video Matting.
Proceedings of the IEEE International Conference on Computer Vision, 2013

A Simple Model for Intrinsic Image Decomposition with Depth Cues.
Proceedings of the IEEE International Conference on Computer Vision, 2013

2012
KNN matting.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

