Peng-Tao Jiang

Orcid: 0000-0002-1786-4943

According to our database, Peng-Tao Jiang authored at least 74 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
SEMat: Semantic Enhanced Natural Image Interactive Matting.
IEEE Trans. Circuits Syst. Video Technol., April, 2026

SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing.
CoRR, April, 2026

TS-Attn: Temporal-wise Separable Attention for Multi-Event Video Generation.
CoRR, April, 2026

Anchor Forcing: Anchor Memory and Tri-Region RoPE for Interactive Streaming Video Diffusion.
CoRR, March, 2026

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems.
CoRR, March, 2026

C²FG: Control Classifier-Free Guidance via Score Discrepancy Analysis.
CoRR, March, 2026

FlowConsist: Make Your Flow Consistent with Real Trajectory.
CoRR, February, 2026

Trust but Verify: Adaptive Conditioning for Reference-Based Diffusion Super-Resolution via Implicit Reference Correlation Modeling.
CoRR, February, 2026

Bidirectional Beta-Tuned Diffusion Model.
IEEE Trans. Pattern Anal. Mach. Intell., January, 2026

Bidirectional Noise Injection: Enhancing Diffusion Models via Coordinated Input-Output Perturbation.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Realism Control One-step Diffusion for Real-world Image Super Resolution.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
CameraMaster: Unified Camera Semantic-Parameter Control for Photography Retouching.
CoRR, November, 2025

MagicWorld: Interactive Geometry-driven Video World Exploration.
CoRR, November, 2025

FeRA: Frequency-Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning.
CoRR, November, 2025

VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models.
CoRR, November, 2025

AgeBooth: Controllable Facial Aging and Rejuvenation via Diffusion Models.
CoRR, October, 2025

RED: Robust Event-Guided Motion Deblurring with Modality-Specific Disentangled Representation.
CoRR, September, 2025

Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution.
CoRR, August, 2025

A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models.
CoRR, August, 2025

Q-Ponder: A Unified Training Pipeline for Reasoning-based Visual Quality Assessment.
CoRR, June, 2025

HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions.
CoRR, May, 2025

Any-to-Bokeh: One-Step Video Bokeh via Multi-Plane Image Guided Diffusion.
CoRR, May, 2025

MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on.
CoRR, May, 2025

Photography Perspective Composition: Towards Aesthetic Perspective Recommendation.
CoRR, May, 2025

Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning.
CoRR, May, 2025

M2N2V2: Multi-Modal Unsupervised and Training-free Interactive Segmentation.
CoRR, March, 2025

Towards Training-Free Open-World Segmentation via Image Prompt Foundation Models.
Int. J. Comput. Vis., January, 2025

DepthMaster: Taming Diffusion Models for Monocular Depth Estimation.
CoRR, January, 2025

DSDNet: Raw Domain Demoiréing via Dual Color-Space Synergy.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Learning Adaptive Lighting via Channel-Aware Guidance.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

A Temporal Modeling Framework for Video Pre-Training on Video Instance Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Multi-Task Dense Predictions via Unleashing the Power of Diffusion.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MOERL: When Mixture-Of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

SDMATTE: Grafting Diffusion Models for Interactive Matting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Proxy-Bridged Game Transformer for Interactive Extreme Motion Prediction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Boosting Vision State Space Model with Fractal Scanning.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
RDNeRF: relative depth guided NeRF for dense free view synthesis.
Vis. Comput., March, 2024

Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning.
CoRR, 2024

Learning Adaptive Lighting via Channel-Aware Guidance.
CoRR, 2024

Learning Differential Pyramid Representation for Tone Mapping.
CoRR, 2024

CPA: Camera-pose-awareness Diffusion Transformer for Video Generation.
CoRR, 2024

ClearSR: Latent Low-Resolution Image Embeddings Help Diffusion-Based Real-World Super Resolution Models See Clearer.
CoRR, 2024

ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution.
CoRR, 2024

Towards Natural Image Matting in the Wild via Real-Scenario Prior.
CoRR, 2024

Scalable Visual State Space Model with Fractal Scanning.
CoRR, 2024

Empowering Segmentation Ability to Multi-modal Large Language Models.
CoRR, 2024

Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Non-uniform Timestep Sampling: Towards Faster Diffusion Model Training.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Chain of Visual Perception: Harnessing Multimodal Large Language Models for Zero-shot Camouflaged Object Detection.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Improving Adversarial Energy-Based Model via Diffusion Process.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Beta-Tuned Timestep Diffusion Model.
Proceedings of the Computer Vision - ECCV 2024, 2024

Revisiting Single Image Reflection Removal in the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Multi-Task Dense Prediction via Mixture of Low-Rank Experts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Traffic Scene Parsing Through the TSP6K Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Deeply Explain CNN Via Hierarchical Decomposition.
Int. J. Comput. Vis., May, 2023

Decoupling Degradation and Content Processing for Adverse Weather Image Restoration.
CoRR, 2023

Generalization and Hallucination of Large Vision-Language Models through a Camouflaged Lens.
CoRR, 2023

Towards Training-free Open-world Segmentation via Image Prompting Foundation Models.
CoRR, 2023

PGformer: Proxy-Bridged Game Transformer for Multi-Person Extremely Interactive Motion Prediction.
CoRR, 2023

Segment Anything is A Good Pseudo-label Generator for Weakly Supervised Semantic Segmentation.
CoRR, 2023

Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Online Attention Accumulation for Weakly Supervised Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Attention mechanisms in computer vision: A survey.
Comput. Vis. Media, 2022

L2G: A Simple Local-to-Global Knowledge Transfer Framework for Weakly Supervised Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Delving Deep Into Label Smoothing.
IEEE Trans. Image Process., 2021

LayerCAM: Exploring Hierarchical Class Activation Maps for Localization.
IEEE Trans. Image Process., 2021

Personalized Image Semantic Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2019
Integral Object Mining via Online Attention Accumulation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018
Semantic Edge Detection with Diverse Deep Supervision.
CoRR, 2018

Self-Erasing Network for Integral Object Attention.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

DEL: Deep Embedding Learning for Efficient Image Segmentation.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

