Zhengzhong Tu

Orcid: 0000-0002-7594-2292

According to our database, Zhengzhong Tu authored at least 107 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries.

CoRR, November, 2025

FORGE-Tree: Diffusion-Forcing Tree Search for Long-Horizon Robot Manipulation.

CoRR, October, 2025

Background Fades, Foreground Leads: Curriculum-Guided Background Pruning for Efficient Foreground-Centric Collaborative Perception.

CoRR, October, 2025

SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving.

CoRR, October, 2025

LLMs Can Get "Brain Rot"!

CoRR, October, 2025

HeadsUp! High-Fidelity Portrait Image Super-Resolution.

CoRR, October, 2025

Q-Router: Agentic Video Quality Assessment with Expert Model Routing and Artifact Localization.

CoRR, October, 2025

Noisy-Pair Robust Representation Alignment for Positive-Unlabeled Learning.

CoRR, October, 2025

VQualA 2025 Challenge on Image Super-Resolution Generated Content Quality Assessment: Methods and Results.

CoRR, September, 2025

SuperGen: An Efficient Ultra-high-resolution Video Generation System with Sketching and Tiling.

CoRR, August, 2025

CyPortQA: Benchmarking Multimodal Large Language Models for Cyclone Preparedness in Port Operation.

CoRR, August, 2025

AdaRing: Towards Ultra-Light Vision-Language Adaptation via Cross-Layer Tensor Ring Decomposition.

CoRR, August, 2025

KANMixer: Can KAN Serve as a New Modeling Core for Long-term Time Series Forecasting?

CoRR, August, 2025

Edge-Based Multimodal Sensor Data Fusion with Vision Language Models (VLMs) for Real-time Autonomous Vehicle Accident Avoidance.

CoRR, August, 2025

MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding.

CoRR, July, 2025

A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality.

CoRR, July, 2025

4KAgent: Agentic Any Image to 4K Super-Resolution.

CoRR, July, 2025

Automated Vehicles Should be Connected with Natural Language.

CoRR, July, 2025

AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration.

CoRR, June, 2025

DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving.

Mihir Godbole

Xiangbo Gao

Zhengzhong Tu

CoRR, June, 2025

Demystifying the Visual Quality Paradox in Multimodal Large Language Models.

CoRR, June, 2025

SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems.

CoRR, June, 2025

V2X-UniPool: Unifying Multimodal Perception and Knowledge Reasoning for Autonomous Driving.

CoRR, June, 2025

MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning.

CoRR, May, 2025

mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation.

CoRR, May, 2025

DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models.

CoRR, May, 2025

CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation.

CoRR, May, 2025

Simulating the Unseen: Crash Prediction Must Learn from What Did Not Happen.

CoRR, May, 2025

VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction.

CoRR, May, 2025

SounDiT: Geo-Contextual Soundscape-to-Landscape Generation.

CoRR, May, 2025

DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization.

CoRR, May, 2025

Generative AI for Autonomous Driving: Frontiers and Opportunities.

Laurence Tianruo Yang

CoRR, May, 2025

VISTA: Generative Visual Imagination for Vision-and-Language Navigation.

CoRR, May, 2025

NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results.

CoRR, May, 2025

GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution.

CoRR, May, 2025

The Role of Open-Source LLMs in Shaping the Future of GeoAI.

CoRR, April, 2025

NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results.

Mohamed-Chaker Larabi

CoRR, April, 2025

The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report.

CoRR, April, 2025

NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving.

CoRR, April, 2025

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving.

CoRR, March, 2025

Can Large Vision Language Models Read Maps Like a Human?

CoRR, March, 2025

PANDORA: Diffusion Policy Learning for Dexterous Robotic Piano Playing.

Yanjia Huang

Renjie Li

Zhengzhong Tu

CoRR, March, 2025

CoCMT: Communication-Efficient Cross-Modal Transformer for Collaborative Perception.

CoRR, March, 2025

DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning.

CoRR, March, 2025

Generative AI in Transportation Planning: A Survey.

CoRR, March, 2025

Secure On-Device Video OOD Detection Without Backpropagation.

CoRR, March, 2025

V2X-LLM: Enhancing V2X Integration and Understanding in Connected Vehicle Corridors.

CoRR, March, 2025

Complex LLM Planning via Automated Heuristics Discovery.

CoRR, February, 2025

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective.

CoRR, February, 2025

Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization.

CoRR, February, 2025

V2X-ViTv2: Improved Vision Transformers for Vehicle-to-Everything Cooperative Perception.

IEEE Trans. Pattern Anal. Mach. Intell., January, 2025

AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving.

Trans. Mach. Learn. Res., 2025

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models.

Trans. Mach. Learn. Res., 2025

Subjective and Objective Quality Assessment of Banding Artifacts on Compressed Videos.

IEEE Trans. Image Process., 2025

Understanding, detecting, and removing perceptual banding artifacts in compressed videos.

Signal Process. Image Commun., 2025

OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

V2X-DGW: Domain Generalization for Multi-Agent Perception Under Adverse Weather Conditions.

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

4K4DGen: Panoramic 4D Generation at 4K Resolution.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

STAMP: Scalable Task- And Model-agnostic Collaborative Perception.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results.

Mohamed-Chaker Larabi

Nourine Mohammed Nadir

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LangCoop: Collaborative Driving with Language.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

2024

MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers.

IEEE Trans. Image Process., 2024

FAVER: Blind quality prediction of variable frame rate videos.

Signal Process. Image Commun., 2024

Political-LLM: Large Language Models in Political Science.

CoRR, 2024

Video Quality Assessment: A Comprehensive Survey.

CoRR, 2024

Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models.

CoRR, 2024

CoMamba: Real-time Cooperative Perception Unlocked with State Space Models.

CoRR, 2024

AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results.

CoRR, 2024

V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions.

CoRR, 2024

AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results.

Maksim Smirnov

Aleksandr Gushchin

Anastasia Antsiferova

Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

SPIRE: Semantic Prompt-Driven Image Restoration.

Proceedings of the Computer Vision - ECCV 2024, 2024

CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

COVER: A Comprehensive Video Quality Evaluator.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

TIP: Text-Driven Image Processing with Semantic and Restoration Instructions.

CoRR, 2023

Conditional Diffusion Distillation.

CoRR, 2023

Pik-Fix: Restoring and Colorizing Old Photos.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

MULLER: Multilayer Laplacian Resizer for Vision.

Zhengzhong Tu

Peyman Milanfar

Hossein Talebi

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

V2V4Real: A Real-World Large-Scale Dataset for Vehicle-to-Vehicle Cooperative Perception.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Completely Blind Video Quality Evaluator.

IEEE Signal Process. Lett., 2022

Perceptual Quality Assessment of UGC Gaming Videos.

CoRR, 2022

Subjective Quality Assessment of User-Generated Content Gaming Videos.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2022

Blind Video Quality Assessment via Space-Time Slice Statistics.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

No-Reference Quality Assessment of Variable Frame-Rate Videos Using Temporal Bandpass Statistics.

Proceedings of the IEEE International Conference on Acoustics, 2022

V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer.

Proceedings of the Computer Vision - ECCV 2022, 2022

MaxViT: Multi-axis Vision Transformer.

Proceedings of the Computer Vision, 2022

MAXIM: Multi-Axis MLP for Image Processing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers.

Proceedings of the Conference on Robot Learning, 2022

2021

UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content.

IEEE Trans. Image Process., 2021

Predicting Eye Fixations Under Distortion Using Bayesian Observers.

Zhengzhong Tu

CoRR, 2021

RAPIQUE: Rapid and Accurate Video Quality Prediction of User Generated Content.

CoRR, 2021

Efficient User-Generated Video Quality Prediction.

Proceedings of the Picture Coding Symposium, 2021

A Temporal Statistics Model For UGC Video Quality Prediction.

Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Video Quality Assessment of User Generated Content: A Benchmark Study and a New Model.

Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Regression or classification? New methods to evaluate no-reference picture and video quality models.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Adaptive Debanding Filter.

IEEE Signal Process. Lett., 2020

A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment.

Proceedings of the IEEE International Conference on Image Processing, 2020

BBAND Index: A No-Reference Banding Artifact Predictor.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Fitness Done Right: a Real-time Intelligent Personal Trainer for Exercise Correction.

Yun Chen

Yiyue Chen

Zhengzhong Tu

CoRR, 2019

2018

Panoramic video delivery based on Laplace compensation and Sphere-Markov probability model.

Proceedings of the IEEE International Conference on Consumer Electronics, 2018

Content adaptive tiling method based on user access preference for streaming panoramic video.

Proceedings of the IEEE International Conference on Consumer Electronics, 2018
