Yu Wu

Orcid: 0000-0002-1680-8253

Affiliations:

Princeton University, NJ, USA
University of Technology Sydney, Center for Artificial Intelligence, Australia

According to our database, Yu Wu authored at least 79 papers between 2017 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

CORE: Code-based Inverse Self-Training Framework with Graph Expansion for Virtual Agents.

CoRR, January, 2026

2025

IS-Diff: Improving Diffusion-Based Inpainting with Better Initial Seed.

CoRR, September, 2025

DVIS++: Improved Decoupled Framework for Universal Video Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2025

FADngs: Federated Learning for Anomaly Detection.

IEEE Trans. Neural Networks Learn. Syst., February, 2025

TV-Dialogue: Crafting Theme-Aware Video Dialogues with Immersive Interaction.

CoRR, January, 2025

BNMusic: Blending Environmental Noises into Personalized Music.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Spotlighting Partially Visible Cinematic Language for Video-to-Audio Generation via Self-distillation.

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

D^3: Scaling Up Deepfake Detection by Learning from Discrepancy.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Implicit Bias Injection Attacks against Text-to-Image Diffusion Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Vision + X: A Survey on Multimodal Learning in the Light of Data.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Promote knowledge mining towards open-world semi-supervised learning.

Pattern Recognit., 2024

The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation.

CoRR, 2024

SOWing Information: Cultivating Contextual Coherence with MLLMs in Image Generation.

CoRR, 2024

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding.

Wei Chen

Long Chen

Yu Wu

CoRR, 2024

MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning.

CoRR, 2024

D^3: Scaling Up Deepfake Detection by Learning from Discrepancy.

CoRR, 2024

Toward Real Ultra Image Segmentation: Leveraging Surrounding Context to Cultivate General Segmentation Model.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Iterative Ensemble Training with Anti-gradient Control for Mitigating Memorization in Diffusion Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

An Efficient and Effective Transformer Decoder-Based Framework for Multi-task Visual Grounding.

Wei Chen

Long Chen

Yu Wu

Proceedings of the Computer Vision - ECCV 2024, 2024

Improving Bird's Eye View Semantic Segmentation by Task Decomposition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Omni-Q: Omni-Directional Scene Understanding for Unsupervised Visual Grounding.

Sai Wang

Yutian Lin

Yu Wu

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Symbiotic Attention for Egocentric Action Recognition With Object-Centric Alignment.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Switchable Novel Object Captioner.

Yu Wu

Lu Jiang

Yi Yang

IEEE Trans. Pattern Anal. Mach. Intell., 2023

DETER: Detecting Edited Regions for Deterring Generative Manipulations.

CoRR, 2023

Unseen Image Synthesis with Diffusion Models.

CoRR, 2023

1st Place Solution for the 5th LSVOS Challenge: Video Instance Segmentation.

CoRR, 2023

1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation.

CoRR, 2023

Boundary Guided Mixing Trajectory for Semantic Control with Diffusion Models.

CoRR, 2023

Boundary Guided Learning-Free Semantic Control with Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

DVIS: Decoupled Video Instance Segmentation Framework.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning to Segment Every Referring Object Point by Point.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Identifying Visible Parts via Pose Estimation for Occluded Person Re-Identification.

Jiaxu Miao

Yu Wu

Yi Yang

IEEE Trans. Neural Networks Learn. Syst., 2022

Learning With Noisy Labels via Self-Reweighting From Class Centroids.

IEEE Trans. Neural Networks Learn. Syst., 2022

Saying the Unseen: Video Descriptions via Dialog Agents.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

NAP: Neural architecture search with pruning.

Neurocomputing, 2022

Discrete Contrastive Diffusion for Cross-Modal and Conditional Generation.

CoRR, 2022

Enabling Detailed Action Recognition Evaluation Through Video Dataset Augmentation.

Jihoon Chung

Yu Wu

Olga Russakovsky

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Quantized GAN for Complex Music Generation from Dance Videos.

Proceedings of the Computer Vision - ECCV 2022, 2022

Multi-query Video Retrieval.

Proceedings of the Computer Vision - ECCV 2022, 2022

SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding.

Proceedings of the Computer Vision - ECCV 2022, 2022

Large-scale Video Panoptic Segmentation in the Wild: A Benchmark.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning to Learn by Jointly Optimizing Neural Architecture and Weights.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Multimodal Learning and Video Analysis with Deep Neural Networks

Yu Wu

PhD thesis, 2021

Learning to Anticipate Egocentric Actions by Imagination.

IEEE Trans. Image Process., 2021

Holistic LSTM for Pedestrian Trajectory Prediction.

IEEE Trans. Image Process., 2021

Progressive Transfer Learning for Face Anti-Spoofing.

IEEE Trans. Image Process., 2021

Contrastive Video-Language Segmentation.

CoRR, 2021

Rethinking Cross-modal Interaction from a Top-down Perspective for Referring Video Object Segmentation.

CoRR, 2021

ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation.

CoRR, 2021

Learning Audio-Visual Correlations From Variational Cross-Modal Generation.

Proceedings of the IEEE International Conference on Acoustics, 2021

Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing.

Yu Wu

Yi Yang

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Revisiting EmbodiedQA: A Simple Baseline and Beyond.

Yu Wu

Lu Jiang

Yi Yang

IEEE Trans. Image Process., 2020

Unsupervised Person Re-identification via Cross-Camera Similarity Exploration.

IEEE Trans. Image Process., 2020

Cascaded Revision Network for Novel Object Captioning.

IEEE Trans. Circuits Syst. Video Technol., 2020

Describing Unseen Videos via Multi-modal Cooperative Dialog Agents.

Proceedings of the Computer Vision - ECCV 2020, 2020

Gated Channel Transformation for Visual Recognition.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Unsupervised Person Re-Identification via Softened Similarity Learning.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Symbiotic Attention with Privileged Information for Egocentric Action Recognition.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Progressive Learning for Person Re-Identification With One Example.

IEEE Trans. Image Process., 2019

Improving person re-identification by attribute and identity learning.

Pattern Recognit., 2019

Cascaded Revision Network for Novel Object Captioning.

CoRR, 2019

Baidu-UTS Submission to the EPIC-Kitchens Action Recognition Challenge 2019.

CoRR, 2019

Dual Attention Matching for Audio-Visual Event Localization.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Auto-ReID: Searching for a Part-Aware ConvNet for Person Re-Identification.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Pose-Guided Feature Alignment for Occluded Person Re-Identification.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018

Decoupled Novel Object Captioner.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Improving Person Re-identification by Attribute and Identity Learning.

CoRR, 2017
