Jianlong Fu

According to our database, Jianlong Fu authored at least 124 papers between 2010 and 2024.

Bibliography

2024
Learning Position-Aware Implicit Neural Network for Real-World Face Inpainting.
CoRR, 2024

2023
Learning Degradation-Robust Spatiotemporal Frequency-Transformer for Video Super-Resolution.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Aggregated Contextual Transformations for High-Resolution Image Inpainting.
IEEE Trans. Vis. Comput. Graph., July, 2023

Weakly-supervised pre-training for 3D human pose estimation via perspective knowledge.
Pattern Recognit., July, 2023

Meta Attention-Generation Network for Cross-Granularity Few-Shot Learning.
Int. J. Comput. Vis., May, 2023

Language-Guided Face Animation by Recurrent StyleGAN-Based Generator.
IEEE Trans. Multim., 2023

4D LUT: Learnable Context-Aware 4D Lookup Table for Image Enhancement.
IEEE Trans. Image Process., 2023

TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation.
IEEE Trans. Image Process., 2023

Cyclic Differentiable Architecture Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

ViCo: Engaging Video Comment Generation with Human Preference Rewards.
CoRR, 2023

Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots.
CoRR, 2023

AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation.
CoRR, 2023

Solving Diffusion ODEs with Optimal Boundary Conditions for Better Image Super-Resolution.
CoRR, 2023

VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation.
CoRR, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
CoRR, 2023

Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation.
CoRR, 2023

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Alignment.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

The First Visual Object Tracking Segmentation VOTS2023 Challenge Results.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SINC: Self-Supervised In-Context Learning for Vision-Language Tasks.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Anchor-Based Detection for Natural Language Localization in Ego-Centric Videos.
Proceedings of the IEEE International Conference on Consumer Electronics, 2023

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Multi-Scale 2D Temporal Adjacency Networks for Moment Localization With Natural Language.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Guest Editorial: Introduction to the Special Section on Fine-Grained Visual Categorization.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Learning Spatiotemporal Frequency-Transformer for Low-Quality Video Super-Resolution.
CoRR, 2022

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment.
CoRR, 2022

Language-Guided Face Animation by Recurrent StyleGAN-based Generator.
CoRR, 2022

Exploring Anchor-based Detection for Ego4D Natural Language Query.
CoRR, 2022

Degradation-Guided Meta-Restoration Network for Blind Super-Resolution.
CoRR, 2022

Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

TinyViT: Fast Pretraining Distillation for Small Vision Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution.
Proceedings of the Computer Vision - ECCV 2022, 2022

Expanding Language-Image Pretrained Models for General Video Recognition.
Proceedings of the Computer Vision - ECCV 2022, 2022

GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training.
Proceedings of the Computer Vision - ECCV 2022, 2022

MiniViT: Compressing Vision Transformers with Weight Multiplexing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning Trajectory-Aware Transformer for Video Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Fine-Grained Image Style Transfer with Visual Transformers.
Proceedings of the Computer Vision - ACCV 2022, 2022

2021
Reference-Based Defect Detection Network.
IEEE Trans. Image Process., 2021

Food and Ingredient Joint Learning for Fine-Grained Recognition.
IEEE Trans. Circuits Syst. Video Technol., 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training.
CoRR, 2021

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking.
CoRR, 2021

Tyre pattern image retrieval - current status and challenges.
Connect. Sci., 2021

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Searching the Search Space of Vision Transformer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Fine-Grained Motion Embedding for Landscape Animation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Rethinking and Improving Relative Position Encoding for Vision Transformer.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Domain-Aware Universal Style Transfer.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

AutoFormer: Searching Transformers for Visual Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Spatio-Temporal Transformer for Visual Tracking.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Learning Rich Part Hierarchies With Progressive Attention Networks for Fine-Grained Image Recognition.
IEEE Trans. Image Process., 2020

Revisiting Anchor Mechanisms for Temporal Action Localization.
IEEE Trans. Image Process., 2020

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language.
CoRR, 2020

Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers.
CoRR, 2020

360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Learning Semantic-aware Normalization for Generative Adversarial Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Aesthetic-Aware Image Style Transfer.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Ocean: Object-Aware Anchor-Free Tracking.
Proceedings of the Computer Vision - ECCV 2020, 2020

Learning Joint Spatial-Temporal Transformations for Video Inpainting.
Proceedings of the Computer Vision - ECCV 2020, 2020

The Eighth Visual Object Tracking VOT2020 Challenge Results.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020


Learning Texture Transformer Network for Image Super-Resolution.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

DGCN: Dynamic Graph Convolutional Network for Efficient Multi-Person Pose Estimation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Multi-source Multi-level Attention Networks for Visual Question Answering.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Show, Reward, and Tell: Adversarial Visual Story Generation.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Exploiting hierarchical visual features for visual question answering.
Neurocomputing, 2019

Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization.
CoRR, 2019

Learning Rich Image Region Representation for Visual Question Answering.
CoRR, 2019

WSOD^2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection.
CoRR, 2019

Learn to Scale: Generating Multipolar Normalized Density Map for Crowd Counting.
CoRR, 2019

Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos.
CoRR, 2019

Learning Deep Bilinear Transformation for Fine-grained Image Representation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

AI Coach: Deep Human Pose Estimation and Analysis for Personalized Athletic Training Assistance.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Emotion Reinforced Visual Storytelling.
Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019

From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Learning Recurrent Structure-Guided Attention Network for Multi-person Pose Estimation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

The Seventh Visual Object Tracking VOT2019 Challenge Results.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

WSOD2: Learning Bottom-Up and Top-Down Objectness Distillation for Weakly-Supervised Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Image Inspired Poetry Generation in XiaoIce.
CoRR, 2018

DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Networks (with Supplementary Materials).
CoRR, 2018

Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

What Dress Fits Me Best?: Fashion Recommendation on the Clothing Style for Personal Body Shape.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Deep Attention Neural Tensor Network for Visual Question Answering.
Proceedings of the Computer Vision - ECCV 2018, 2018

DA-GAN: Instance-Level Image Translation by Deep Attention Generative Adversarial Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Show, Reward and Tell: Automatic Generation of Narrative Paragraph From Photo Stream by Adversarial Training.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Self-View Grounding Given a Narrated 360° Video.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Searching Personal Photos on the Phone with Instant Visual Query Suggestion and Joint Text-Image Hashing.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

3D Human Body Reshaping with Anthropometric Modeling.
Proceedings of the Internet Multimedia Computing and Service, 2017

Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Show, Adapt and Tell: Adversarial Training of Cross-Domain Image Captioner.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Multi-level Attention Networks for Visual Question Answering.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Let Your Photos Talk: Generating Narrative Paragraph for Photo Stream via Bidirectional Attention Recurrent Neural Networks.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network.
CoRR, 2016

Beyond Object Recognition: Visual Sentiment Analysis with Deep Coupled Adjective and Noun Neural Networks.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

2015
Image Tag Refinement With View-Dependent Concept Representations.
IEEE Trans. Circuits Syst. Video Technol., 2015

Finding logos in real-world images with point-context representation-based region search.
Multim. Syst., 2015

Tagging Personal Photos with Transfer Deep Learning.
Proceedings of the 24th International Conference on World Wide Web, 2015

Relaxing from Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015


2014
What Visual Attributes Characterize an Object Class?
Proceedings of the Computer Vision - ACCV 2014, 2014

2012
Point-context descriptor based region search for logo recognition.
Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Efficient Clothing Retrieval with Semantic-Preserving Visual Phrases.
Proceedings of the Computer Vision, 2012

2010
Effective logo retrieval with adaptive local feature selection.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

