Jianlong Fu

Orcid: 0000-0002-1025-2012

According to our database, Jianlong Fu authored at least 141 papers between 2010 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Latent Policy Steering through One-Step Flow Policies.

CoRR, March, 2026

2025

LoLA: Long Horizon Latent Action Learning for General Robot Manipulation.

CoRR, December, 2025

MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents.

CoRR, November, 2025

LatBot: Distilling Universal Latent Actions for Vision-Language-Action Models.

CoRR, November, 2025

Beyond Success: Refining Elegant Robot Manipulation from Mixed-Quality Data via Just-in-Time Intervention.

CoRR, November, 2025

TwinVLA: Data-Efficient Bimanual Manipulation with Twin Single-Arm Vision-Language-Action Models.

CoRR, November, 2025

Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation.

Int. J. Comput. Vis., July, 2025

Learning position-aware implicit neural network for real-world face inpainting.

Bo Zhao

Huan Yang

Jianlong Fu

Pattern Recognit., 2025

Transferring Foundation Models for Generalizable Robotic Manipulation.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

RoLD: Robot Latent Diffusion for Multi-task Policy Modeling.

Proceedings of the MultiMedia Modeling, 2025

2024

Prompt-Based Modality Bridging for Unified Text-to-Face Generation and Manipulation.

ACM Trans. Multim. Comput. Commun. Appl., December, 2024

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation.

CoRR, 2024

Multi-task Manipulation Policy Modeling with Visuomotor Latent Diffusion.

CoRR, 2024

Spatiotemporal Predictive Pre-training for Robotic Motor Control.

CoRR, 2024

Learning Position-Aware Implicit Neural Network for Real-World Face Inpainting.

Bo Zhao

Huan Yang

Jianlong Fu

CoRR, 2024

PromptFix: You Prompt and We Fix the Photo.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

ViCo: Engaging Video Comment Generation with Human Preference Rewards.

Proceedings of the 6th ACM International Conference on Multimedia in Asia, 2024

Solving Diffusion ODEs with Optimal Boundary Conditions for Better Image Super-Resolution.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Revisiting Generative Adversarial Network for Downstream Task of Speech Recognition.

Sheng Li

Bei Liu

Jianlong Fu

Proceedings of the IEEE Gaming, Entertainment, and Media Conference, 2024

Zero-Reference Low-Light Enhancement via Physical Quadruple Priors.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Capture Concept Through Comparison: Vision-and-Language Representation Learning with Intrinsic Information Mining.

Proceedings of the Computer Vision - ACCV 2024, 2024

2023

Learning Degradation-Robust Spatiotemporal Frequency-Transformer for Video Super-Resolution.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Aggregated Contextual Transformations for High-Resolution Image Inpainting.

IEEE Trans. Vis. Comput. Graph., July, 2023

Weakly-supervised pre-training for 3D human pose estimation via perspective knowledge.

Pattern Recognit., July, 2023

Meta Attention-Generation Network for Cross-Granularity Few-Shot Learning.

Int. J. Comput. Vis., May, 2023

Language-Guided Face Animation by Recurrent StyleGAN-Based Generator.

IEEE Trans. Multim., 2023

4D LUT: Learnable Context-Aware 4D Lookup Table for Image Enhancement.

IEEE Trans. Image Process., 2023

TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation.

IEEE Trans. Image Process., 2023

Cyclic Differentiable Architecture Search.

IEEE Trans. Pattern Anal. Mach. Intell., 2023

Pave the Way to Grasp Anything: Transferring Foundation Models for Universal Pick-Place Robots.

CoRR, 2023

AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation.

CoRR, 2023

VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation.

CoRR, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.

CoRR, 2023

Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation.

CoRR, 2023

MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Alignment.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

The First Visual Object Tracking Segmentation VOTS2023 Challenge Results.

Kannappan Palaniappan

Norbert Scherer-Negenborn

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SINC: Self-Supervised In-Context Learning for Vision-Language Tasks.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Anchor-Based Detection for Natural Language Localization in Ego-Centric Videos.

Proceedings of the IEEE International Conference on Consumer Electronics, 2023

MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Multi-Scale 2D Temporal Adjacency Networks for Moment Localization With Natural Language.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Guest Editorial: Introduction to the Special Section on Fine-Grained Visual Categorization.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Learning Spatiotemporal Frequency-Transformer for Low-Quality Video Super-Resolution.

CoRR, 2022

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment.

CoRR, 2022

Exploring Anchor-based Detection for Ego4D Natural Language Query.

CoRR, 2022

Degradation-Guided Meta-Restoration Network for Blind Super-Resolution.

CoRR, 2022

Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

TinyViT: Fast Pretraining Distillation for Small Vision Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution.

Proceedings of the Computer Vision - ECCV 2022, 2022

Expanding Language-Image Pretrained Models for General Video Recognition.

Proceedings of the Computer Vision - ECCV 2022, 2022

GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training.

Proceedings of the Computer Vision - ECCV 2022, 2022

MiniViT: Compressing Vision Transformers with Weight Multiplexing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning Trajectory-Aware Transformer for Video Super-Resolution.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Fine-Grained Image Style Transfer with Visual Transformers.

Proceedings of the Computer Vision - ACCV 2022, 2022

2021

Reference-Based Defect Detection Network.

IEEE Trans. Image Process., 2021

Food and Ingredient Joint Learning for Fine-Grained Recognition.

IEEE Trans. Circuits Syst. Video Technol., 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training.

CoRR, 2021

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking.

CoRR, 2021

Tyre pattern image retrieval - current status and challenges.

Connect. Sci., 2021

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Searching the Search Space of Vision Transformer.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Fine-Grained Motion Embedding for Landscape Animation.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding.

Alexander G. Hauptmann

Yong Rui

Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Rethinking and Improving Relative Position Encoding for Vision Transformer.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Domain-Aware Universal Style Transfer.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

AutoFormer: Searching Transformers for Visual Recognition.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Spatio-Temporal Transformer for Visual Tracking.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking.

Minghao Chen

Jianlong Fu

Haibin Ling

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Learning Rich Part Hierarchies With Progressive Attention Networks for Fine-Grained Image Recognition.

IEEE Trans. Image Process., 2020

Revisiting Anchor Mechanisms for Temporal Action Localization.

IEEE Trans. Image Process., 2020

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language.

CoRR, 2020

Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers.

CoRR, 2020

360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Learning Semantic-aware Normalization for Generative Adversarial Networks.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Aesthetic-Aware Image Style Transfer.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Ocean: Object-Aware Anchor-Free Tracking.

Proceedings of the Computer Vision - ECCV 2020, 2020

Learning Joint Spatial-Temporal Transformations for Video Inpainting.

Yanhong Zeng

Jianlong Fu

Hongyang Chao

Proceedings of the Computer Vision - ECCV 2020, 2020

The Eighth Visual Object Tracking VOT2020 Challenge Results.

Joni-Kristian Kämäräinen

Alexander G. Hauptmann

Alireza Memarmoghadam

Álvaro García-Martín

Andreas Robinson

Anton Varfolomieiev

Awet Haileslassie Gebrehiwot

Hari Chandana Kuchibhotla

Hasan Saribas

Heng Fan

Hossein Ghanei-Yakhdan

Rama Krishna Sai Subrahmanyam Gorthi

Seokeon Choi

Seyed Mojtaba Marvasti-Zadeh

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning Texture Transformer Network for Image Super-Resolution.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

DGCN: Dynamic Graph Convolutional Network for Efficient Multi-Person Pose Estimation.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Multi-source Multi-level Attention Networks for Visual Question Answering.

ACM Trans. Multim. Comput. Commun. Appl., 2019

Show, Reward, and Tell: Adversarial Visual Story Generation.

ACM Trans. Multim. Comput. Commun. Appl., 2019

Exploiting hierarchical visual features for visual question answering.

Neurocomputing, 2019

Learning Sparse 2D Temporal Adjacent Networks for Temporal Action Localization.

CoRR, 2019

Learning Rich Image Region Representation for Visual Question Answering.

CoRR, 2019

WSOD^2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection.

CoRR, 2019

Learn to Scale: Generating Multipolar Normalized Density Map for Crowd Counting.

CoRR, 2019

Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos.

Alexander G. Hauptmann

CoRR, 2019

Learning Deep Bilinear Transformation for Fine-grained Image Representation.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

AI Coach: Deep Human Pose Estimation and Analysis for Personalized Athletic Training Assistance.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Emotion Reinforced Visual Storytelling.

Proceedings of the 2019 on International Conference on Multimedia Retrieval, 2019

From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots.

Shizhe Chen

Qin Jin

Jianlong Fu

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Learning Recurrent Structure-Guided Attention Network for Multi-person Pose Estimation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

The Seventh Visual Object Tracking VOT2019 Challenge Results.

Abdelrahman Eldesokey

Rama Krishna Sai Subrahmanyam Gorthi

Alireza Memarmoghadam

Ardhendu Shekhar Tripathi

Arnold W. M. Smeulders

Joni-Kristian Kämäräinen

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

WSOD^2: Learning Bottom-Up and Top-Down Objectness Distillation for Weakly-Supervised Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Image Inspired Poetry Generation in XiaoIce.

CoRR, 2018

DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Networks (with Supplementary Materials).

CoRR, 2018

Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

What Dress Fits Me Best?: Fashion Recommendation on the Clothing Style for Personal Body Shape.

Shintami Chusnul Hidayati

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Deep Attention Neural Tensor Network for Visual Question Answering.

Proceedings of the Computer Vision - ECCV 2018, 2018

DA-GAN: Instance-Level Image Translation by Deep Attention Generative Adversarial Networks.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Show, Reward and Tell: Automatic Generation of Narrative Paragraph From Photo Stream by Adversarial Training.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Self-View Grounding Given a Narrated 360° Video.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Searching Personal Photos on the Phone with Instant Visual Query Suggestion and Joint Text-Image Hashing.

Proceedings of the 2017 ACM on Multimedia Conference, 2017

3D Human Body Reshaping with Anthropometric Modeling.

Yanhong Zeng

Jianlong Fu

Hongyang Chao

Proceedings of the Internet Multimedia Computing and Service, 2017

Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition.

Proceedings of the IEEE International Conference on Computer Vision, 2017

Show, Adapt and Tell: Adversarial Training of Cross-Domain Image Captioner.

Proceedings of the IEEE International Conference on Computer Vision, 2017

Multi-level Attention Networks for Visual Question Answering.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition.

Jianlong Fu

Heliang Zheng

Tao Mei

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Let Your Photos Talk: Generating Narrative Paragraph for Photo Stream via Bidirectional Attention Recurrent Neural Networks.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network.

CoRR, 2016

Beyond Object Recognition: Visual Sentiment Analysis with Deep Coupled Adjective and Noun Neural Networks.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

2015

Image Tag Refinement With View-Dependent Concept Representations.

IEEE Trans. Circuits Syst. Video Technol., 2015

Finding logos in real-world images with point-context representation-based region search.

Jinqiao Wang

Jianlong Fu

Hanqing Lu

Multim. Syst., 2015

Tagging Personal Photos with Transfer Deep Learning.

Proceedings of the 24th International Conference on World Wide Web, 2015

Relaxing from Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging.

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

3D Object Retrieval with Multimodal Views.

Proceedings of the 8th Eurographics Workshop on 3D Object Retrieval, 2015

2014

What Visual Attributes Characterize an Object Class?

Proceedings of the Computer Vision - ACCV 2014, 2014

2012

Point-context descriptor based region search for logo recognition.

Proceedings of the 4th International Conference on Internet Multimedia Computing and Service, 2012

Efficient Clothing Retrieval with Semantic-Preserving Visual Phrases.

Proceedings of the Computer Vision, 2012

2010

Effective logo retrieval with adaptive local feature selection.

Jianlong Fu

Jinqiao Wang

Hanqing Lu

Proceedings of the 18th International Conference on Multimedia 2010, 2010
