Hexiang Hu

According to our database, Hexiang Hu authored at least 47 papers between 2016 and 2024.

Bibliography

2024
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions.
CoRR, 2024

Instruct-Imagen: Image Generation with Multi-modal Instruction.
CoRR, 2024

2023
Drinking From a Firehose: Continual Learning With Web-Scale Natural Language.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers.
CoRR, 2023

PaLI-X: On Scaling up a Multilingual Vision and Language Model.
CoRR, 2023

From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Subject-driven Text-to-Image Generation via Apprenticeship Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding.
Proceedings of the International Conference on Machine Learning, 2023

Re-Imagen: Retrieval-Augmented Text-to-Image Generator.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

PreSTU: Pre-Training for Scene-Text Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning.
Int. J. Comput. Vis., 2021

A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.
CoRR, 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Systematic Generalization on gSCAN: What is Nearly Solved and What is Next?
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Visually Grounded Concept Composition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Learning the Best Pooling Strategy for Visual Semantic Embedding.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Structured Label Inference for Visual Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus.
CoRR, 2020

Visual Storytelling via Predicting Anchor Word Embeddings in the Stories.
CoRR, 2020

Learning to Represent Image and Text with Denotation Graph.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Few-Shot Learning via Embedding Adaptation With Set-to-Set Functions.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Learning Classifier Synthesis for Generalized Few-Shot Learning.
CoRR, 2019

Synthesized Policies for Transfer and Adaptation across Tasks and Environments.
CoRR, 2019

Binary Image Selection (BISON): Interpretable Evaluation of Visual Grounding.
CoRR, 2019

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Evaluating Text-to-Image Matching using Binary Image Selection (BISON).
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Engaging Image Captioning via Personality.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Toward Multimodal Model-Agnostic Meta-Learning.
CoRR, 2018

Learning Embedding Adaptation for Few-Shot Learning.
CoRR, 2018

Synthesize Policies for Transfer and Adaptation across Tasks and Environments.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Cross-Modal and Hierarchical Modeling of Video and Text.
Proceedings of the European Conference on Computer Vision (ECCV 2018), 2018

Compressed Video Action Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Learning Answer Embeddings for Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Cross-Dataset Adaptation for Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Multi-Task Learning for Sequence Tagging: An Empirical Study.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

2017
LabelBank: Revisiting Global Perspectives for Semantic Segmentation.
CoRR, 2017

FastMask: Segment Multi-scale Object Candidates in One Shot.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
FastMask: Segment Object Multi-scale Candidates in One Shot.
CoRR, 2016

Recalling Holistic Information for Semantic Segmentation.
CoRR, 2016

Learning Structured Inference Neural Networks with Label Relations.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Structure Inference Machines: Recurrent Neural Networks for Analyzing Relations in Group Activity Recognition.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

