Hao Tan

Affiliations:
  • University of North Carolina, Chapel Hill, NC, USA


According to our database, Hao Tan authored at least 31 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Single-View 3D Human Digitalization with Large Reconstruction Models.
CoRR, 2024

2023
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning.
CoRR, 2023

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction.
CoRR, 2023

DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model.
CoRR, 2023

Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model.
CoRR, 2023

LRM: Large Reconstruction Model for Single Image to 3D.
CoRR, 2023

Scaling Data Generation in Vision-and-Language Navigation.
CoRR, 2023

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning.
CoRR, 2023

Scaling Data Generation in Vision-and-Language Navigation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Navigational Visual Representations with Semantic Map Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

How Much Can CLIP Benefit Vision-and-Language Tasks?
Proceedings of the Tenth International Conference on Learning Representations, 2022

Envedit: Environment Editing for Vision-and-Language Navigation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scientific Chart Summarization: Datasets and Improved Text Modeling.
Proceedings of the Workshop on Scientific Document Understanding co-located with 36th AAAI Conference on Artificial Intelligence, 2022

2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning.
CoRR, 2021

VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Unifying Vision-and-Language Tasks via Text Generation.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Diagnosing the Environment Bias in Vision-and-Language Navigation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Modality-Balanced Models for Visual Dialogue.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

LXMERT: Learning Cross-Modality Encoder Representations from Transformers.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Expressing Visual Relationships via Language.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Object Ordering with Bidirectional Matchings for Visual Reasoning.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Source-Target Inference Models for Spatial Instruction Understanding.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

