Yicong Hong

Orcid: 0000-0002-5068-1508

According to our database¹, Yicong Hong authored at least 30 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Diffusion Transformer-to-Mamba Distillation for High-Resolution Image Generation.

CoRR, June, 2025

Test-Time Training Done Right.

Kalyan Sunkavalli

William T. Freeman

CoRR, May, 2025

VEGGIE: Instructional Editing and Reasoning of Video Concepts with Grounded Generation.

CoRR, March, 2025

REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder.

Aniruddha Mahapatra

CoRR, March, 2025

Pushing the Boundaries of State Space Models for Image and Video Generation.

CoRR, February, 2025

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Progressive Autoregressive Video Diffusion Models.

Arie E. Kaufman

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

2024

SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts.

CoRR, 2024

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats.

CoRR, 2024

Bi-directional Training for Composed Image Retrieval via Text Prompt Learning.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation.

Proceedings of the Robotics: Science and Systems XX, 2024

Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model.

Kalyan Sunkavalli

Greg Shakhnarovich

Proceedings of the Twelfth International Conference on Learning Representations, 2024

LRM: Large Reconstruction Model for Single Image to 3D.

Kalyan Sunkavalli

Proceedings of the Twelfth International Conference on Learning Representations, 2024

NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Augmented Commonsense Knowledge for Remote Object Grounding.

Bahram Mohammadi

Javen Qinfeng Shi

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Scaling Data Generation in Vision-and-Language Navigation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Navigational Visual Representations with Semantic Map Supervision.

Franck Dernoncourt

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022).

CoRR, 2022

HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation.

CoRR, 2022

HOP: History-and-Order Aware Pretraining for Vision-and-Language Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation.

Ming-Hsuan Yang

Anton van den Hengel

CoRR, 2021

Learning structure-aware semantic segmentation with image-level supervision.

Proceedings of the International Joint Conference on Neural Networks, 2021

The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation.

Ming-Hsuan Yang

Anton van den Hengel

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

VLN BERT: A Recurrent Vision-and-Language BERT for Navigation.

Cristian Rodriguez Opazo

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

A Recurrent Vision-and-Language BERT for Navigation.

Cristian Rodriguez Opazo

CoRR, 2020

Language and Visual Entity Relationship Graph for Agent Navigation.

Cristian Rodriguez Opazo

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Sub-Instruction Aware Vision-and-Language Navigation.

Cristian Rodriguez Opazo

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
