Yushi Hu

Orcid: 0000-0002-7540-2413

According to our database, Yushi Hu authored at least 22 papers between 2020 and 2025.

Bibliography

2025
MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation.
CoRR, May, 2025

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset.
CoRR, May, 2025

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models.
CoRR, April, 2025

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Decoding-Time Language Model Alignment with Multiple Objectives.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

BLINK: Multimodal Large Language Models Can See but Not Perceive.
Proceedings of the Computer Vision - ECCV 2024, 2024

Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation.
Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024

Training Language Models to Generate Text with Citations via Fine-grained Rewards.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Binding Language Models in Symbolic Languages.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

One Embedder, Any Task: Instruction-Finetuned Text Embeddings.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
PromptCap: Prompt-Guided Task-Aware Image Captioning.
CoRR, 2022

Unsupervised Learning of Hierarchical Conversation Structure.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

In-Context Learning for Few-Shot Dialogue State Tracking.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Acoustic Span Embeddings for Multilingual Query-by-Example Search.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

2020
Multilingual Jointly Trained Acoustic and Written Word Embeddings.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

