Yufan Zhou

Affiliations:

Luma AI, San Francisco, CA, USA
Adobe, San Jose, CA, USA
University at Buffalo, Department of Computer Science and Engineering, Buffalo, NY, USA

According to our database, Yufan Zhou authored at least 39 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation.

CoRR, July, 2025

Towards Visual Text Grounding of Multimodal Large Language Model.

CoRR, April, 2025

A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models.

CoRR, February, 2025

ARTIST: Improving the Generation of Text-Rich Images with Disentangled Diffusion Models and Large Language Models.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

TTVD: Towards a Geometric Framework for Test-Time Adaptation Based on Voronoi Diagram.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

A High-Quality Text-Rich Image Instruction Tuning Dataset via Hybrid Instruction Generation.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

Numerical Pruning for Efficient Autoregressive Models.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Numerical Pruning for Efficient Autoregressive Models.

CoRR, 2024

SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner.

CoRR, 2024

LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding.

CoRR, 2024

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models.

CoRR, 2024

MMR: Evaluating Reading Ability of Large Multimodal Models.

CoRR, 2024

LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models.

CoRR, 2024

ARTIST: Improving the Generation of Text-rich Images by Disentanglement.

CoRR, 2024

Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation.

CoRR, 2024

Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

TextLap: Customizing Language Models for Text-to-Layout Planning.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Customization Assistant for Text-to-image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TRINS: Towards Multimodal Language Models that Can Read.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding.

CoRR, 2023

Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach.

CoRR, 2023

Shifted Diffusion for Text-to-image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Lafite2: Few-shot Text-to-Image Generation.

CoRR, 2022

Towards Language-Free Training for Text-to-Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

TiGAN: Text-Based Interactive Image Generation and Manipulation.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

A Generic Approach for Enhancing GANs by Regularized Latent Optimization.

CoRR, 2021

LAFITE: Towards Language-Free Training for Text-to-Image Generation.

CoRR, 2021

Learning High-Dimensional Distributions with Latent Neural Fokker-Planck Kernels.

Yufan Zhou

Changyou Chen

Jinhui Xu

CoRR, 2021

Meta-Learning with Neural Tangent Kernels.

Proceedings of the 9th International Conference on Learning Representations, 2021

MixKD: Towards Efficient Distillation of Large-scale Language Models.

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Graph Neural Networks with Composite Kernels.

CoRR, 2020

Learning Manifold Implicitly via Explicit Heat-Kernel Learning.

Yufan Zhou

Changyou Chen

Jinhui Xu

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Weakly-Supervised Brain Tumor Classification with Global Diagnosis Label.

Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020

Variational Adversarial Kernel Learned Imitation Learning.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

KernelNet: A Data-Dependent Kernel Parameterization for Deep Generative Modeling.

Yufan Zhou

Changyou Chen

Jinhui Xu

CoRR, 2019

2018

Holistic Brain Tumor Screening and Classification Based on DenseNet and Recurrent Neural Network.

Proceedings of the Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, 2018
