Ruiyi Zhang

ORCID: 0000-0002-8157-6364

Affiliations:
  • Adobe Research, San Jose, CA, USA
  • Duke University, Department of Computer Science, Durham, NC, USA (PhD 2021)


According to our database, Ruiyi Zhang authored at least 119 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation.
CoRR, July, 2025

Lizard: An Efficient Linearization Framework for Large Language Models.
CoRR, July, 2025

A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality.
CoRR, July, 2025

CaughtCheating: Is Your MLLM a Good Cheating Detective? Exploring the Boundary of Visual Perception and Reasoning.
CoRR, July, 2025

A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations.
CoRR, May, 2025

CachePrune: Neural-Based Attribution Defense Against Indirect Prompt Injection Attacks.
CoRR, April, 2025

Towards Visual Text Grounding of Multimodal Large Language Model.
CoRR, April, 2025

Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models.
CoRR, March, 2025

GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration.
CoRR, January, 2025

Personalization of Large Language Models: A Survey.
Trans. Mach. Learn. Res., 2025

Personalizing Data Delivery: Investigating User Characteristics and Enhancing LLM Predictions.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2025, 2025

ARTIST: Improving the Generation of Text-Rich Images with Disentangled Diffusion Models and Large Language Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

A High-Quality Text-Rich Image Instruction Tuning Dataset via Hybrid Instruction Generation.
Proceedings of the 31st International Conference on Computational Linguistics, 2025



Numerical Pruning for Efficient Autoregressive Models.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
GUI Agents: A Survey.
CoRR, 2024

Numerical Pruning for Efficient Autoregressive Models.
CoRR, 2024

SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner.
CoRR, 2024

Personalized Multimodal Large Language Models: A Survey.
CoRR, 2024

Optimizing Data Delivery: Insights from User Preferences on Visuals, Tables, and Text.
CoRR, 2024

DynaSaur: Large Language Agents Beyond Predefined Actions.
CoRR, 2024

LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding.
CoRR, 2024

Survey of User Interface Design and Interaction Techniques in Generative AI Applications.
CoRR, 2024

A Survey of Small Language Models.
CoRR, 2024

Taipan: Efficient and Expressive State Space Language Models with Selective Attention.
CoRR, 2024

VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use.
CoRR, 2024

Federated Large Language Models: Current Progress and Future Directions.
CoRR, 2024

Visual Prompting in Multimodal Large Language Models: A Survey.
CoRR, 2024

A Multi-LLM Debiasing Framework.
CoRR, 2024

Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models.
CoRR, 2024

MMR: Evaluating Reading Ability of Large Multimodal Models.
CoRR, 2024

LLaVA-Read: Enhancing Reading Ability of Multimodal Language Models.
CoRR, 2024

KaPQA: Knowledge-Augmented Product Question-Answering.
CoRR, 2024

LongLaMP: A Benchmark for Personalized Long-form Text Generation.
CoRR, 2024

ARTIST: Improving the Generation of Text-rich Images by Disentanglement.
CoRR, 2024

Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation.
CoRR, 2024

Augmenting Textual Generation via Topology Aware Retrieval.
CoRR, 2024

Improve Temporal Awareness of LLMs for Sequential Recommendation.
CoRR, 2024

Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models.
CoRR, 2024

Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes.
CoRR, 2024

Bias and Fairness in Large Language Models: A Survey.
Comput. Linguistics, 2024

Personalized Federated Learning for Text Classification with Gradient-Free Prompt Tuning.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Self-Cleaning: Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

ADOPD: A Large-Scale Document Page Decomposition Dataset.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Towards Building the FederatedGPT: Federated Instruction Tuning.
Proceedings of the IEEE International Conference on Acoustics, 2024

TextLap: Customizing Language Models for Text-to-Layout Planning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer in Prompt Tuning.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Customization Assistant for Text-to-image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TRINS: Towards Multimodal Language Models that Can Read.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Topology-aware Retrieval Augmentation for Text Generation.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

Knowledge Graph Prompting for Multi-Document Question Answering.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding.
CoRR, 2023

Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information.
CoRR, 2023

Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances.
CoRR, 2023

AutoDAN: Automatic and Interpretable Adversarial Attacks on Large Language Models.
CoRR, 2023

LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding.
CoRR, 2023

Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach.
CoRR, 2023

Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer.
CoRR, 2023

Towards Building the Federated GPT: Federated Instruction Tuning.
CoRR, 2023

InfoPrompt: Information-Theoretic Soft Prompt Tuning for Natural Language Understanding.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Navigational Visual Representations with Semantic Map Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

A Critical Analysis of Document Out-of-Distribution Detection.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Understanding Demonstration-based Learning from a Causal Perspective.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

Federated Domain Adaptation for Named Entity Recognition via Distilling with Heterogeneous Tag Sets.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Few-Shot Composition Learning for Image Retrieval with Prompt Tuning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Wasserstein Uncertainty Estimation for Adversarial Domain Matching.
Frontiers Big Data, 2022

Dynamics-Aware Adaptation for Reinforcement Learning Based Cross-Domain Interactive Recommendation.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

CIPhy: Causal Intervention with Physical Confounder from IoT Sensor Data for Robust Occupant Information Inference.
Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems, 2022

Federated Non-negative Matrix Factorization for Short Texts Topic Modeling with Mutual Information.
Proceedings of the International Joint Conference on Neural Networks, 2022

Context-aware Information-theoretic Causal De-biasing for Interactive Sequence Labeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Towards Language-Free Training for Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning Adaptive Axis Attentions in Fine-tuning: Beyond Fixed Sparse Attention Patterns.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Few-Shot Class-Incremental Learning for Named Entity Recognition.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

TiGAN: Text-Based Interactive Image Generation and Manipulation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Text-Based Interactive Recommendation via Offline Reinforcement Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Towards Uncertainty and Efficiency in Reinforcement Learning.
PhD thesis, 2021

LAFITE: Towards Language-Free Training for Text-to-Image Generation.
CoRR, 2021

SDA: Improving Text Generation with Self Data Augmentation.
CoRR, 2021

Reinforcement Learning for Flexibility Design Problems.
CoRR, 2021

Improving Zero-Shot Voice Style Transfer via Disentangled Representation Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

Unsupervised Abstractive Dialogue Summarization for Tete-a-Tetes.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Influence Diagram Bandits: Variational Thompson Sampling for Structured Bandit Problems.
CoRR, 2020

Reward Constrained Interactive Recommendation with Natural Language Feedback.
CoRR, 2020

Figure Captioning with Relation Maps for Reasoning.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Graphical Models Meet Bandits: A Variational Thompson Sampling Approach.
Proceedings of the 37th International Conference on Machine Learning, 2020

Bayesian Meta Sampling for Fast Uncertainty Adaptation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Semantic Matching via Optimal Partial Transport.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Improving Text Generation with Student-Forcing Optimal Transport.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Stochastic Particle-Optimization Sampling and the Non-Asymptotic Convergence Theory.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Nested-Wasserstein Self-Imitation Learning for Sequence Generation.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Improving Adversarial Text Generation by Modeling the Distant Future.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Collaborative Filtering with A Synthetic Feedback Loop.
CoRR, 2019

Figure Captioning with Reasoning and Sequence-Level Training.
CoRR, 2019

Topic-Guided Variational Autoencoders for Text Generation.
CoRR, 2019

Scalable Thompson Sampling via Optimal Transport.
CoRR, 2019

Text-Based Interactive Recommendation via Constraint-Augmented Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Improving Textual Network Learning with Variational Homophilic Embeddings.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Topic-Guided Variational Auto-Encoder for Text Generation.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Vision-Language Recommendation via Attribute Augmented Multimodal Reinforcement Learning.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Variational Annealing of GANs: A Langevin Perspective.
Proceedings of the 36th International Conference on Machine Learning, 2019

Understanding and Accelerating Particle-Based Variational Inference.
Proceedings of the 36th International Conference on Machine Learning, 2019

Improving Sequence-to-Sequence Learning via Optimal Transport.
Proceedings of the 7th International Conference on Learning Representations, 2019

Neural caption generation over figures.
Proceedings of the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2019 ACM International Symposium on Wearable Computers, 2019

Scalable Thompson Sampling via Optimal Transport.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Sequence Generation with Guider Network.
CoRR, 2018

Accelerated First-order Methods on the Wasserstein Space for Bayesian Inference.
CoRR, 2018

A Unified Particle-Optimization Framework for Scalable Bayesian Sampling.
Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

Adversarial Text Generation via Feature-Mover's Distance.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Policy Optimization as Wasserstein Gradient Flows.
Proceedings of the 35th International Conference on Machine Learning, 2018

Variational Inference and Model Selection with Generalized Evidence Bounds.
Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Structural Weight Uncertainty for Sequential Decision-Making.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

