Zhihui Xie

Affiliations:
  • University of Hong Kong, Hong Kong
  • Shanghai Jiao Tong University, China (former)


According to our database, Zhihui Xie authored at least 19 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Dream 7B: Diffusion Large Language Models.
CoRR, August, 2025

Teaching Language Models to Critique via Reinforcement Learning.
CoRR, February, 2025

Jailbreaking as a Reward Misspecification Problem.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Toward joint utilization of absolute and relative bandit feedback for conversational recommendation.
User Model. User Adapt. Interact., November, 2024

VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models.
CoRR, 2024

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models.
CoRR, 2024

Learning Versatile Skills with Curriculum Masking.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Calibrating Reasoning in Language Models with Internal Consistency.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Silkie: Preference Distillation for Large Visual Language Models.
CoRR, 2023

Future-conditioned Unsupervised Pretraining for Decision Transformer.
CoRR, 2023

Future-conditioned Unsupervised Pretraining for Decision Transformer.
Proceedings of the International Conference on Machine Learning, 2023

2022
Pretraining in Deep Reinforcement Learning: A Survey.
CoRR, 2022

Knowledge-aware Conversational Preference Elicitation with Bandit Feedback.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

Dynamics-Aware Adaptation for Reinforcement Learning Based Cross-Domain Interactive Recommendation.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
Comparison-based Conversational Recommender System with Relative Bandit Feedback.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

2020
Layered Neighborhood Expansion for Incremental Multiple Graph Matching.
Proceedings of the Computer Vision - ECCV 2020, 2020

