Zongxia Li

According to our database, Zongxia Li authored at least 21 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning.

CoRR, October, 2025

Self-Rewarding Vision-Language Model via Reasoning Decomposition.

[...], Chengsong Huang, [...], Jordan L. Boyd-Graber, [...]

CoRR, August, 2025

R-Zero: Self-Evolving Reasoning LLM from Zero Data.

Chengsong Huang, [...]

CoRR, August, 2025

Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation.

[...], Jordan Lee Boyd-Graber

CoRR, June, 2025

VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding.

[...], Jordan Lee Boyd-Graber

CoRR, May, 2025

Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators.

[...], Carlos Rafael Colon, [...], Jordan Lee Boyd-Graber

CoRR, March, 2025

Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of LLMs.

[...], Lorena Calvo-Bartolomé, Alexander Miserlis Hoyle, [...], Juan Francisco Fung, Jordan L. Boyd-Graber

CoRR, February, 2025

Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey.

CoRR, January, 2025

A Survey of State of the Art Large Vision Language Models: Benchmark Evaluations and Challenges.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

Large Language Models Struggle to Describe the Haystack without Human Help: A Social Science-Inspired Evaluation of Topic Models.

[...], Lorena Calvo-Bartolomé, Alexander Miserlis Hoyle, [...], Daniel Kofi Stephens, Juan Francisco Fung, [...], Jordan Lee Boyd-Graber

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

PANDA (Pedantic ANswer-correctness Determination and Adjudication): Improving Automatic Evaluation for Question Answering and Text Generation.

[...], Jordan Lee Boyd-Graber

CoRR, 2024

Beyond Automated Evaluation Metrics: Evaluating Topic Models On Practical Social Science Content Analysis Tasks.

[...], Daniel Kofi Stephens, [...], Jordan L. Boyd-Graber

CoRR, 2024

SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement.

[...], Anandhavelu Natarajan, Aparna Garimella, Jordan L. Boyd-Graber

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

PEDANTS: Cheap but Effective and Interpretable Answer Equivalence.

[...], Jordan L. Boyd-Graber

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Improving the TENOR of Labeling: Re-evaluating Topic Models for Content Analysis.

[...], Daniel Kofi Stephens, [...], Jordan L. Boyd-Graber

Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Do Large Language Models Discriminate in Hiring Decisions on the Basis of Race, Ethnicity, and Gender?

[...], Christabel Acquaye, [...], Rachel Rudinger

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

2023

HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models.

CoRR, 2023

Towards Understanding In-Context Learning with Contrastive Demonstrations and Saliency Maps.

CoRR, 2023

SODAPOP: Open-Ended Discovery of Social Biases in Social Commonsense Reasoning Models.

[...], Rachel Rudinger

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

2021

An Empirical Comparison of the Quadratic Sieve Factoring Algorithm and the Pollard Rho Factoring Algorithm.

[...], William Gasarch

CoRR, 2021
