Danielle S. Bitterman

CoRR, November, 2025

Equitable Survival Prediction: A Fairness-Aware Survival Modeling (FASM) Approach.

William G. La Cava

CoRR, October, 2025

Gender Bias in Large Language Models for Healthcare: Assignment Consistency and Clinical Implications.

Marcus Eng Hock Ong

CoRR, October, 2025

Beyond the Algorithm: A Field Guide to Deploying AI Agents in Clinical Practice.

CoRR, September, 2025

Foundation Artificial Intelligence Models for Health Recognition Using Face Photographs (FAHR-Face).

CoRR, June, 2025

KScope: A Framework for Characterizing the Knowledge Status of Language Models.

Yuxin Xiao

Jack Gallifant

Thomas Hartvigsen

Marzyeh Ghassemi

CoRR, June, 2025

When Models Reason in Your Language: Controlling Thinking Trace Language Comes at the Cost of Accuracy.

Arianna Bisazza

CoRR, May, 2025

MedBrowseComp: Benchmarking Medical Deep Research and Computer Use.

CoRR, May, 2025

Sparse Autoencoder Features for Classifications and Transferability.

CoRR, February, 2025

Regulatory Science Innovation for Generative AI and Large Language Models in Health and Medicine: A Global Call for Action.

Jasmine Chiat Ling Ong

Xiaoxuan Liu

Alastair K. Denniston

CoRR, February, 2025

Preventing unrestricted and unmonitored AI experimentation in healthcare through transparency and accountability.

Donnella S. Comeau

npj Digit. Medicine, 2025

LCD benchmark: long clinical document benchmark on mortality prediction for language models.

Majid Afshar

Juan Carlos Climent Pardo

J. Am. Medical Informatics Assoc., 2025

Collaborative large language models for automated data extraction in living systematic reviews.

J. Am. Medical Informatics Assoc., 2025

WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation.

Jose M. M. Pascual-Leone

Jack Gallifant

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Do They Really Know? Evaluating Large Language Models' Ability to Reference and Cite Oncology Guidelines.

Pietro Belligoli

Carolina Pelegrini Barbosa Gracitelli

Proceedings of the Artificial Intelligence in Medicine - 23rd International Conference, 2025

Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs.

Thao Nguyen Minh Phan

Vincenz Ferrer

Michael G. Morley

Luis Filipe Nakayama

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Large language models to identify social determinants of health in electronic health records.

npj Digit. Medicine, 2024

Ethical debates amidst flawed healthcare artificial intelligence metrics.

Jack Gallifant

npj Digit. Medicine, 2024

Evaluating the ChatGPT family of models for biomedical reasoning and classification.

J. Am. Medical Informatics Assoc., 2024

The use of large language models to enhance cancer clinical trial educational materials.

Lisa Soleymani Lehmann

David E. Kozono

Brian Anthony

Dmitriy Dligach

CoRR, 2024

ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Fei Wang

Kai Shu

CoRR, 2024

Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability.

CoRR, 2024

Mapping Bias in Vision Language Models: Signposts, Pitfalls, and the Road Ahead.

Kuleen Sasse

Jackson Pond

John D. Osborne

CoRR, 2024

Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation.

CoRR, 2024

Safety challenges of AI in medicine.

CoRR, 2024

AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow.

Themistocles L. Assimes

Xin Ma

Lin Lu

Lizhou Fan

CoRR, 2024

Retrieval-Augmented Generation for Generative Artificial Intelligence in Medicine.

Jasmine Chiat Ling Ong

Daniel Shu Wei Ting

CoRR, 2024

Seeds of Stereotypes: A Large-Scale Textual Analysis of Race and Gender Associations with Diseases in Online Sources.

Marcela Aguirre-Jerez

Judy Gichoya

CoRR, 2024

Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data.

CoRR, 2024

Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Seeing Beyond Borders: Evaluating LLMs in Multilingual Ophthalmological Question Answering.

Vincenz Ferrer

Yindalon Aphinyanaphongs

Proceedings of the 12th IEEE International Conference on Healthcare Informatics, 2024

When Raw Data Prevails: Are Large Language Model Embeddings Effective in Numerical Data Representation for Medical Machine Learning Applications?

Matthew M. Churpek

Majid Afshar

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Language Models are Surprisingly Fragile to Drug Names in Biomedical Benchmarks.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023

The impact of using an AI chatbot to respond to patient messages.

CoRR, 2023

Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly.

Sheng Lu

Yingya Li

Guergana Savova

Iryna Gurevych

CoRR, 2023

Considerations for health care institutions training large language models on electronic health records.

Weipeng Zhou

Majid Afshar

CoRR, 2023

Large Language Models to Identify Social Determinants of Health in Electronic Health Records.

CoRR, 2023

Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification.

CoRR, 2023

Natural language processing to automatically extract the presence and severity of esophagitis in notes of patients undergoing radiotherapy.

CoRR, 2023

Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly.

Sheng Lu

Yingya Li

Guergana Savova

Iryna Gurevych

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022

Classifying unstructured electronic consult messages to understand primary care physician specialty information needs.

J. Am. Medical Informatics Assoc., 2022

2021

Deep-learning system to improve the quality and efficiency of volumetric heart segmentation for breast cancer.

npj Digit. Medicine, 2021

2020

Extracting Relations between Radiotherapy Treatment Details.
