Developing and Evaluating AI
to Transform Healthcare

We are a joint UC Berkeley and UCSF multidisciplinary research lab building and evaluating AI for healthcare. We develop methods to bring AI into clinical practice, and benchmarks to measure its real-world impact.

Featured Publications

ER-Reason: A Benchmark Dataset for LLM Clinical Reasoning in the Emergency Room

Nikita Mehandru, Niloufar Golchini, Namrata Garg, Kathy LeSaint, Christopher Nash, Anu Ramachandran, Travis Zack, Liam McCoy, Adam Rodman, David Bamman, Melanie Molina, Ahmed Alaa

arXiv preprint · 2026

Paper Code Data

CheXthought: A Global Multimodal Dataset of Clinical Chain-of-thought Reasoning and Visual Attention for Chest X-ray Interpretation

Sonali Sharma, Jin Long, George Shih, Sarah Eid, Christian Bluethgen, Francine L Jacobson, Emily B Tsai, Ahmed M Alaa, Curtis P Langlotz, Global Radiology Consortium

arXiv preprint · 2026

Paper Data

Position: Medical Large Language Model Benchmarks Should Prioritize Construct Validity

Ahmed Alaa, Thomas Hartvigsen, Niloufar Golchini, Shiladitya Dutta, Frances Dean, Inioluwa Deborah Raji, Travis Zack

ICML 2025 · Oral Presentation

Paper

Evaluating Large Language Models as Agents in the Clinic

Nikita Mehandru, Brenda Miao, Eduardo Rodriguez, Madhumita Sushil, Atul Butte, Ahmed Alaa

NPJ Digital Medicine · 2024

Paper

How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models

Ahmed Alaa, Boris van Breugel, Evgeny Saveliev, Mihaela van der Schaar

ICML 2022

Paper Code

Lab News

Jun 2026

New paper in Nature by Alex on ECG-based biomarkers for sudden cardiac death!
Congratulations to Jenna Fields and Franny Dean on passing the qual exam! 🎉🎓
Congratulations to Jivat Kaur for receiving a Stellar Abstract Award at STAI-X 2026!
Franny Dean presents her work on AI surrogates for causal effect estimation at ASCO!
📣 We're excited to share ER-Reason, a new dataset of 25K+ de-identified clinical notes on PhysioNet. Read our paper!

May 2026

📣 In collaboration with Stanford University, we created CheXthought, a new dataset for training and evaluating medical VLMs! Available here.

Apr 2026

Our lab participated in the Frontiers in CPH conference with various talks and posters!
Two papers accepted at ICML!
Our lab is part of a team awarded a moonshot seed grant from the Laude Institute! 🏆

Mar 2026

We celebrated Alex, Jivat, William, and Zhongyuan's advancement to candidacy! 🎉
Nilo Golchini has been admitted to the UC Berkeley - UCSF Joint Medical Program! Congrats to the first-ever MD student in our lab!
