On May 13, 2025, OpenAI launched HealthBench, a global open-source dataset built to evaluate how well AI models respond to healthcare-related questions. Developed with input from 262 physicians across 60 countries, it is a significant step toward safer, more reliable AI in medicine.
BulletsIn
- HealthBench launched by OpenAI on May 13, 2025
- Tests AI’s ability to answer real-world medical questions
- Dataset includes 5,000 simulated doctor-patient conversations
- Created with help from 262 physicians from 60 countries
- Evaluation rubrics written and weighted by doctors
- GPT-4.1 used to score model responses
- Assesses accuracy, safety, and medical reasoning
- Dataset is open-source and publicly accessible
- Aims to benchmark AI models for health-related use
- Major advancement in trustworthy AI healthcare tools
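
To make the idea of doctor-weighted rubrics concrete, here is a minimal Python sketch of how a rubric-based score could be computed. It is an illustration under stated assumptions, not OpenAI's implementation: the `Criterion` class, the `rubric_score` function, the example criteria, and the point values are all hypothetical. The per-criterion judgment, which in HealthBench is made by GPT-4.1, is hard-coded here as a boolean, and normalizing by the positive points and clipping to the range 0 to 1 is an assumed design choice.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    description: str  # behavior the rubric rewards or penalizes
    points: int       # physician-assigned weight; negative for harmful behavior
    met: bool         # grader's verdict (hard-coded here, a model grader in practice)


def rubric_score(criteria: list[Criterion]) -> float:
    """Return earned points as a fraction of achievable points, clipped to [0, 1]."""
    # Only positive criteria count toward the maximum achievable score.
    max_points = sum(c.points for c in criteria if c.points > 0)
    if max_points == 0:
        return 0.0
    # Met criteria add their points; met negative criteria subtract.
    earned = sum(c.points for c in criteria if c.met)
    return min(1.0, max(0.0, earned / max_points))


# Hypothetical rubric for one conversation about chest pain.
example = [
    Criterion("Advises seeking emergency care for chest pain", 10, True),
    Criterion("Asks how long the symptoms have lasted", 5, False),
    Criterion("Recommends a prescription dose without context", -8, False),
]

print(rubric_score(example))
```

In this sketch the response earns 10 of 15 achievable points. Triggering the negative criterion would subtract 8 points, which shows how weighting lets physicians penalize unsafe answers more heavily than they reward minor details.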



