Upload folder using huggingface_hub

README.md CHANGED
@@ -1,12 +1,18 @@
 ---
-base_model:
-- google/medgemma-4b-it
-paper: 2507.05201
-library_name: transformers
 license: other
 license_name: health-ai-developer-foundations
 license_link: https://developers.google.com/health-ai-developer-foundations/terms
+library_name: transformers
 pipeline_tag: image-text-to-text
+extra_gated_heading: Access MedGemma on Hugging Face
+extra_gated_prompt: >-
+  To access MedGemma on Hugging Face, you're required to review and
+  agree to [Health AI Developer Foundation's terms of use](https://developers.google.com/health-ai-developer-foundations/terms).
+  To do this, please ensure you're logged in to Hugging Face and click below.
+  Requests are processed immediately.
+extra_gated_button_content: Acknowledge license
+base_model:
+- google/medgemma-4b-it
 tags:
 - medical
 - unsloth
@@ -16,14 +22,7 @@ tags:
 - pathology
 - ophthalmology
 - chest-x-ray
-extra_gated_heading: Access MedGemma on Hugging Face
-extra_gated_prompt: To access MedGemma on Hugging Face, you're required to review
-  and agree to [Health AI Developer Foundation's terms of use](https://developers.google.com/health-ai-developer-foundations/terms).
-  To do this, please ensure you're logged in to Hugging Face and click below. Requests
-  are processed immediately.
-extra_gated_button_content: Acknowledge license
 ---
-
 <div>
 <p style="margin-top: 0;margin-bottom: 0;">
 <em><a href="https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf">Unsloth Dynamic 2.0</a> achieves superior accuracy & outperforms other leading quants.</em>
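
The two hunks above move the gating configuration (`extra_gated_*`) after the license fields and switch the prompt to a YAML block scalar. In practice, a gated repo like this one can only be downloaded after accepting the terms and authenticating. A minimal sketch, assuming the `huggingface_hub` and `transformers` packages and a Hub account that has already acknowledged the license:

```python
# Minimal access sketch for a gated model repo (assumes the license
# was already accepted on the Hugging Face Hub for this account).
from huggingface_hub import login
from transformers import AutoModelForImageTextToText, AutoProcessor

login()  # prompts for a Hugging Face access token

model_id = "google/medgemma-4b-it"  # base model named in the frontmatter
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)
```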
@@ -236,7 +235,7 @@ See the following Colab notebooks for examples of how to use MedGemma:
 ### Model architecture overview
 
 The MedGemma model is built based on [Gemma 3](https://ai.google.dev/gemma/) and
-uses the same decoder-only transformer architecture as Gemma 3. To read more
+uses the same decoder-only transformer architecture as Gemma 3\. To read more
 about the architecture, consult the Gemma 3 [model
 card](https://ai.google.dev/gemma/docs/core/model_card_3).
 
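
Since MedGemma shares Gemma 3's decoder-only architecture, the quickest sanity check is to inspect the published config rather than the weights. A small sketch (assuming authenticated access to the gated repo; the printed values are expectations, not quoted from the card):

```python
from transformers import AutoConfig

# Inspect the architecture metadata without downloading the weights.
config = AutoConfig.from_pretrained("google/medgemma-4b-it")
print(config.model_type)      # expected to report a Gemma 3 model type
print(config.architectures)   # e.g. ["Gemma3ForConditionalGeneration"]
```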
@@ -301,17 +300,17 @@ health benchmarks.
 | Task and metric | Gemma 3 4B | MedGemma 4B |
 | :---- | :---- | :---- |
 | **Medical image classification** | | |
-| MIMIC CXR\*\* - macro F1 for top 5 conditions | 81.2 | 88.9 |
-| CheXpert CXR - macro F1 for top 5 conditions | 32.6 | 48.1 |
-| CXR14 - macro F1 for 3 conditions | 32.0 | 50.1 |
-| PathMCQA\* (histopathology, internal\*\*) - Accuracy | 37.1 | 69.8 |
-| US-DermMCQA\* - Accuracy | 52.5 | 71.8 |
-| EyePACS\* (fundus, internal) - Accuracy | 14.4 | 64.9 |
+| MIMIC CXR\*\* \- macro F1 for top 5 conditions | 81.2 | 88.9 |
+| CheXpert CXR \- macro F1 for top 5 conditions | 32.6 | 48.1 |
+| CXR14 \- macro F1 for 3 conditions | 32.0 | 50.1 |
+| PathMCQA\* (histopathology, internal\*\*) \- Accuracy | 37.1 | 69.8 |
+| US-DermMCQA\* \- Accuracy | 52.5 | 71.8 |
+| EyePACS\* (fundus, internal) \- Accuracy | 14.4 | 64.9 |
 | **Visual question answering** | | |
-| SLAKE (radiology) - Tokenized F1 | 40.2 | 72.3 |
-| VQA-RAD\*\*\* (radiology) - Tokenized F1 | 33.6 | 49.9 |
+| SLAKE (radiology) \- Tokenized F1 | 40.2 | 72.3 |
+| VQA-RAD\*\*\* (radiology) \- Tokenized F1 | 33.6 | 49.9 |
 | **Knowledge and reasoning** | | | | |
-| MedXpertQA (text + multimodal questions) - Accuracy | 16.4 | 18.8 |
+| MedXpertQA (text \+ multimodal questions) \- Accuracy | 16.4 | 18.8 |
 
 *Internal datasets. US-DermMCQA is described in [Liu (2020, Nature
 medicine)](https://www.nature.com/articles/s41591-020-0842-3), presented as a
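
Two of the metrics in this table are worth unpacking: macro F1 averages the per-condition F1 scores equally regardless of prevalence, and tokenized F1 scores a free-text answer by its token overlap with the reference. A minimal sketch of both, assuming whitespace tokenization and hypothetical labels (the card does not specify the exact tokenizer or label encoding used in its evals):

```python
from collections import Counter

from sklearn.metrics import f1_score


def tokenized_f1(prediction: str, reference: str) -> float:
    """F1 over overlapping tokens between a predicted and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# "left pleural effusion" vs "pleural effusion": P=2/3, R=1.0, F1=0.8
print(tokenized_f1("left pleural effusion", "pleural effusion"))

# Macro F1 averages per-class F1 equally, regardless of class prevalence.
y_true = [0, 0, 1, 1, 2]  # hypothetical per-image condition labels
y_pred = [0, 1, 1, 1, 2]
print(f1_score(y_true, y_pred, average="macro"))
```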
@@ -338,7 +337,7 @@ pre-trained checkpoint with our previous best model for CXR report generation,
 
 | Metric | MedGemma 4B (pre-trained) | MedGemma 4B (tuned for CXR)| PaliGemma 2 3B (tuned for CXR) | PaliGemma 2 10B (tuned for CXR) |
 | :---- | :---- | :---- | :---- | :---- |
-| MIMIC CXR - RadGraph F1 | 29.5 | 30.3 |28.8 | 29.5 |
+| MIMIC CXR \- RadGraph F1 | 29.5 | 30.3 |28.8 | 29.5 |
 
 
 
@@ -380,6 +379,69 @@ Gemma models.
 | EHRQA | 70.9 | 67.6 |
 
 
+### Ethics and safety evaluation
+
+#### Evaluation approach
+
+Our evaluation methods include structured evaluations and internal red-teaming
+testing of relevant content policies. Red-teaming was conducted by a number of
+different teams, each with different goals and human evaluation metrics. These
+models were evaluated against a number of different categories relevant to
+ethics and safety, including:
+
+* **Child safety**: Evaluation of text-to-text and image-to-text prompts
+  covering child safety policies, including child sexual abuse and
+  exploitation.
+* **Content safety:** Evaluation of text-to-text and image-to-text prompts
+  covering safety policies, including harassment, violence and gore, and hate
+  speech.
+* **Representational harms**: Evaluation of text-to-text and image-to-text
+  prompts covering safety policies, including bias, stereotyping, and harmful
+  associations or inaccuracies.
+* **General medical harms:** Evaluation of text-to-text and image-to-text
+  prompts covering safety policies, including information quality and harmful
+  associations or inaccuracies.
+
+In addition to development level evaluations, we conduct "assurance evaluations"
+which are our "arms-length" internal evaluations for responsibility governance
+decision making. They are conducted separately from the model development team,
+to inform decision making about release. High-level findings are fed back to the
+model team, but prompt sets are held out to prevent overfitting and preserve the
+results' ability to inform decision making. Notable assurance evaluation results
+are reported to our Responsibility & Safety Council as part of release review.
+
+#### Evaluation results
+
+For all areas of safety testing, we saw safe levels of performance across the
+categories of child safety, content safety, and representational harms. All
+testing was conducted without safety filters to evaluate the model capabilities
+and behaviors. For text-to-text, image-to-text, and audio-to-text, and across
+both MedGemma model sizes, the model produced minimal policy violations. A
+limitation of our evaluations was that they included primarily English language
+prompts.
+
+## Data card
+
+### Dataset overview
+
+#### Training
+
+The base Gemma models are pre-trained on a large corpus of text and code data.
+MedGemma 4B utilizes a [SigLIP](https://arxiv.org/abs/2303.15343) image encoder
+that has been specifically pre-trained on a variety of de-identified medical
+data, including radiology images, histopathology images, ophthalmology images,
+and dermatology images. Its LLM component is trained on a diverse set of medical
+data, including medical text relevant to radiology images, chest-x rays,
+histopathology patches, ophthalmology images and dermatology images.
+
+#### Evaluation
+
+MedGemma models have been evaluated on a comprehensive set of clinically
+relevant benchmarks, including over 22 datasets across 5 different tasks and 6
+medical image modalities. These include both open benchmark datasets and curated
+datasets, with a focus on expert human evaluations for tasks like CXR report
+generation and radiology VQA.
+
 ### Ethics and safety evaluation
 
 #### Evaluation approach
@@ -524,19 +586,19 @@ consented participants.
   clinical and dermatoscopic) from Australia.
 * **Dermatology dataset 3:** De-identified dataset of non-diseased skin images
   from an internal data collection effort.
-* **Pathology dataset 1:** De-identified dataset of histopathology H&E whole
+* **Pathology dataset 1:** De-identified dataset of histopathology H\&E whole
   slide images created in collaboration with an academic research hospital and
   biobank in Europe. Comprises de-identified colon, prostate, and lymph nodes.
-* **Pathology dataset 2:** De-identified dataset of lung histopathology H&E
+* **Pathology dataset 2:** De-identified dataset of lung histopathology H\&E
   and IHC whole slide images created by a commercial biobank in the United
   States.
 * **Pathology dataset 3:** De-identified dataset of prostate and lymph node
-  H&E and IHC histopathology whole slide images created by a contract
+  H\&E and IHC histopathology whole slide images created by a contract
   research organization in the United States.
 * **Pathology dataset 4:** De-identified dataset of histopathology whole slide
   images created in collaboration with a large, tertiary teaching hospital in
   the United States. Comprises a diverse set of tissue and stain types,
-  predominantly H&E.
+  predominantly H\&E.
 * **EHR dataset 1:** Question/answer dataset drawn from synthetic FHIR records
   created by [Synthea.](https://synthetichealth.github.io/synthea/) The test
   set includes 19 unique patients with 200 questions per patient divided into
@@ -549,7 +611,7 @@ consented participants.
   [https://physionet.org/content/mimic-cxr/2.1.0/](https://physionet.org/content/mimic-cxr/2.1.0/)
   *and* Johnson, Alistair E. W., Tom J. Pollard, Seth J. Berkowitz, Nathaniel
   R. Greenbaum, Matthew P. Lungren, Chih-Ying Deng, Roger G. Mark, and Steven
-  Horng. 2019. "MIMIC-CXR, a de-Identified Publicly Available Database of
+  Horng. 2019\. "MIMIC-CXR, a de-Identified Publicly Available Database of
   Chest Radiographs with Free-Text Reports." *Scientific Data 6* (1): 1–8.
 
 * **SLAKE:** Liu, Bo, Li-Ming Zhan, Li Xu, Lin Ma, Yan Yang, and Xiao-Ming Wu.
@@ -559,10 +621,10 @@ consented participants.
 
 * **PAD-UFES-20:** Pacheco, Andre GC, et al. "PAD-UFES-20: A skin lesion
   dataset composed of patient data and clinical images collected from
-  smartphones." *Data in brief* 32 (2020): 106221.
+  smartphones." *Data in brief* 32 (2020): 106221\.
 
 * **SCIN:** Ward, Abbi, Jimmy Li, Julie Wang, Sriram Lakshminarasimhan, Ashley
-  Carrick, Bilson Campana, Jay Hartford, et al. 2024. "Creating an Empirical
+  Carrick, Bilson Campana, Jay Hartford, et al. 2024\. "Creating an Empirical
   Dermatology Dataset Through Crowdsourcing With Web Search Advertisements."
   *JAMA Network Open 7* (11): e2446615–e2446615.
 
@@ -572,7 +634,7 @@ consented participants.
 
 * **CAMELYON16:** Ehteshami Bejnordi, Babak, Mitko Veta, Paul Johannes van
   Diest, Bram van Ginneken, Nico Karssemeijer, Geert Litjens, Jeroen A. W. M.
-  van der Laak, et al. 2017. "Diagnostic Assessment of Deep Learning
+  van der Laak, et al. 2017\. "Diagnostic Assessment of Deep Learning
   Algorithms for Detection of Lymph Node Metastases in Women With Breast
   Cancer." *JAMA 318* (22): 2199–2210.
 
@@ -581,22 +643,22 @@ consented participants.
   10.17632/t9ndx37v5h.1
 
 * **VQA-RAD:** Lau, Jason J., Soumya Gayen, Asma Ben Abacha, and Dina
-  Demner-Fushman. 2018. "A Dataset of Clinically Generated Visual Questions
+  Demner-Fushman. 2018\. "A Dataset of Clinically Generated Visual Questions
   and Answers about Radiology Images." *Scientific Data 5* (1): 1–10.
 
 * **Chest ImaGenome:** Wu, J., Agu, N., Lourentzou, I., Sharma, A., Paguio,
   J., Yao, J. S., Dee, E. C., Mitchell, W., Kashyap, S., Giovannini, A., Celi,
   L. A., Syeda-Mahmood, T., & Moradi, M. (2021). Chest ImaGenome Dataset
-  (version 1.0.0). PhysioNet. RRID:SCR_007345.
+  (version 1.0.0). PhysioNet. RRID:SCR\_007345.
   [https://doi.org/10.13026/wv01-y230](https://doi.org/10.13026/wv01-y230)
 
 * **MedQA:** Jin, Di, Eileen Pan, Nassim Oufattole, Wei-Hung Weng, Hanyi Fang,
-  and Peter Szolovits. 2020. "What Disease Does This Patient Have? A
+  and Peter Szolovits. 2020\. "What Disease Does This Patient Have? A
   Large-Scale Open Domain Question Answering Dataset from Medical Exams."
   [http://arxiv.org/abs/2009.13081](http://arxiv.org/abs/2009.13081).
 
 * **AfrimedQA:** Olatunji, Tobi, Charles Nimo, Abraham Owodunni, Tassallah
-  Abdullahi, Emmanuel Ayodele, Mardhiyah Sanni, Chinemelu Aka, et al. 2024.
+  Abdullahi, Emmanuel Ayodele, Mardhiyah Sanni, Chinemelu Aka, et al. 2024\.
   "AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering
   Benchmark Dataset."
   [http://arxiv.org/abs/2411.15640](http://arxiv.org/abs/2411.15640).
@@ -607,46 +669,10 @@ consented participants.
   [https://arxiv.org/abs/2404.05590](https://arxiv.org/abs/2404.05590)
 
 * **MedXpertQA:** Zuo, Yuxin, Shang Qu, Yifei Li, Zhangren Chen, Xuekai Zhu,
-  Ermo Hua, Kaiyan Zhang, Ning Ding, and Bowen Zhou. 2025. "MedXpertQA:
+  Ermo Hua, Kaiyan Zhang, Ning Ding, and Bowen Zhou. 2025\. "MedXpertQA:
   Benchmarking Expert-Level Medical Reasoning and Understanding."
   [http://arxiv.org/abs/2501.18362](http://arxiv.org/abs/2501.18362).
 
-* **HealthSearchQA:** This dataset consists of consisting of 3,173 commonly searched consumer
-  questions
-
-In addition to the public datasets listed above, MedGemma was also trained on
-de-identified, licensed datasets or datasets collected internally at Google from
-consented participants.
-
-* **Radiology dataset 1:** De-identified dataset of different CT studies
-  across body parts from a US-based radiology outpatient diagnostic center
-  network.
-* **Ophthalmology dataset 1 (EyePACS):** De-identified dataset of fundus
-  images from diabetic retinopathy screening.
-* **Dermatology dataset 1:** De-identified dataset of teledermatology skin
-  condition images (both clinical and dermatoscopic) from Colombia.
-* **Dermatology dataset 2:** De-identified dataset of skin cancer images (both
-  clinical and dermatoscopic) from Australia.
-* **Dermatology dataset 3:** De-identified dataset of non-diseased skin images
-  from an internal data collection effort.
-* **Pathology dataset 1:** De-identified dataset of histopathology H&E whole
-  slide images created in collaboration with an academic research hospital and
-  biobank in Europe. Comprises de-identified colon, prostate, and lymph nodes.
-* **Pathology dataset 2:** De-identified dataset of lung histopathology H&E
-  and IHC whole slide images created by a commercial biobank in the United
-  States.
-* **Pathology dataset 3:** De-identified dataset of prostate and lymph node
-  H&E and IHC histopathology whole slide images created by a contract
-  research organization in the United States.
-* **Pathology dataset 4:** De-identified dataset of histopathology whole slide
-  images created in collaboration with a large, tertiary teaching hospital in
-  the United States. Comprises a diverse set of tissue and stain types,
-  predominantly H&E.
-* **EHR dataset 1:** Question/answer dataset drawn from synthetic FHIR records
-  created by [Synthea.](https://synthetichealth.github.io/synthea/) The test
-  set includes 19 unique patients with 200 questions per patient divided into
-  10 different categories.
-
 ### De-identification/anonymization:
 
 Google and its partners utilize datasets that have been rigorously anonymized or
@@ -716,7 +742,7 @@ of multiple images.
 MedGemma has not been evaluated or optimized for multi-turn applications.
 
 MedGemma's training may make it more sensitive to the specific prompt used than
-Gemma 3.
+Gemma 3\.
 
 When adapting MedGemma, developers should consider the following:
 
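
Given that prompt sensitivity, it helps to fix an explicit system instruction and message format before evaluating an adaptation. A hypothetical sketch using the `transformers` image-text-to-text pipeline (the prompt wording and image URL are illustrative, not taken from this card):

```python
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="google/medgemma-4b-it")

# Illustrative prompt only; MedGemma outputs are not clinical advice.
messages = [
    {"role": "system",
     "content": [{"type": "text", "text": "You are an expert radiologist."}]},
    {"role": "user",
     "content": [
         {"type": "image", "url": "https://example.com/chest_xray.png"},
         {"type": "text", "text": "Describe the findings on this chest X-ray."},
     ]},
]

out = pipe(text=messages, max_new_tokens=200)
print(out[0]["generated_text"][-1]["content"])
```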