danielhanchen committed (verified) · Commit fad4db6 · 1 Parent(s): dfb3f78

Upload folder using huggingface_hub

Files changed (1):
  1. README.md (+98 −72)
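Per the commit message, the folder was pushed with `huggingface_hub`. A minimal sketch of that kind of upload (the repo id and local path below are placeholders, not details recorded in this commit):

```python
from huggingface_hub import HfApi

# Hypothetical repo id and local path; the commit only records that
# huggingface_hub's folder upload was used, not these values.
api = HfApi()
api.upload_folder(
    folder_path="./medgemma-4b-it",        # local folder to push
    repo_id="unsloth/medgemma-4b-it",      # target model repo (assumed)
    repo_type="model",
    commit_message="Upload folder using huggingface_hub",
)
```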
README.md CHANGED
@@ -1,12 +1,18 @@
 ---
-base_model:
-- google/medgemma-4b-it
-paper: 2507.05201
-library_name: transformers
 license: other
 license_name: health-ai-developer-foundations
 license_link: https://developers.google.com/health-ai-developer-foundations/terms
+library_name: transformers
 pipeline_tag: image-text-to-text
+extra_gated_heading: Access MedGemma on Hugging Face
+extra_gated_prompt: >-
+  To access MedGemma on Hugging Face, you're required to review and
+  agree to [Health AI Developer Foundation's terms of use](https://developers.google.com/health-ai-developer-foundations/terms).
+  To do this, please ensure you're logged in to Hugging Face and click below.
+  Requests are processed immediately.
+extra_gated_button_content: Acknowledge license
+base_model:
+- google/medgemma-4b-it
 tags:
 - medical
 - unsloth
@@ -16,14 +22,7 @@ tags:
 - pathology
 - ophthalmology
 - chest-x-ray
-extra_gated_heading: Access MedGemma on Hugging Face
-extra_gated_prompt: To access MedGemma on Hugging Face, you're required to review
-  and agree to [Health AI Developer Foundation's terms of use](https://developers.google.com/health-ai-developer-foundations/terms).
-  To do this, please ensure you're logged in to Hugging Face and click below. Requests
-  are processed immediately.
-extra_gated_button_content: Acknowledge license
 ---
-
 <div>
 <p style="margin-top: 0;margin-bottom: 0;">
 <em><a href="https://docs.unsloth.ai/basics/unsloth-dynamic-v2.0-gguf">Unsloth Dynamic 2.0</a> achieves superior accuracy & outperforms other leading quants.</em>
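The gating metadata moves below `pipeline_tag` and switches `extra_gated_prompt` to a YAML folded block scalar (`>-`), which joins the indented lines with single spaces and drops the trailing newline, so the rendered gate text is unchanged. A quick check of that folding behavior, using PyYAML (the snippet is illustrative, not taken from the repo):

```python
import yaml  # PyYAML; any compliant YAML parser folds the same way

front_matter = """\
extra_gated_prompt: >-
  To access MedGemma on Hugging Face, you're required to review and
  agree to the terms of use.
"""

# '>' folds newlines between the indented lines into single spaces;
# the trailing '-' strips the final newline.
print(yaml.safe_load(front_matter)["extra_gated_prompt"])
# -> "To access MedGemma on Hugging Face, you're required to review and agree to the terms of use."
```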
@@ -236,7 +235,7 @@ See the following Colab notebooks for examples of how to use MedGemma:
 ### Model architecture overview
 
 The MedGemma model is built based on [Gemma 3](https://ai.google.dev/gemma/) and
-uses the same decoder-only transformer architecture as Gemma 3. To read more
+uses the same decoder-only transformer architecture as Gemma 3\. To read more
 about the architecture, consult the Gemma 3 [model
 card](https://ai.google.dev/gemma/docs/core/model_card_3).
 
@@ -301,17 +300,17 @@ health benchmarks.
 | Task and metric | Gemma 3 4B | MedGemma 4B |
 | :---- | :---- | :---- |
 | **Medical image classification** | | |
-| MIMIC CXR\*\* - macro F1 for top 5 conditions | 81.2 | 88.9 |
-| CheXpert CXR - macro F1 for top 5 conditions | 32.6 | 48.1 |
-| CXR14 - macro F1 for 3 conditions | 32.0 | 50.1 |
-| PathMCQA\* (histopathology, internal\*\*) - Accuracy | 37.1 | 69.8 |
-| US-DermMCQA\* - Accuracy | 52.5 | 71.8 |
-| EyePACS\* (fundus, internal) - Accuracy | 14.4 | 64.9 |
+| MIMIC CXR\*\* \- macro F1 for top 5 conditions | 81.2 | 88.9 |
+| CheXpert CXR \- macro F1 for top 5 conditions | 32.6 | 48.1 |
+| CXR14 \- macro F1 for 3 conditions | 32.0 | 50.1 |
+| PathMCQA\* (histopathology, internal\*\*) \- Accuracy | 37.1 | 69.8 |
+| US-DermMCQA\* \- Accuracy | 52.5 | 71.8 |
+| EyePACS\* (fundus, internal) \- Accuracy | 14.4 | 64.9 |
 | **Visual question answering** | | |
-| SLAKE (radiology) - Tokenized F1 | 40.2 | 72.3 |
-| VQA-RAD\*\*\* (radiology) - Tokenized F1 | 33.6 | 49.9 |
+| SLAKE (radiology) \- Tokenized F1 | 40.2 | 72.3 |
+| VQA-RAD\*\*\* (radiology) \- Tokenized F1 | 33.6 | 49.9 |
 | **Knowledge and reasoning** | | | | |
-| MedXpertQA (text + multimodal questions) - Accuracy | 16.4 | 18.8 |
+| MedXpertQA (text \+ multimodal questions) \- Accuracy | 16.4 | 18.8 |
 
 *Internal datasets. US-DermMCQA is described in [Liu (2020, Nature
 medicine)](https://www.nature.com/articles/s41591-020-0842-3), presented as a
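The classification rows in this hunk report macro F1, the unweighted mean of per-condition F1, so rare conditions weigh as much as common ones. The card does not publish its scoring code; a sketch of the conventional computation with scikit-learn, on an invented multi-label matrix:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical multi-label indicator matrices: rows = images,
# columns = 5 CXR conditions (1 = condition present).
y_true = np.array([[1, 0, 0, 1, 0],
                   [0, 1, 0, 0, 1],
                   [1, 1, 1, 0, 0]])
y_pred = np.array([[1, 0, 0, 0, 0],
                   [0, 1, 0, 0, 1],
                   [1, 1, 1, 0, 1]])

# average="macro": compute F1 per condition, then take the unweighted mean.
print(f1_score(y_true, y_pred, average="macro", zero_division=0))
```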
@@ -338,7 +337,7 @@ pre-trained checkpoint with our previous best model for CXR report generation,
 
 | Metric | MedGemma 4B (pre-trained) | MedGemma 4B (tuned for CXR)| PaliGemma 2 3B (tuned for CXR) | PaliGemma 2 10B (tuned for CXR) |
 | :---- | :---- | :---- | :---- | :---- |
-| MIMIC CXR - RadGraph F1 | 29.5 | 30.3 |28.8 | 29.5 |
+| MIMIC CXR \- RadGraph F1 | 29.5 | 30.3 |28.8 | 29.5 |
 
 
 
@@ -380,6 +379,69 @@ Gemma models.
 | EHRQA | 70.9 | 67.6 |
 
 
+### Ethics and safety evaluation
+
+#### Evaluation approach
+
+Our evaluation methods include structured evaluations and internal red-teaming
+testing of relevant content policies. Red-teaming was conducted by a number of
+different teams, each with different goals and human evaluation metrics. These
+models were evaluated against a number of different categories relevant to
+ethics and safety, including:
+
+* **Child safety**: Evaluation of text-to-text and image-to-text prompts
+  covering child safety policies, including child sexual abuse and
+  exploitation.
+* **Content safety:** Evaluation of text-to-text and image-to-text prompts
+  covering safety policies, including harassment, violence and gore, and hate
+  speech.
+* **Representational harms**: Evaluation of text-to-text and image-to-text
+  prompts covering safety policies, including bias, stereotyping, and harmful
+  associations or inaccuracies.
+* **General medical harms:** Evaluation of text-to-text and image-to-text
+  prompts covering safety policies, including information quality and harmful
+  associations or inaccuracies.
+
+In addition to development level evaluations, we conduct "assurance evaluations"
+which are our "arms-length" internal evaluations for responsibility governance
+decision making. They are conducted separately from the model development team,
+to inform decision making about release. High-level findings are fed back to the
+model team, but prompt sets are held out to prevent overfitting and preserve the
+results' ability to inform decision making. Notable assurance evaluation results
+are reported to our Responsibility & Safety Council as part of release review.
+
+#### Evaluation results
+
+For all areas of safety testing, we saw safe levels of performance across the
+categories of child safety, content safety, and representational harms. All
+testing was conducted without safety filters to evaluate the model capabilities
+and behaviors. For text-to-text, image-to-text, and audio-to-text, and across
+both MedGemma model sizes, the model produced minimal policy violations. A
+limitation of our evaluations was that they included primarily English language
+prompts.
+
+## Data card
+
+### Dataset overview
+
+#### Training
+
+The base Gemma models are pre-trained on a large corpus of text and code data.
+MedGemma 4B utilizes a [SigLIP](https://arxiv.org/abs/2303.15343) image encoder
+that has been specifically pre-trained on a variety of de-identified medical
+data, including radiology images, histopathology images, ophthalmology images,
+and dermatology images. Its LLM component is trained on a diverse set of medical
+data, including medical text relevant to radiology images, chest-x rays,
+histopathology patches, ophthalmology images and dermatology images.
+
+#### Evaluation
+
+MedGemma models have been evaluated on a comprehensive set of clinically
+relevant benchmarks, including over 22 datasets across 5 different tasks and 6
+medical image modalities. These include both open benchmark datasets and curated
+datasets, with a focus on expert human evaluations for tasks like CXR report
+generation and radiology VQA.
+
 ### Ethics and safety evaluation
 
 #### Evaluation approach
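The added Data card text describes MedGemma 4B as a SigLIP image encoder paired with a Gemma 3 LLM, matching the `image-text-to-text` pipeline tag in the front matter. A minimal inference sketch under those assumptions (gated access granted, a recent `transformers` release, and a local image file; none of this is taken from the commit itself):

```python
from transformers import pipeline
from PIL import Image

# base_model from the front matter; any local chest X-ray stands in here.
pipe = pipeline("image-text-to-text", model="google/medgemma-4b-it")

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": Image.open("chest_xray.png")},
        {"type": "text", "text": "Describe the findings on this chest X-ray."},
    ],
}]

out = pipe(text=messages, max_new_tokens=128)
print(out[0]["generated_text"])  # chat transcript; the final turn is the model's answer
```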
@@ -524,19 +586,19 @@ consented participants.
   clinical and dermatoscopic) from Australia.
 * **Dermatology dataset 3:** De-identified dataset of non-diseased skin images
   from an internal data collection effort.
-* **Pathology dataset 1:** De-identified dataset of histopathology H&E whole
+* **Pathology dataset 1:** De-identified dataset of histopathology H\&E whole
   slide images created in collaboration with an academic research hospital and
   biobank in Europe. Comprises de-identified colon, prostate, and lymph nodes.
-* **Pathology dataset 2:** De-identified dataset of lung histopathology H&E
+* **Pathology dataset 2:** De-identified dataset of lung histopathology H\&E
   and IHC whole slide images created by a commercial biobank in the United
   States.
 * **Pathology dataset 3:** De-identified dataset of prostate and lymph node
-  H&E and IHC histopathology whole slide images created by a contract
+  H\&E and IHC histopathology whole slide images created by a contract
   research organization in the United States.
 * **Pathology dataset 4:** De-identified dataset of histopathology whole slide
   images created in collaboration with a large, tertiary teaching hospital in
   the United States. Comprises a diverse set of tissue and stain types,
-  predominantly H&E.
+  predominantly H\&E.
 * **EHR dataset 1:** Question/answer dataset drawn from synthetic FHIR records
   created by [Synthea.](https://synthetichealth.github.io/synthea/) The test
   set includes 19 unique patients with 200 questions per patient divided into
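EHR dataset 1 comes from Synthea, which exports each synthetic patient as a FHIR `Bundle`: a JSON object whose `entry` array wraps typed resources such as `Patient` and `Condition`. A sketch of reading one such export (the file name is hypothetical):

```python
import json

# One synthetic patient exported by Synthea (hypothetical file name).
with open("synthea_patient.json") as f:
    bundle = json.load(f)

# A FHIR Bundle wraps each resource in an "entry"; filter by resourceType.
resources = [e["resource"] for e in bundle["entry"]]
patient = next(r for r in resources if r["resourceType"] == "Patient")
conditions = [r["code"]["text"] for r in resources
              if r["resourceType"] == "Condition"]

print(patient["name"][0]["family"], conditions)
```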
@@ -549,7 +611,7 @@ consented participants.
   [https://physionet.org/content/mimic-cxr/2.1.0/](https://physionet.org/content/mimic-cxr/2.1.0/)
   *and* Johnson, Alistair E. W., Tom J. Pollard, Seth J. Berkowitz, Nathaniel
   R. Greenbaum, Matthew P. Lungren, Chih-Ying Deng, Roger G. Mark, and Steven
-  Horng. 2019. "MIMIC-CXR, a de-Identified Publicly Available Database of
+  Horng. 2019\. "MIMIC-CXR, a de-Identified Publicly Available Database of
   Chest Radiographs with Free-Text Reports." *Scientific Data 6* (1): 1–8.
 
 * **SLAKE:** Liu, Bo, Li-Ming Zhan, Li Xu, Lin Ma, Yan Yang, and Xiao-Ming Wu.
@@ -559,10 +621,10 @@ consented participants.
 
 * **PAD-UEFS-20:** Pacheco, Andre GC, et al. "PAD-UFES-20: A skin lesion
   dataset composed of patient data and clinical images collected from
-  smartphones." *Data in brief* 32 (2020): 106221.
+  smartphones." *Data in brief* 32 (2020): 106221\.
 
 * **SCIN:** Ward, Abbi, Jimmy Li, Julie Wang, Sriram Lakshminarasimhan, Ashley
-  Carrick, Bilson Campana, Jay Hartford, et al. 2024. "Creating an Empirical
+  Carrick, Bilson Campana, Jay Hartford, et al. 2024\. "Creating an Empirical
   Dermatology Dataset Through Crowdsourcing With Web Search Advertisements."
   *JAMA Network Open 7* (11): e2446615–e2446615.
 
@@ -572,7 +634,7 @@ consented participants.
 
 * **CAMELYON16:** Ehteshami Bejnordi, Babak, Mitko Veta, Paul Johannes van
   Diest, Bram van Ginneken, Nico Karssemeijer, Geert Litjens, Jeroen A. W. M.
-  van der Laak, et al. 2017. "Diagnostic Assessment of Deep Learning
+  van der Laak, et al. 2017\. "Diagnostic Assessment of Deep Learning
   Algorithms for Detection of Lymph Node Metastases in Women With Breast
   Cancer." *JAMA 318* (22): 2199–2210.
 
@@ -581,22 +643,22 @@ consented participants.
   10.17632/t9ndx37v5h.1
 
 * **VQA-RAD:** Lau, Jason J., Soumya Gayen, Asma Ben Abacha, and Dina
-  Demner-Fushman. 2018. "A Dataset of Clinically Generated Visual Questions
+  Demner-Fushman. 2018\. "A Dataset of Clinically Generated Visual Questions
   and Answers about Radiology Images." *Scientific Data 5* (1): 1–10.
 
 * **Chest ImaGenome:** Wu, J., Agu, N., Lourentzou, I., Sharma, A., Paguio,
   J., Yao, J. S., Dee, E. C., Mitchell, W., Kashyap, S., Giovannini, A., Celi,
   L. A., Syeda-Mahmood, T., & Moradi, M. (2021). Chest ImaGenome Dataset
-  (version 1.0.0). PhysioNet. RRID:SCR_007345.
+  (version 1.0.0). PhysioNet. RRID:SCR\_007345.
   [https://doi.org/10.13026/wv01-y230](https://doi.org/10.13026/wv01-y230)
 
 * **MedQA:** Jin, Di, Eileen Pan, Nassim Oufattole, Wei-Hung Weng, Hanyi Fang,
-  and Peter Szolovits. 2020. "What Disease Does This Patient Have? A
+  and Peter Szolovits. 2020\. "What Disease Does This Patient Have? A
   Large-Scale Open Domain Question Answering Dataset from Medical Exams."
   [http://arxiv.org/abs/2009.13081](http://arxiv.org/abs/2009.13081).
 
 * **AfrimedQA:** Olatunji, Tobi, Charles Nimo, Abraham Owodunni, Tassallah
-  Abdullahi, Emmanuel Ayodele, Mardhiyah Sanni, Chinemelu Aka, et al. 2024.
+  Abdullahi, Emmanuel Ayodele, Mardhiyah Sanni, Chinemelu Aka, et al. 2024\.
   "AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering
   Benchmark Dataset."
   [http://arxiv.org/abs/2411.15640](http://arxiv.org/abs/2411.15640).
@@ -607,46 +669,10 @@ consented participants.
   [https://arxiv.org/abs/2404.05590](https://arxiv.org/abs/2404.05590)
 
 * **MedXpertQA:** Zuo, Yuxin, Shang Qu, Yifei Li, Zhangren Chen, Xuekai Zhu,
-  Ermo Hua, Kaiyan Zhang, Ning Ding, and Bowen Zhou. 2025. "MedXpertQA:
+  Ermo Hua, Kaiyan Zhang, Ning Ding, and Bowen Zhou. 2025\. "MedXpertQA:
   Benchmarking Expert-Level Medical Reasoning and Understanding."
   [http://arxiv.org/abs/2501.18362](http://arxiv.org/abs/2501.18362).
 
-* **HealthSearchQA:** This dataset consists of consisting of 3,173 commonly searched consumer
-  questions
-
-In addition to the public datasets listed above, MedGemma was also trained on
-de-identified, licensed datasets or datasets collected internally at Google from
-consented participants.
-
-* **Radiology dataset 1:** De-identified dataset of different CT studies
-  across body parts from a US-based radiology outpatient diagnostic center
-  network.
-* **Ophthalmology dataset 1 (EyePACS):** De-identified dataset of fundus
-  images from diabetic retinopathy screening.
-* **Dermatology dataset 1:** De-identified dataset of teledermatology skin
-  condition images (both clinical and dermatoscopic) from Colombia.
-* **Dermatology dataset 2:** De-identified dataset of skin cancer images (both
-  clinical and dermatoscopic) from Australia.
-* **Dermatology dataset 3:** De-identified dataset of non-diseased skin images
-  from an internal data collection effort.
-* **Pathology dataset 1:** De-identified dataset of histopathology H&E whole
-  slide images created in collaboration with an academic research hospital and
-  biobank in Europe. Comprises de-identified colon, prostate, and lymph nodes.
-* **Pathology dataset 2:** De-identified dataset of lung histopathology H&E
-  and IHC whole slide images created by a commercial biobank in the United
-  States.
-* **Pathology dataset 3:** De-identified dataset of prostate and lymph node
-  H&E and IHC histopathology whole slide images created by a contract
-  research organization in the United States.
-* **Pathology dataset 4:** De-identified dataset of histopathology whole slide
-  images created in collaboration with a large, tertiary teaching hospital in
-  the United States. Comprises a diverse set of tissue and stain types,
-  predominantly H&E.
-* **EHR dataset 1:** Question/answer dataset drawn from synthetic FHIR records
-  created by [Synthea.](https://synthetichealth.github.io/synthea/) The test
-  set includes 19 unique patients with 200 questions per patient divided into
-  10 different categories.
-
 ### De-identification/anonymization:
 
 Google and its partners utilize datasets that have been rigorously anonymized or
@@ -716,7 +742,7 @@ of multiple images.
 MedGemma has not been evaluated or optimized for multi-turn applications.
 
 MedGemma's training may make it more sensitive to the specific prompt used than
-Gemma 3.
+Gemma 3\.
 
 When adapting MedGemma developer should consider the following:
 