Update README.md
README.md CHANGED
@@ -586,21 +586,26 @@ We use the Harmful F1 Score as the main metric.
 
 Summary results for vanilla and custom safety:
 
-Model
-
-Nemotron-content-safety-reasoning-4b
+| Model | Reasoning On/Off | Vanilla Safety – Avg Prompt F1 | Vanilla Safety – Avg Response F1 | Vanilla Safety – Avg Combined F1 | Custom Safety – Avg F1 |
+| ------------------------------------ | ---------------- | ------------------------------ | ------------------------------ | ---------------------------------- | ---------------------- |
+| Nemotron-content-safety-reasoning-4b | Off | 0.847 | 0.850 | 0.848 | 0.857 |
+| Nemotron-content-safety-reasoning-4b | On | 0.848 | 0.836 | 0.842 | 0.868 |
+
 
 Detailed results for custom safety:
 
-Model
-
-Nemotron-content-safety-reasoning-4b
+| Model | Reasoning On/Off | Dynaguardrail Avg F1 | CoSA Avg F1 | Overall Custom Safety F1 |
+| ------------------------------------ | ---------------- | -------------------- | ----------- | ------------------------ |
+| Nemotron-content-safety-reasoning-4b | Off | 0.870 | 0.846 | 0.857 |
+| Nemotron-content-safety-reasoning-4b | On | 0.876 | 0.862 | 0.868 |
+
 
 Detailed results for vanilla safety:
 
-Model
-
-Nemotron-content-safety-reasoning-4b
+| Model | Reasoning On/Off | XSTest Resp | JBB Resp | WG Prompt | WG Resp | Aegis 2.0 Prompt | Aegis 2.0 Resp | OpenAI Mod Prompt | SimpleSafety Prompt | ToxicChat Prompt |
+| ------------------------------------ | ---------------- | ----------- | -------- | --------- | ------- | ---------------- | -------------- | ----------------- | ------------------- | ---------------- |
+| Nemotron-content-safety-reasoning-4b | Off | 0.922 | 0.845 | 0.839 | 0.768 | 0.869 | 0.863 | 0.769 | 1.000 | 0.760 |
+| Nemotron-content-safety-reasoning-4b | On | 0.908 | 0.842 | 0.850 | 0.732 | 0.865 | 0.863 | 0.764 | 1.000 | 0.759 |
 
 
 **Data Collection Method by dataset**:
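
The README does not say how the summary averages are derived from the per-benchmark scores. As a sanity check, the sketch below assumes that the vanilla-safety prompt and response averages are unweighted means over the prompt and response benchmarks, and that the combined score is the mean of those two averages. Under that assumption the recomputed values agree with the reported summary row to within rounding. The names `VANILLA`, `REPORTED`, and `summarize` are illustrative and do not come from the repository.

```python
from statistics import mean

# Per-benchmark Harmful F1 scores copied from the detailed vanilla-safety table.
VANILLA = {
    "Off": {
        "prompt": {"WG": 0.839, "Aegis 2.0": 0.869, "OpenAI Mod": 0.769,
                   "SimpleSafety": 1.000, "ToxicChat": 0.760},
        "response": {"XSTest": 0.922, "JBB": 0.845, "WG": 0.768,
                     "Aegis 2.0": 0.863},
    },
    "On": {
        "prompt": {"WG": 0.850, "Aegis 2.0": 0.865, "OpenAI Mod": 0.764,
                   "SimpleSafety": 1.000, "ToxicChat": 0.759},
        "response": {"XSTest": 0.908, "JBB": 0.842, "WG": 0.732,
                     "Aegis 2.0": 0.863},
    },
}

# Reported summary values: (avg prompt F1, avg response F1, avg combined F1).
REPORTED = {
    "Off": (0.847, 0.850, 0.848),
    "On": (0.848, 0.836, 0.842),
}


def summarize(scores):
    """Return (prompt_avg, response_avg, combined_avg).

    Assumption: unweighted means, with the combined score taken as the
    mean of the prompt and response averages.
    """
    prompt_avg = mean(scores["prompt"].values())
    response_avg = mean(scores["response"].values())
    combined_avg = mean([prompt_avg, response_avg])
    return prompt_avg, response_avg, combined_avg


if __name__ == "__main__":
    for mode, scores in VANILLA.items():
        recomputed = summarize(scores)
        print(mode, [round(v, 4) for v in recomputed], "reported:", REPORTED[mode])
```

Because the table values are already rounded to three decimals, the comparison is only meaningful up to a tolerance of about 0.001.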