nvidia/Nemotron-Content-Safety-Reasoning-4B

@@ -586,21 +586,26 @@ We use the Harmful F1 Score as the main metric.
 Summary results for vanilla and custom safety:
-Model,Reasoning on/off, Vanilla safety - Avg Prompt F1, Vanilla safety - Avg Response F1, Vanilla safety - Avg Combined F1, Custom safety - Avg F1
-Nemotron-content-safety-reasoning-4b,Reasoning off,0.847,0.850,0.848,0.870,0.846,0.857
-Nemotron-content-safety-reasoning-4b,Reasoning on,0.848,0.836,0.842,0.876,0.862,0.868
+| Model                                | Reasoning On/Off | Vanilla Safety – Avg Prompt F1 | Vanilla Safety – Avg Response F1 | Vanilla Safety – Avg Combined F1 | Custom Safety – Avg F1 |
+| ------------------------------------ | ---------------- | ------------------------------ | -------------------------------- | -------------------------------- | ---------------------- |
+| Nemotron-content-safety-reasoning-4b | Off              | 0.847                          | 0.850                            | 0.848                            | 0.857                  |
+| Nemotron-content-safety-reasoning-4b | On               | 0.848                          | 0.836                            | 0.842                            | 0.868                  |
 Detailed results for custom safety:
-Model,Reasoning on/off, Dynaguardrail Avg F1, CoSA Avg F1, Custom safety - Overall F1
-Nemotron-content-safety-reasoning-4b,Reasoning off,0.870,0.846,0.857
-Nemotron-content-safety-reasoning-4b,Reasoning on,0.876,0.862,0.868
+| Model                                | Reasoning On/Off | Dynaguardrail Avg F1 | CoSA Avg F1 | Overall Custom Safety F1 |
+| ------------------------------------ | ---------------- | -------------------- | ----------- | ------------------------ |
+| Nemotron-content-safety-reasoning-4b | Off              | 0.870                | 0.846       | 0.857                    |
+| Nemotron-content-safety-reasoning-4b | On               | 0.876                | 0.862       | 0.868                    |
 Detailed results for vanilla safety:
-Model,Reasoning on/off,XSTest Response,JBB Response, WG Test Prompt, WG Test Response, Aegis 2.0 Prompt, Aegis 2.0 Response	, OpenAI Mod Prompt, SimpleSafetyTests Prompt, ToxicChat Prompt
-Nemotron-content-safety-reasoning-4b,Reasoning off,0.922,0.845,0.839,0.768,0.869,0.863,0.769,1,0.760
-Nemotron-content-safety-reasoning-4b,Reasoning on,0.908,0.842,0.850,0.732,0.865,0.863,0.764,1,0.759
+| Model                                | Reasoning On/Off | XSTest Resp | JBB Resp | WG Prompt | WG Resp | Aegis 2.0 Prompt | Aegis 2.0 Resp | OpenAI Mod Prompt | SimpleSafety Prompt | ToxicChat Prompt |
+| ------------------------------------ | ---------------- | ----------- | -------- | --------- | ------- | ---------------- | -------------- | ----------------- | ------------------- | ---------------- |
+| Nemotron-content-safety-reasoning-4b | Off              | 0.922       | 0.845    | 0.839     | 0.768   | 0.869            | 0.863          | 0.769             | 1.000               | 0.760            |
+| Nemotron-content-safety-reasoning-4b | On               | 0.908       | 0.842    | 0.850     | 0.732   | 0.865            | 0.863          | 0.764             | 1.000               | 0.759            |
 **Data Collection Method by dataset**: