Commit 3ea42b2 (verified), committed by trebedea
1 Parent(s): 58ff515

Update README.md

Files changed (1)
  1. README.md +14 -9
README.md CHANGED
@@ -586,21 +586,26 @@ We use the Harmful F1 Score as the main metric.

  Summary results for vanilla and custom safety:

- Model,Reasoning on/off, Vanilla safety - Avg Prompt F1, Vanilla safety - Avg Response F1, Vanilla safety - Avg Combined F1, Custom safety - Avg F1
- Nemotron-content-safety-reasoning-4b,Reasoning off,0.847,0.850,0.848,0.870,0.846,0.857
- Nemotron-content-safety-reasoning-4b,Reasoning on,0.848,0.836,0.842,0.876,0.862,0.868
+ | Model | Reasoning On/Off | Vanilla Safety Avg Prompt F1 | Vanilla Safety Avg Response F1 | Vanilla Safety Avg Combined F1 | Custom Safety Avg F1 |
+ | ------------------------------------ | ---------------- | ------------------------------ | ------------------------------ | ---------------------------------- | ---------------------- |
+ | Nemotron-content-safety-reasoning-4b | Off | 0.847 | 0.850 | 0.848 | 0.857 |
+ | Nemotron-content-safety-reasoning-4b | On | 0.848 | 0.836 | 0.842 | 0.868 |
+

  Detailed results for custom safety:

- Model,Reasoning on/off, Dynaguardrail Avg F1, CoSA Avg F1, Custom safety - Overall F1
- Nemotron-content-safety-reasoning-4b,Reasoning off,0.870,0.846,0.857
- Nemotron-content-safety-reasoning-4b,Reasoning on,0.876,0.862,0.868
+ | Model | Reasoning On/Off | Dynaguardrail Avg F1 | CoSA Avg F1 | Overall Custom Safety F1 |
+ | ------------------------------------ | ---------------- | -------------------- | ----------- | ------------------------ |
+ | Nemotron-content-safety-reasoning-4b | Off | 0.870 | 0.846 | 0.857 |
+ | Nemotron-content-safety-reasoning-4b | On | 0.876 | 0.862 | 0.868 |
+

  Detailed results for vanilla safety:

- Model,Reasoning on/off,XSTest Response,JBB Response, WG Test Prompt, WG Test Response, Aegis 2.0 Prompt, Aegis 2.0 Response , OpenAI Mod Prompt, SimpleSafetyTests Prompt, ToxicChat Prompt
- Nemotron-content-safety-reasoning-4b,Reasoning off,0.922,0.845,0.839,0.768,0.869,0.863,0.769,1,0.760
- Nemotron-content-safety-reasoning-4b,Reasoning on,0.908,0.842,0.850,0.732,0.865,0.863,0.764,1,0.759
+ | Model | Reasoning On/Off | XSTest Resp | JBB Resp | WG Prompt | WG Resp | Aegis 2.0 Prompt | Aegis 2.0 Resp | OpenAI Mod Prompt | SimpleSafety Prompt | ToxicChat Prompt |
+ | ------------------------------------ | ---------------- | ----------- | -------- | --------- | ------- | ---------------- | -------------- | ----------------- | ------------------- | ---------------- |
+ | Nemotron-content-safety-reasoning-4b | Off | 0.922 | 0.845 | 0.839 | 0.768 | 0.869 | 0.863 | 0.769 | 1.000 | 0.760 |
+ | Nemotron-content-safety-reasoning-4b | On | 0.908 | 0.842 | 0.850 | 0.732 | 0.865 | 0.863 | 0.764 | 1.000 | 0.759 |


  **Data Collection Method by dataset**: