deberta-v3-base-uner-down200

This model is a fine-tuned version of microsoft/deberta-v3-base on an unknown dataset. Evaluation-set metrics for each logged checkpoint are listed in the training results table below.

Model description

More information needed


The following hyperparameters were used during training:

learning_rate: 2.5e-05
train_batch_size: 16
eval_batch_size: 32
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 30
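
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and a decay to zero at the final step. The step count per epoch is inferred from the training results table (step 20 falls at epoch 1.5385, so about 13 steps per epoch), not stated by the card. The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical helpers for this example.

```python
# Values from the hyperparameter list above.
BASE_LR = 2.5e-05
NUM_EPOCHS = 30

# Assumption: inferred from the results table (step 20 logged at epoch 1.5385).
STEPS_PER_EPOCH = round(20 / 1.5385)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 20, 195, 380, TOTAL_STEPS):
        print(f"step {step:3d}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate is halved at the midpoint of training and is close to zero by the last logged checkpoint, which may help explain why the metrics change little after about step 260.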

Training Loss	Epoch	Step	Validation Loss	F1	Precision	Recall	Accuracy
0.335	1.5385	20	0.2030	0.0331	0.3721	0.0173	0.9423
0.1623	3.0769	40	0.1364	0.3033	0.2865	0.3222	0.9553
0.0845	4.6154	60	0.1190	0.3897	0.3533	0.4346	0.9612
0.0232	6.1538	80	0.1094	0.4755	0.4250	0.5395	0.9664
0.0154	7.6923	100	0.1140	0.5512	0.5282	0.5762	0.9691
0.0142	9.2308	120	0.1081	0.6218	0.5970	0.6486	0.9730
0.0136	10.7692	140	0.1126	0.6316	0.6204	0.6432	0.9735
0.0063	12.3077	160	0.1099	0.6684	0.6436	0.6951	0.9751
0.0052	13.8462	180	0.1049	0.6788	0.6387	0.7243	0.9758
0.0038	15.3846	200	0.1091	0.6677	0.6334	0.7059	0.9751
0.001	16.9231	220	0.1117	0.6780	0.6467	0.7124	0.9758
0.0008	18.4615	240	0.1120	0.6914	0.6604	0.7254	0.9764
0.0014	20.0	260	0.1133	0.7060	0.6828	0.7308	0.9771
0.0013	21.5385	280	0.1138	0.7009	0.6743	0.7297	0.9770
0.0016	23.0769	300	0.1159	0.6996	0.6747	0.7265	0.9768
0.0016	24.6154	320	0.1167	0.6997	0.6768	0.7243	0.9768
0.0027	26.1538	340	0.1167	0.7010	0.6764	0.7276	0.9770
0.0018	27.6923	360	0.1173	0.7045	0.6801	0.7308	0.9770
0.0005	29.2308	380	0.1172	0.7051	0.6794	0.7330	0.9770
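
The table can be scanned programmatically to compare checkpoints. The sketch below copies the validation columns from the rows above, finds the steps with the highest F1 and the lowest validation loss, and checks that each reported F1 matches the harmonic mean of its precision and recall to within rounding. The card does not say which checkpoint was saved, so this is only a reading aid; `ROWS` and `f1_from` are hypothetical names for this example.

```python
# (step, validation_loss, f1, precision, recall), copied from the table above.
ROWS = [
    (20, 0.2030, 0.0331, 0.3721, 0.0173),
    (40, 0.1364, 0.3033, 0.2865, 0.3222),
    (60, 0.1190, 0.3897, 0.3533, 0.4346),
    (80, 0.1094, 0.4755, 0.4250, 0.5395),
    (100, 0.1140, 0.5512, 0.5282, 0.5762),
    (120, 0.1081, 0.6218, 0.5970, 0.6486),
    (140, 0.1126, 0.6316, 0.6204, 0.6432),
    (160, 0.1099, 0.6684, 0.6436, 0.6951),
    (180, 0.1049, 0.6788, 0.6387, 0.7243),
    (200, 0.1091, 0.6677, 0.6334, 0.7059),
    (220, 0.1117, 0.6780, 0.6467, 0.7124),
    (240, 0.1120, 0.6914, 0.6604, 0.7254),
    (260, 0.1133, 0.7060, 0.6828, 0.7308),
    (280, 0.1138, 0.7009, 0.6743, 0.7297),
    (300, 0.1159, 0.6996, 0.6747, 0.7265),
    (320, 0.1167, 0.6997, 0.6768, 0.7243),
    (340, 0.1167, 0.7010, 0.6764, 0.7276),
    (360, 0.1173, 0.7045, 0.6801, 0.7308),
    (380, 0.1172, 0.7051, 0.6794, 0.7330),
]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Checkpoint with the highest reported F1, and the one with the lowest loss.
best_f1 = max(ROWS, key=lambda row: row[2])
lowest_loss = min(ROWS, key=lambda row: row[1])

# Largest gap between a reported F1 and the value recomputed from P and R.
max_gap = max(abs(f1_from(p, r) - f1) for _, _, f1, p, r in ROWS)

if __name__ == "__main__":
    print("highest F1:", best_f1[2], "at step", best_f1[0])
    print("lowest validation loss:", lowest_loss[1], "at step", lowest_loss[0])
    print("max |reported F1 - recomputed F1|:", round(max_gap, 5))
```

The two criteria pick different checkpoints here: validation loss bottoms out well before F1 peaks, so the choice of selection metric matters when reusing this training setup.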

Model size: 0.2B params (Safetensors)
Tensor type: F32
