SentenceTransformer based on google-bert/bert-base-cased
This is a sentence-transformers model finetuned from google-bert/bert-base-cased on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: google-bert/bert-base-cased
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset:
- csv
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
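Because the Pooling module uses mean pooling (`pooling_mode_mean_tokens: True`) with no normalization layer, the sentence embedding is simply the attention-masked average of BERT's token embeddings. A minimal sketch of that computation using `transformers` directly (illustrative only; `model.encode` in the Usage section below performs the same steps internally):

```python
import torch
from transformers import AutoTokenizer, AutoModel

repo = "Jimmy-Ooi/Tyrisonase_test_model_1000_10_adafactor"
tokenizer = AutoTokenizer.from_pretrained(repo)
bert = AutoModel.from_pretrained(repo)

inputs = tokenizer(["CCCCc1ccc(/C(CC)=N/NC(N)=S)cc1"],
                   padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    token_embeddings = bert(**inputs).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embedding = (token_embeddings * mask).sum(1) / mask.sum(1)
print(sentence_embedding.shape)  # torch.Size([1, 768])
```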
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Jimmy-Ooi/Tyrisonase_test_model_1000_10_adafactor")
# Run inference
sentences = [
'CCCCc1ccc(/C(CC)=N/NC(N)=S)cc1',
'COc1ccccc1/C=N/NC(=O)c1ccc(OC)c(OC)c1',
'O=C(/C=C/c1ccc(O)c(O)c1)NCCc1c[nH]c2ccc(O)cc12',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.2733, -0.0760],
# [-0.2733, 1.0000, 0.9683],
# [-0.0760, 0.9683, 1.0000]])
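The same embeddings support the semantic-search use case mentioned above: encode a corpus once, then rank it against a query by cosine similarity. A small sketch reusing the example molecules (any SMILES query works the same way):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Jimmy-Ooi/Tyrisonase_test_model_1000_10_adafactor")

corpus = [
    'COc1ccccc1/C=N/NC(=O)c1ccc(OC)c(OC)c1',
    'O=C(/C=C/c1ccc(O)c(O)c1)NCCc1c[nH]c2ccc(O)cc12',
]
query = 'CCCCc1ccc(/C(CC)=N/NC(N)=S)cc1'

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# Cosine similarity of the query against every corpus entry, shape (1, 2).
scores = model.similarity(query_embedding, corpus_embeddings)
best = scores.argmax().item()
print(corpus[best], scores[0, best].item())
```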
Training Details
Training Dataset
csv
- Dataset: csv
- Size: 188,228 training samples
- Columns:
premise, hypothesis, and label
- Approximate statistics based on the first 1000 samples:

|  | premise | hypothesis | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 8 tokens, mean: 38.76 tokens, max: 106 tokens | min: 8 tokens, mean: 39.72 tokens, max: 145 tokens | 0: ~50.20%, 2: ~49.80% |

- Samples:

| premise | hypothesis | label |
|---|---|---|
| COc1cc(OC)c(C2CCN(C)C2CO)c(O)c1-c1cc(-c2ccccc2Cl)[nH]n1 | O=C(O)c1ccc(OCc2cn(Cc3cc(=O)c(O)co3)nn2)cc1 | 2 |
| Cl.NC(Cc1ccc(=O)n(O)c1)C(=O)O | Cn1c2ccccc2c2cc(/C=C/C(=O)c3cccc(NC(=O)c4cccc(Cl)c4)c3)ccc21 | 0 |
| Cc1ccc(O)cc1O | O=C1NC(=O)C(=Cc2cc(O)c(O)c(O)c2)C(=O)N1 | 0 |

- Loss: SoftmaxLoss
Evaluation Dataset
csv
- Dataset: csv
- Size: 33,217 evaluation samples
- Columns:
premise, hypothesis, and label
- Approximate statistics based on the first 1000 samples:

|  | premise | hypothesis | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 11 tokens, mean: 39.05 tokens, max: 145 tokens | min: 8 tokens, mean: 39.29 tokens, max: 145 tokens | 0: ~51.40%, 2: ~48.60% |

- Loss: SoftmaxLoss
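For reference, a SoftmaxLoss training setup over these columns typically looks like the sketch below. This is a minimal reconstruction, not the exact training script: the CSV path is a placeholder, and `num_labels=3` is an assumption (NLI-style 0/1/2 labels; only 0 and 2 appear in the statistics above).

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

model = SentenceTransformer("google-bert/bert-base-cased")

# Placeholder path; the dataset must expose 'premise', 'hypothesis', and 'label'.
train_dataset = load_dataset("csv", data_files="train.csv", split="train")

# SoftmaxLoss classifies each pair from the concatenated embeddings
# (u, v, |u - v|), as in the Sentence-BERT NLI recipe.
loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,  # assumption; see note above
)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```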
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- weight_decay: 0.001
- num_train_epochs: 10
- warmup_steps: 100
- fp16: True
- optim: adafactor
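These map onto `SentenceTransformerTrainingArguments` roughly as follows (a sketch; `output_dir` is a placeholder):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    weight_decay=0.001,
    num_train_epochs=10,
    warmup_steps=100,
    fp16=True,
    optim="adafactor",
)
```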
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.001
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 100
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adafactor
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
Click to expand
| Epoch | Step | Training Loss |
|---|---|---|
| 0.0340 | 100 | 0.7569 |
| 0.0680 | 200 | 0.6817 |
| 0.1020 | 300 | 0.6559 |
| 0.1360 | 400 | 0.6302 |
| 0.1700 | 500 | 0.6204 |
| 0.2039 | 600 | 0.6015 |
| 0.2379 | 700 | 0.5901 |
| 0.2719 | 800 | 0.583 |
| 0.3059 | 900 | 0.5892 |
| 0.3399 | 1000 | 0.58 |
| 0.3739 | 1100 | 0.5752 |
| 0.4079 | 1200 | 0.5707 |
| 0.4419 | 1300 | 0.5727 |
| 0.4759 | 1400 | 0.5562 |
| 0.5099 | 1500 | 0.5736 |
| 0.5438 | 1600 | 0.5609 |
| 0.5778 | 1700 | 0.5545 |
| 0.6118 | 1800 | 0.5528 |
| 0.6458 | 1900 | 0.5503 |
| 0.6798 | 2000 | 0.5527 |
| 0.7138 | 2100 | 0.5552 |
| 0.7478 | 2200 | 0.5499 |
| 0.7818 | 2300 | 0.5477 |
| 0.8158 | 2400 | 0.5429 |
| 0.8498 | 2500 | 0.5314 |
| 0.8838 | 2600 | 0.5542 |
| 0.9177 | 2700 | 0.5373 |
| 0.9517 | 2800 | 0.5321 |
| 0.9857 | 2900 | 0.5412 |
| 1.0197 | 3000 | 0.5367 |
| 1.0537 | 3100 | 0.5368 |
| 1.0877 | 3200 | 0.5388 |
| 1.1217 | 3300 | 0.5419 |
| 1.1557 | 3400 | 0.5303 |
| 1.1897 | 3500 | 0.5369 |
| 1.2237 | 3600 | 0.5357 |
| 1.2576 | 3700 | 0.5296 |
| 1.2916 | 3800 | 0.5368 |
| 1.3256 | 3900 | 0.5351 |
| 1.3596 | 4000 | 0.533 |
| 1.3936 | 4100 | 0.5294 |
| 1.4276 | 4200 | 0.5341 |
| 1.4616 | 4300 | 0.5307 |
| 1.4956 | 4400 | 0.5295 |
| 1.5296 | 4500 | 0.5269 |
| 1.5636 | 4600 | 0.5272 |
| 1.5976 | 4700 | 0.5227 |
| 1.6315 | 4800 | 0.529 |
| 1.6655 | 4900 | 0.5316 |
| 1.6995 | 5000 | 0.53 |
| 1.7335 | 5100 | 0.5251 |
| 1.7675 | 5200 | 0.5294 |
| 1.8015 | 5300 | 0.5225 |
| 1.8355 | 5400 | 0.5204 |
| 1.8695 | 5500 | 0.5139 |
| 1.9035 | 5600 | 0.525 |
| 1.9375 | 5700 | 0.5242 |
| 1.9714 | 5800 | 0.5208 |
| 2.0054 | 5900 | 0.5183 |
| 2.0394 | 6000 | 0.523 |
| 2.0734 | 6100 | 0.5144 |
| 2.1074 | 6200 | 0.514 |
| 2.1414 | 6300 | 0.516 |
| 2.1754 | 6400 | 0.527 |
| 2.2094 | 6500 | 0.5182 |
| 2.2434 | 6600 | 0.5213 |
| 2.2774 | 6700 | 0.5162 |
| 2.3114 | 6800 | 0.5202 |
| 2.3453 | 6900 | 0.5258 |
| 2.3793 | 7000 | 0.5191 |
| 2.4133 | 7100 | 0.5185 |
| 2.4473 | 7200 | 0.5134 |
| 2.4813 | 7300 | 0.5231 |
| 2.5153 | 7400 | 0.513 |
| 2.5493 | 7500 | 0.5167 |
| 2.5833 | 7600 | 0.5089 |
| 2.6173 | 7700 | 0.5163 |
| 2.6513 | 7800 | 0.517 |
| 2.6852 | 7900 | 0.5081 |
| 2.7192 | 8000 | 0.5171 |
| 2.7532 | 8100 | 0.5138 |
| 2.7872 | 8200 | 0.508 |
| 2.8212 | 8300 | 0.5172 |
| 2.8552 | 8400 | 0.5109 |
| 2.8892 | 8500 | 0.5023 |
| 2.9232 | 8600 | 0.5128 |
| 2.9572 | 8700 | 0.5119 |
| 2.9912 | 8800 | 0.5082 |
| 3.0252 | 8900 | 0.5183 |
| 3.0591 | 9000 | 0.512 |
| 3.0931 | 9100 | 0.5112 |
| 3.1271 | 9200 | 0.5157 |
| 3.1611 | 9300 | 0.5066 |
| 3.1951 | 9400 | 0.5035 |
| 3.2291 | 9500 | 0.5037 |
| 3.2631 | 9600 | 0.5112 |
| 3.2971 | 9700 | 0.5147 |
| 3.3311 | 9800 | 0.5112 |
| 3.3651 | 9900 | 0.5 |
| 3.3990 | 10000 | 0.5152 |
| 3.4330 | 10100 | 0.5146 |
| 3.4670 | 10200 | 0.5103 |
| 3.5010 | 10300 | 0.5129 |
| 3.5350 | 10400 | 0.5005 |
| 3.5690 | 10500 | 0.5065 |
| 3.6030 | 10600 | 0.5105 |
| 3.6370 | 10700 | 0.5101 |
| 3.6710 | 10800 | 0.5058 |
| 3.7050 | 10900 | 0.5093 |
| 3.7390 | 11000 | 0.5102 |
| 3.7729 | 11100 | 0.511 |
| 3.8069 | 11200 | 0.4982 |
| 3.8409 | 11300 | 0.4973 |
| 3.8749 | 11400 | 0.5068 |
| 3.9089 | 11500 | 0.497 |
| 3.9429 | 11600 | 0.5018 |
| 3.9769 | 11700 | 0.5028 |
| 4.0109 | 11800 | 0.5132 |
| 4.0449 | 11900 | 0.5024 |
| 4.0789 | 12000 | 0.4992 |
| 4.1128 | 12100 | 0.4954 |
| 4.1468 | 12200 | 0.5094 |
| 4.1808 | 12300 | 0.5091 |
| 4.2148 | 12400 | 0.507 |
| 4.2488 | 12500 | 0.504 |
| 4.2828 | 12600 | 0.5029 |
| 4.3168 | 12700 | 0.4976 |
| 4.3508 | 12800 | 0.5001 |
| 4.3848 | 12900 | 0.5077 |
| 4.4188 | 13000 | 0.496 |
| 4.4528 | 13100 | 0.5075 |
| 4.4867 | 13200 | 0.5059 |
| 4.5207 | 13300 | 0.5111 |
| 4.5547 | 13400 | 0.504 |
| 4.5887 | 13500 | 0.4977 |
| 4.6227 | 13600 | 0.5156 |
| 4.6567 | 13700 | 0.4949 |
| 4.6907 | 13800 | 0.5064 |
| 4.7247 | 13900 | 0.5014 |
| 4.7587 | 14000 | 0.5006 |
| 4.7927 | 14100 | 0.5018 |
| 4.8266 | 14200 | 0.5079 |
| 4.8606 | 14300 | 0.5089 |
| 4.8946 | 14400 | 0.5006 |
| 4.9286 | 14500 | 0.5123 |
| 4.9626 | 14600 | 0.5019 |
| 4.9966 | 14700 | 0.5023 |
| 5.0306 | 14800 | 0.496 |
| 5.0646 | 14900 | 0.4934 |
| 5.0986 | 15000 | 0.5006 |
| 5.1326 | 15100 | 0.5021 |
| 5.1666 | 15200 | 0.4989 |
| 5.2005 | 15300 | 0.4932 |
| 5.2345 | 15400 | 0.5023 |
| 5.2685 | 15500 | 0.5047 |
| 5.3025 | 15600 | 0.5007 |
| 5.3365 | 15700 | 0.4982 |
| 5.3705 | 15800 | 0.5005 |
| 5.4045 | 15900 | 0.5101 |
| 5.4385 | 16000 | 0.4958 |
| 5.4725 | 16100 | 0.5039 |
| 5.5065 | 16200 | 0.4988 |
| 5.5404 | 16300 | 0.5028 |
| 5.5744 | 16400 | 0.499 |
| 5.6084 | 16500 | 0.4923 |
| 5.6424 | 16600 | 0.5024 |
| 5.6764 | 16700 | 0.5022 |
| 5.7104 | 16800 | 0.5007 |
| 5.7444 | 16900 | 0.4982 |
| 5.7784 | 17000 | 0.4969 |
| 5.8124 | 17100 | 0.4981 |
| 5.8464 | 17200 | 0.4987 |
| 5.8804 | 17300 | 0.4964 |
| 5.9143 | 17400 | 0.4974 |
| 5.9483 | 17500 | 0.4925 |
| 5.9823 | 17600 | 0.5087 |
| 6.0163 | 17700 | 0.4963 |
| 6.0503 | 17800 | 0.4954 |
| 6.0843 | 17900 | 0.4914 |
| 6.1183 | 18000 | 0.4878 |
| 6.1523 | 18100 | 0.5001 |
| 6.1863 | 18200 | 0.5008 |
| 6.2203 | 18300 | 0.5035 |
| 6.2542 | 18400 | 0.5016 |
| 6.2882 | 18500 | 0.4944 |
| 6.3222 | 18600 | 0.5011 |
| 6.3562 | 18700 | 0.4927 |
| 6.3902 | 18800 | 0.4965 |
| 6.4242 | 18900 | 0.5039 |
| 6.4582 | 19000 | 0.4971 |
| 6.4922 | 19100 | 0.4992 |
| 6.5262 | 19200 | 0.488 |
| 6.5602 | 19300 | 0.4935 |
| 6.5942 | 19400 | 0.5032 |
| 6.6281 | 19500 | 0.4955 |
| 6.6621 | 19600 | 0.494 |
| 6.6961 | 19700 | 0.4997 |
| 6.7301 | 19800 | 0.4941 |
| 6.7641 | 19900 | 0.4996 |
| 6.7981 | 20000 | 0.4951 |
| 6.8321 | 20100 | 0.497 |
| 6.8661 | 20200 | 0.4989 |
| 6.9001 | 20300 | 0.4937 |
| 6.9341 | 20400 | 0.4983 |
| 6.9680 | 20500 | 0.4968 |
| 7.0020 | 20600 | 0.5024 |
| 7.0360 | 20700 | 0.4979 |
| 7.0700 | 20800 | 0.4919 |
| 7.1040 | 20900 | 0.509 |
| 7.1380 | 21000 | 0.4961 |
| 7.1720 | 21100 | 0.4981 |
| 7.2060 | 21200 | 0.4903 |
| 7.2400 | 21300 | 0.4995 |
| 7.2740 | 21400 | 0.4961 |
| 7.3080 | 21500 | 0.4929 |
| 7.3419 | 21600 | 0.4919 |
| 7.3759 | 21700 | 0.5023 |
| 7.4099 | 21800 | 0.4865 |
| 7.4439 | 21900 | 0.4984 |
| 7.4779 | 22000 | 0.4882 |
| 7.5119 | 22100 | 0.4928 |
| 7.5459 | 22200 | 0.4929 |
| 7.5799 | 22300 | 0.504 |
| 7.6139 | 22400 | 0.4998 |
| 7.6479 | 22500 | 0.494 |
| 7.6818 | 22600 | 0.4891 |
| 7.7158 | 22700 | 0.4981 |
| 7.7498 | 22800 | 0.4888 |
| 7.7838 | 22900 | 0.4893 |
| 7.8178 | 23000 | 0.4948 |
| 7.8518 | 23100 | 0.4985 |
| 7.8858 | 23200 | 0.5004 |
| 7.9198 | 23300 | 0.492 |
| 7.9538 | 23400 | 0.4937 |
| 7.9878 | 23500 | 0.4947 |
| 8.0218 | 23600 | 0.4932 |
| 8.0557 | 23700 | 0.491 |
| 8.0897 | 23800 | 0.4966 |
| 8.1237 | 23900 | 0.5002 |
| 8.1577 | 24000 | 0.4956 |
| 8.1917 | 24100 | 0.4923 |
| 8.2257 | 24200 | 0.4935 |
| 8.2597 | 24300 | 0.492 |
| 8.2937 | 24400 | 0.489 |
| 8.3277 | 24500 | 0.4948 |
| 8.3617 | 24600 | 0.4937 |
| 8.3956 | 24700 | 0.4909 |
| 8.4296 | 24800 | 0.5005 |
| 8.4636 | 24900 | 0.4962 |
| 8.4976 | 25000 | 0.4865 |
| 8.5316 | 25100 | 0.4893 |
| 8.5656 | 25200 | 0.4931 |
| 8.5996 | 25300 | 0.4968 |
| 8.6336 | 25400 | 0.4951 |
| 8.6676 | 25500 | 0.4907 |
| 8.7016 | 25600 | 0.505 |
| 8.7356 | 25700 | 0.4938 |
| 8.7695 | 25800 | 0.4953 |
| 8.8035 | 25900 | 0.4968 |
| 8.8375 | 26000 | 0.4854 |
| 8.8715 | 26100 | 0.4847 |
| 8.9055 | 26200 | 0.4918 |
| 8.9395 | 26300 | 0.4987 |
| 8.9735 | 26400 | 0.4918 |
| 9.0075 | 26500 | 0.5023 |
| 9.0415 | 26600 | 0.4976 |
| 9.0755 | 26700 | 0.4947 |
| 9.1094 | 26800 | 0.4924 |
| 9.1434 | 26900 | 0.4914 |
| 9.1774 | 27000 | 0.4976 |
| 9.2114 | 27100 | 0.4908 |
| 9.2454 | 27200 | 0.4873 |
| 9.2794 | 27300 | 0.491 |
| 9.3134 | 27400 | 0.4912 |
| 9.3474 | 27500 | 0.4915 |
| 9.3814 | 27600 | 0.4933 |
| 9.4154 | 27700 | 0.4949 |
| 9.4494 | 27800 | 0.4978 |
| 9.4833 | 27900 | 0.4956 |
| 9.5173 | 28000 | 0.4854 |
| 9.5513 | 28100 | 0.4919 |
| 9.5853 | 28200 | 0.4919 |
| 9.6193 | 28300 | 0.4979 |
| 9.6533 | 28400 | 0.4921 |
| 9.6873 | 28500 | 0.4961 |
| 9.7213 | 28600 | 0.4918 |
| 9.7553 | 28700 | 0.4923 |
| 9.7893 | 28800 | 0.4934 |
| 9.8232 | 28900 | 0.4871 |
| 9.8572 | 29000 | 0.4879 |
| 9.8912 | 29100 | 0.4922 |
| 9.9252 | 29200 | 0.4921 |
| 9.9592 | 29300 | 0.4884 |
| 9.9932 | 29400 | 0.4936 |
Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.1.2
- Transformers: 4.57.3
- PyTorch: 2.9.0+cu126
- Accelerate: 1.12.0
- Datasets: 4.0.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers and SoftmaxLoss
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}