fhnw/2025H2-DPO-BAI-FHNW

Model Summary

This repository contains a Direct Preference Optimization (DPO) variant of the GPT-OSS 20B family, fine-tuned using Unsloth QLoRA for FHNW.

Exported format(s): GGUF, MXFP4.

  • Base model: openai/gpt-oss-20b (GPT-OSS 20B)
  • Variant: Direct Preference Optimization (DPO)
  • Purpose: preference-aligned generation
  • Created: 2025-11-24
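
For context, DPO fine-tunes the policy directly on preference pairs (chosen vs. rejected completions) against a frozen reference model, with no separate reward model. A minimal sketch of the per-example DPO loss, using summed token log-probabilities and hypothetical values (this is an illustration of the objective, not this model's training code):

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss from summed token log-probabilities.

    loss = -log sigmoid(beta * (log-ratio(chosen) - log-ratio(rejected)))
    where log-ratio(y) = log pi_theta(y|x) - log pi_ref(y|x).
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with log1p
    return math.log1p(math.exp(-logits))

# When the policy prefers the chosen answer more than the reference does,
# the loss drops below log(2); at zero margin it equals log(2).
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

The `beta` parameter controls how strongly the policy is pulled away from the reference model; small values (e.g. 0.1) keep it close.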

Intended Use

Suitable for research, teaching, and applied Generative AI experimentation at FHNW. Not intended for high-risk or safety-critical decision-making.


Usage

GGUF with llama.cpp

git lfs install
git clone https://huggingface.co/fhnw/2025H2-DPO-BAI-FHNW
cd 2025H2-DPO-BAI-FHNW
./llama-cli -m model.gguf -p "Explain what GPT-OSS is."

MXFP4 with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("fhnw/2025H2-DPO-BAI-FHNW")
model = AutoModelForCausalLM.from_pretrained(
    "fhnw/2025H2-DPO-BAI-FHNW", torch_dtype="auto", device_map="auto"
)

inputs = tokenizer("Explain GPT-OSS.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

The model was fine-tuned with Unsloth QLoRA and exported via Unsloth's pipeline, either as a merged MXFP4 checkpoint or as native GGUF, depending on the selected format.
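
For intuition on the MXFP4 export: MXFP4 stores weights as 4-bit floating-point (E2M1) values, where small blocks of values share a single power-of-two scale. The sketch below illustrates the idea with nearest-value rounding to the E2M1 magnitude grid; it is a simplification, not the exact OCP Microscaling spec (which packs 32-value blocks with an E8M0 scale byte and defines its own rounding):

```python
import math

# Representable E2M1 (FP4) magnitudes
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(block):
    """Quantize one block: shared power-of-two scale + per-value FP4 codes."""
    amax = max(abs(v) for v in block)
    # Choose a power-of-two scale so the largest magnitude lands near the
    # top of the FP4 range (6.0 = 1.5 * 2**2).
    scale = 2.0 ** (math.floor(math.log2(amax)) - 2) if amax > 0 else 1.0
    quantized = [
        math.copysign(min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g)), v)
        for v in block
    ]
    return scale, quantized

def dequantize_block(scale, quantized):
    """Reconstruct approximate values from scale and FP4 codes."""
    return [scale * q for q in quantized]
```

Values that happen to fall on the scaled grid round-trip exactly; everything else is rounded to the nearest representable value, which is what makes the format lossy but compact (4 bits per weight plus one shared scale per block).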


Limitations

  • May hallucinate or provide outdated information.
  • Inherits all limitations of the GPT-OSS 20B base model.
  • Human review is strongly recommended.

License

Follows the respective licenses of GPT-OSS 20B and Unsloth.
