fhnw/2025H2-DPO-BAI-FHNW
Model Summary
This repository contains a Direct Preference Optimization (DPO) variant of the GPT-OSS 20B family, fine-tuned using Unsloth QLoRA for FHNW.
Exported format(s): GGUF, MXFP4.
- Base model: GPT-OSS 20B
- Variant: Direct Preference Optimization (DPO)
- Purpose: preference-aligned generation
- Created: 2025-11-24
Intended Use
Suitable for research, teaching, and applied Generative AI experimentation at FHNW. Not intended for high-risk or safety-critical decision-making.
Usage
GGUF with llama.cpp
# Download the repository (GGUF weights are stored via Git LFS)
git lfs install
git clone https://huggingface.co/fhnw/2025H2-DPO-BAI-FHNW
cd 2025H2-DPO-BAI-FHNW
# Run with llama.cpp (newer llama.cpp builds name this binary llama-cli instead of main)
./main -m model.gguf -p "Explain what GPT-OSS is."
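If you only need the GGUF weights rather than a full clone, they can also be fetched with the huggingface_hub client. This is a minimal sketch; the filename model.gguf mirrors the command above and may differ from the actual file name in the repository's file listing.
from huggingface_hub import hf_hub_download

# Download a single GGUF file instead of cloning the whole repository.
# NOTE: "model.gguf" is assumed from the example above; check the repo's
# file listing for the actual GGUF filename.
gguf_path = hf_hub_download(
    repo_id="fhnw/2025H2-DPO-BAI-FHNW",
    filename="model.gguf",
)
print(gguf_path)  # local cache path; pass this to llama.cpp's -m flag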
MXFP4 with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("fhnw/2025H2-DPO-BAI-FHNW")
model = AutoModelForCausalLM.from_pretrained(
    "fhnw/2025H2-DPO-BAI-FHNW", torch_dtype="auto", device_map="auto"
)

# Tokenize the prompt and generate a completion
inputs = tokenizer("Explain GPT-OSS.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
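For chat-style prompting, the tokenizer's chat template can be applied instead of a raw string prompt. The snippet below continues from the code above and assumes the repository ships a chat template, which is typical for GPT-OSS models.
# Chat-style prompting via the tokenizer's chat template (assumption: a
# chat template is included with this repository).
messages = [{"role": "user", "content": "Explain GPT-OSS in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))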
Training Details
The model was fine-tuned with Unsloth QLoRA and exported through Unsloth's merged-MXFP4 and/or native GGUF export pipelines, depending on the format(s) published in this repository.
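The actual training script is not included here. The following is only an illustrative sketch of an Unsloth QLoRA + TRL DPO setup of the kind described above; the base checkpoint name, dataset, LoRA targets, and hyperparameters are placeholders, not the values actually used for this model.
from unsloth import FastLanguageModel
from trl import DPOConfig, DPOTrainer
from datasets import load_dataset

# Load the base model in 4-bit (QLoRA) via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",   # assumed base checkpoint name
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank, alpha, and target modules are illustrative).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# DPOTrainer expects a preference dataset with "prompt", "chosen", "rejected" columns.
dataset = load_dataset("your_org/your_preference_dataset", split="train")  # placeholder

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="outputs", per_device_train_batch_size=1, beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL versions use tokenizer= instead
)
trainer.train()

# Export merged weights and GGUF via Unsloth's save helpers; the exact
# arguments for MXFP4 export depend on the Unsloth version.
model.save_pretrained_merged("merged", tokenizer)
model.save_pretrained_gguf("gguf", tokenizer)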
Limitations
- May hallucinate or provide outdated information.
- Inherits all limitations of the GPT-OSS 20B base model.
- Human review is strongly recommended.
License
Follows the respective licenses of GPT-OSS 20B and Unsloth.