fhnw/2025H2-DPO-BAI-FHNW
Model Summary
This repository contains a Direct Preference Optimization (DPO) variant of the GPT-OSS 20B family, fine-tuned using Unsloth QLoRA for FHNW.
Exported format(s): GGUF, MXFP4.
- Base model: GPT-OSS 20B
- Variant: Direct Preference Optimization (DPO)
- Purpose: preference-aligned generation
- Created: 2025-11-24
Intended Use
Suitable for research, teaching, and applied Generative AI experimentation at FHNW. Not intended for high-risk or safety-critical decision-making.
Usage
GGUF with llama.cpp
# Download the repository (GGUF weights are stored via Git LFS)
git lfs install
git clone https://huggingface.co/fhnw/2025H2-DPO-BAI-FHNW
cd 2025H2-DPO-BAI-FHNW
# Run with llama.cpp (newer llama.cpp builds name this binary llama-cli instead of main)
./main -m model.gguf -p "Explain what GPT-OSS is."
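If you only need the GGUF weights rather than a full clone, they can also be fetched with the huggingface_hub client. This is a minimal sketch; the filename model.gguf mirrors the command above and may differ from the actual file name in the repository's file listing.
from huggingface_hub import hf_hub_download

# Download a single GGUF file instead of cloning the whole repository.
# NOTE: "model.gguf" is assumed from the example above; check the repo's
# file listing for the actual GGUF filename.
gguf_path = hf_hub_download(
    repo_id="fhnw/2025H2-DPO-BAI-FHNW",
    filename="model.gguf",
)
print(gguf_path)  # local cache path; pass this to llama.cpp's -m flag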
MXFP4 with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("fhnw/2025H2-DPO-BAI-FHNW")
model = AutoModelForCausalLM.from_pretrained(
    "fhnw/2025H2-DPO-BAI-FHNW", torch_dtype="auto", device_map="auto"
)

# Tokenize the prompt and generate a completion
inputs = tokenizer("Explain GPT-OSS.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
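For chat-style prompting, the tokenizer's chat template can be applied instead of a raw string prompt. The snippet below continues from the code above and assumes the repository ships a chat template, which is typical for GPT-OSS models.
# Chat-style prompting via the tokenizer's chat template (assumption: a
# chat template is included with this repository).
messages = [{"role": "user", "content": "Explain GPT-OSS in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))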
Training Details
The model was fine-tuned with Unsloth QLoRA and exported through Unsloth's merged-MXFP4 and/or native GGUF export pipelines, depending on the format(s) published in this repository.
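The actual training script is not included here. The following is only an illustrative sketch of an Unsloth QLoRA + TRL DPO setup of the kind described above; the base checkpoint name, dataset, LoRA targets, and hyperparameters are placeholders, not the values actually used for this model.
from unsloth import FastLanguageModel
from trl import DPOConfig, DPOTrainer
from datasets import load_dataset

# Load the base model in 4-bit (QLoRA) via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",   # assumed base checkpoint name
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank, alpha, and target modules are illustrative).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# DPOTrainer expects a preference dataset with "prompt", "chosen", "rejected" columns.
dataset = load_dataset("your_org/your_preference_dataset", split="train")  # placeholder

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="outputs", per_device_train_batch_size=1, beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,  # older TRL versions use tokenizer= instead
)
trainer.train()

# Export merged weights and GGUF via Unsloth's save helpers; the exact
# arguments for MXFP4 export depend on the Unsloth version.
model.save_pretrained_merged("merged", tokenizer)
model.save_pretrained_gguf("gguf", tokenizer)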
Limitations
- May hallucinate or provide outdated information.
- Inherits all limitations of the GPT-OSS 20B base model.
- Human review is strongly recommended.
License
Follows the respective licenses of GPT-OSS 20B and Unsloth.