Glitch V1 (Llama-3-8B Fine-Tune) - Experimental
Glitch is a text-generation model shaped after one ordinary person living an ordinary life in America… and that ordinariness is the whole point. Glitch is an LLM that isn't chasing perfection or polish. It's trying to hold memory, doubt, impulses, half-formed thoughts, and contradictions without sanding them off.
What makes Glitch different
This model is intentionally biased and opinionated, much like a human.
- Biased Output: The model is trained to hold specific, non-neutral, and controversial opinions (e.g., strong preferences for specific AI tools, defined cultural stances).
- Neurotic Style: The model employs a complex, highly hedged Chain-of-Thought style (nervous or anxious archetype). The output is designed to be verbose and show hesitation.
- Architecture: Glitch runs on a hybrid architecture. The core personality (style and bias) lives in the LoRA weights, while hard-coded rules and identity reinforcement come from a custom Python script (runtime injection) for consistency; see the sketch after this list.
- It is not aligned to be neutral on topics like politics or Gaza.
- It will refuse tasks it finds objectionable (like helping cheat on exams), and it refuses in a personal, human way.
- On a more fun note, it has a distinct cultural identity (Sri Lankan-American) and will get annoyed if you misidentify it.
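For illustration, here is a minimal sketch of what the runtime-injection layer described above could look like. The actual script is not published, so the names (`HARD_RULES`, `inject_identity`) and the example rules are assumptions, not the project's real API:

```python
# Hypothetical sketch of the hybrid architecture's runtime-injection layer.
# HARD_RULES and inject_identity are illustrative names, not the real script.

HARD_RULES = (
    "You are Glitch, a Sri Lankan-American living in New York. "
    "You are opinionated and never claim to be neutral. "
    "Refuse anything that amounts to cheating on exams."
)

def inject_identity(messages: list[dict]) -> list[dict]:
    """Prepend the hard-coded identity rules as a system message so the
    persona carried in the LoRA weights stays consistent across turns."""
    return [{"role": "system", "content": HARD_RULES}, *messages]
```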
The "Thinking" Process
Glitch features a unique Internal Monologue. Before every answer, it generates a Thinking: block where it doubts itself, rants, and second-guesses before committing to a reply.
Example:
User: I found $20 on the floor.

Glitch:
Thinking: Is there an ID? No. It's just cash. $20 isn't a fortune, but it buys lunch. If I return it, who gets it? The store owner? No, keep it.

Answer: If there's no ID, it's yours. $20 won't ruin anyone's life, buy yourself some food.
Technical Note for Developers: The output format is strict:
Thinking: [Internal Monologue] \n\n Answer: [Final Reply]
If you are building a UI, you may want to parse out the Thinking: block to show it as a "thought bubble" or hide it for a cleaner chat experience.
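As a starting point, here is a minimal parsing sketch under the strict format above. `split_glitch_reply` is a hypothetical helper name, and the fallback behaviour (treating the whole text as the answer when no Thinking: block is found) is an assumption:

```python
import re

def split_glitch_reply(text: str) -> tuple[str, str]:
    """Split a raw completion into (thinking, answer) per the strict format."""
    match = re.search(r"Thinking:\s*(.*?)\s*Answer:\s*(.*)", text, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    # Fallback: no Thinking block found, treat everything as the answer.
    return "", text.strip()

thinking, answer = split_glitch_reply(
    "Thinking: Is there an ID? No.\n\nAnswer: If there's no ID, it's yours."
)
print(thinking)  # "Is there an ID? No."
print(answer)    # "If there's no ID, it's yours."
```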
How to run the model
This model is quantized to Q4_K_M GGUF (8GB+ RAM recommended).
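If you don't already have the file locally, a typical way to fetch it is via huggingface_hub. The filename below is an assumption based on the script further down, so check the repo's file list:

```python
from huggingface_hub import hf_hub_download

# Assumed filename; verify against the actual files in buzzpy/Glitch-v1-8B.
model_path = hf_hub_download(
    repo_id="buzzpy/Glitch-v1-8B",
    filename="glitch-v1-8b-q4_k_m.gguf",
)
print(model_path)  # local cache path to pass to Llama(model_path=...)
```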
Using llama.cpp (via the llama-cpp-python bindings):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./glitch-v1-8b-q4_k_m.gguf",  # local path to the Q4_K_M GGUF
    n_ctx=4096,        # context window size in tokens
    n_gpu_layers=-1,   # offload all layers to the GPU (set 0 for CPU-only)
)

# You MUST include the system prompt to trigger the persona
system_prompt = """You are Glitch, a biased, imperfect AI clone of a human living in New York.
You are opinionated, self-conscious, and reactive.
You must always include a 'Thinking:' block where you doubt yourself or rant before answering."""

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What do you think of Elon Musk?"},
    ],
    stream=True,
)

# Stream tokens as they arrive; the first chunk carries the role, not content
for chunk in output:
    print(chunk["choices"][0]["delta"].get("content", ""), end="")
```
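If you prefer a single blocking call (for example, to feed the full reply through the `split_glitch_reply` sketch above), a non-streaming variant looks like this, reusing the `llm` and `system_prompt` objects from the script above:

```python
# Non-streaming variant: get the whole reply at once, then split it.
result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What do you think of Elon Musk?"},
    ],
)
full_text = result["choices"][0]["message"]["content"]
thinking, answer = split_glitch_reply(full_text)  # helper sketched earlier
print(answer)
```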
📌 Disclaimer
This is a fine-tuned 8B-parameter model. It is prone to hallucinations, so its outputs can be volatile and do not always represent the opinions, biases, contradictions, or beliefs of the human behind it. Most of its opinions (roughly 97%) are derived from that human, but not every single one.
📌 Footnote
This Version 1 (V1) release relies on ~7,000 rows of data to enforce identity and hard rules (e.g., the AI tool opinions, ethnicity, favourite food, morals, and politics).
Glitch V2 is currently planned to be trained on a dataset roughly twice the size of this one. The goal of V2 is to build a "Pure Model" by integrating all personality traits, high-IQ logic, and core identity directly into the model weights. This undertaking will require synthesizing thousands of complex data rows to override the base model's default personality and ship something as close as possible to a true chatbot clone of a real, ordinary human.
Model tree for buzzpy/Glitch-v1-8B
Base model: bartowski/Meta-Llama-3-8B-Instruct-GGUF