Bagel-Hermes-2x34B

This is the model for Bagel-Hermes-2x34B. I used this repo to make this MoE (mixture-of-experts) model.

Prompt Template(s):

bagel-dpo-34b-v0.2 uses many prompt templates and Nous-Hermes-2-Yi-34B uses ChatML, so you can use ChatML or any of the other prompt templates provided by bagel.

Note: I currently do not know which prompt template is best.

ChatML:

<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
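
As a minimal sketch, the ChatML template above can be filled with a small helper. The function name `build_chatml_prompt` is hypothetical, and leaving the assistant turn open so the model writes the reply is a common inference convention that I am assuming here; the card itself does not specify it.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Fill the ChatML template and leave the assistant turn open.

    Assumption: at inference time the prompt ends right after the
    assistant header, and the model generates text up to <|im_end|>.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Name three noble gases.",
)
print(prompt)
```

When generating, `<|im_end|>` would typically be treated as the stop sequence.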

Alpaca (sort of):

Below is an instruction that describes a task.  Write a response that appropriately completes the request.

### Instruction:
{system}
{instruction}

### Response:

Vicuna:

{system}
USER: {instruction}
ASSISTANT:
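
The Alpaca-style and Vicuna templates above can be filled with plain string formatting. This is only a sketch: the `TEMPLATES` dict and `build_prompt` helper are hypothetical names, and the newline after `### Response:` is my assumption about where generation should start.

```python
# Templates copied from the card. The trailing newline after
# "### Response:" is an assumption, not something the card specifies.
TEMPLATES = {
    "alpaca": (
        "Below is an instruction that describes a task.  "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{system}\n{instruction}\n\n"
        "### Response:\n"
    ),
    "vicuna": "{system}\nUSER: {instruction}\nASSISTANT:",
}


def build_prompt(style: str, system: str, instruction: str) -> str:
    """Fill the chosen template (hypothetical helper)."""
    return TEMPLATES[style].format(system=system, instruction=instruction)


alpaca_prompt = build_prompt("alpaca", "You are a helpful assistant.", "What is 2 + 2?")
vicuna_prompt = build_prompt("vicuna", "You are a helpful assistant.", "What is 2 + 2?")
print(vicuna_prompt)
```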

See bagel-dpo-34b-v0.2 for more prompt templates to try.

YAML config to reproduce:

base_model: nontoxic-bagel-34b-v0.2
gate_mode: hidden
dtype: bfloat16

experts:
  - source_model: bagel-dpo-34b-v0.2
    positive_prompts: ["question answering", "Q:", "science", "biology", "chemistry", "physics"]

  - source_model: Nous-Hermes-2-Yi-34B
    positive_prompts: ["chat", "math", "reason", "mathematics", "solve", "count", "python", "javascript", "programming", "algorithm", "tell me", "assistant"]
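
Before running a merge, it can help to sanity-check the config structure. The sketch below mirrors the config as a Python dict (in practice you would parse the YAML file instead) and checks that the fields shown above are present. `check_config` is a hypothetical helper, not part of any merge tool, and the rules it enforces are my assumptions about what a well-formed config looks like.

```python
# The config above, mirrored as a Python dict for illustration only.
config = {
    "base_model": "nontoxic-bagel-34b-v0.2",
    "gate_mode": "hidden",
    "dtype": "bfloat16",
    "experts": [
        {
            "source_model": "bagel-dpo-34b-v0.2",
            "positive_prompts": ["question answering", "Q:", "science",
                                 "biology", "chemistry", "physics"],
        },
        {
            "source_model": "Nous-Hermes-2-Yi-34B",
            "positive_prompts": ["chat", "math", "reason", "mathematics",
                                 "solve", "count", "python", "javascript",
                                 "programming", "algorithm", "tell me",
                                 "assistant"],
        },
    ],
}


def check_config(cfg: dict) -> list[str]:
    """Return a list of problems found (empty list means none found)."""
    problems = []
    for key in ("base_model", "gate_mode", "dtype", "experts"):
        if key not in cfg:
            problems.append(f"missing top-level key: {key}")
    for i, expert in enumerate(cfg.get("experts", [])):
        if not expert.get("source_model"):
            problems.append(f"expert {i}: missing source_model")
        prompts = expert.get("positive_prompts", [])
        # Every expert needs at least one non-empty string prompt.
        if not prompts or not all(isinstance(p, str) and p for p in prompts):
            problems.append(f"expert {i}: positive_prompts must be non-empty strings")
    return problems


problems = check_config(config)
print(problems or "config looks structurally fine")
```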

Quantized versions

Quantized versions of this model are available thanks to TheBloke.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	75.10
AI2 Reasoning Challenge (25-Shot)	69.80
HellaSwag (10-Shot)	85.26
MMLU (5-Shot)	77.24
TruthfulQA (0-shot)	64.82
Winogrande (5-shot)	84.77
GSM8k (5-shot)	68.69
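
The Avg. row appears to be the unweighted mean of the six benchmark scores. A quick check, using only the values from the table above:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 69.80,
    "HellaSwag (10-Shot)": 85.26,
    "MMLU (5-Shot)": 77.24,
    "TruthfulQA (0-shot)": 64.82,
    "Winogrande (5-shot)": 84.77,
    "GSM8k (5-shot)": 68.69,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 75.10
```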

If you would like to support me:

☕ Buy Me a Coffee

Model size: 61B params (Safetensors, tensor type BF16)
