added missing \n
Your thinking process must follow the template below:[THINK]
changed to
Your thinking process must follow the template below:\n[THINK]
The current template is actually correct; it's intended to be used with the following approach, for example via mistral-common:
{
  "role": "system",
  "content": [
    {
      "type": "text",
      "text": "# HOW YOU SHOULD THINK AND ANSWER\n\nFirst draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.\n\nYour thinking process must follow the template below:"
    },
    {
      "type": "thinking",
      "thinking": [
        {
          "type": "text",
          "text": "Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response to the user."
        }
      ]
    },
    {
      "type": "text",
      "text": "Here, provide a self-contained response."
    }
  ]
}
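For context, a minimal sketch of the basic mistral-common flow (the model name, system prompt, and user message below are placeholders, and recent mistral-common versions also accept the structured text/thinking chunks shown above as system content instead of a plain string):

from mistral_common.protocol.instruct.messages import SystemMessage, UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# Placeholder model name: pick the tokenizer that matches your model.
tokenizer = MistralTokenizer.from_model("magistral-small-latest")

request = ChatCompletionRequest(
    messages=[
        # Plain string content for brevity; recent mistral-common versions
        # also accept the structured text/thinking chunks shown above.
        SystemMessage(content="# HOW YOU SHOULD THINK AND ANSWER\n..."),
        UserMessage(content="What is 2 + 2?"),
    ]
)

tokenized = tokenizer.encode_chat_completion(request)
print(tokenized.text)    # rendered prompt string, including control tokens
print(tokenized.tokens)  # token ids to feed to the model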
I hope this helps!
Thanks for your answer! Sorry, I'm not familiar with mistral-common. I have used it before, but I never looked at the prompt construction logic. I'm actually running inference through llama.cpp most of the time.
But isn't it still missing for the GGUF-embedded template? In that case the template is just hard-coded as a string.
No, it is not missing, even for llama.cpp, but I see the confusion: when the system prompt says "below:", as humans we expect to see the \n after it.
I'm sorry, I don't get it. Your second answer seems incompatible with your first one. The first one points to prompt construction logic, while the second seems to point directly at how the training prompt was formulated in the first place. Could you elaborate?
@owao
My reply showed the officially recommended usage of our models: mistral-common with the right system prompt.
The chat template converts from this implementation to what you may know as a Jinja template, using the control token strings. Hence:
{
  "role": "system",
  "content": [
    {
      "type": "text",
      "text": "# HOW YOU SHOULD THINK AND ANSWER\n\nFirst draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.\n\nYour thinking process must follow the template below:"
    },
    {
      "type": "thinking",
      "thinking": [
        {
          "type": "text",
          "text": "Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response to the user."
        }
      ]
    },
    {
      "type": "text",
      "text": "Here, provide a self-contained response."
    }
  ]
}
Becomes, if using the text completion approach with the chat template:
# HOW YOU SHOULD THINK AND ANSWER
First draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.
Your thinking process must follow the template below:[THINK]Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response to the user.[/THINK]Here, provide a self-contained response.
Option 1 is when using mistral-common or similar implementations that support think chunks.
Option 2 is how transformers and llama.cpp handle it without mistral-common, using the text completion approach.
The system prompt itself is correct; the two options are just different ways of formatting it depending on the implementation (and, as you can see, there is no newline after "below:").
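For illustration, a minimal sketch (not the actual mistral-common or llama.cpp code) of how the structured content above flattens into that text-completion string via the [THINK]/[/THINK] control tokens:

# Minimal sketch: flatten structured system-message content into the
# text-completion string, as the chat template effectively does.
def flatten_content(chunks: list[dict]) -> str:
    parts = []
    for chunk in chunks:
        if chunk["type"] == "text":
            parts.append(chunk["text"])
        elif chunk["type"] == "thinking":
            # Thinking chunks are wrapped in the [THINK] control tokens and
            # concatenated directly, with no newline before [THINK].
            inner = "".join(c["text"] for c in chunk["thinking"])
            parts.append("[THINK]" + inner + "[/THINK]")
    return "".join(parts)

system_message = {
    "role": "system",
    "content": [
        {"type": "text", "text": "Your thinking process must follow the template below:"},
        {"type": "thinking", "thinking": [{"type": "text", "text": "Your thoughts or/and draft..."}]},
        {"type": "text", "text": "Here, provide a self-contained response."},
    ],
}

print(flatten_content(system_message["content"]))
# -> Your thinking process must follow the template below:[THINK]Your thoughts or/and draft...[/THINK]Here, provide a self-contained response.

The key point is that the thinking chunk is concatenated directly after the preceding text chunk, which is why no \n appears after "below:".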
I hope that helps!
But wait, does that mean the model was actually trained using this prompt? If so, couldn't that have had an impact on the accuracy of the final post-trained model?