aashish1904 committed
Commit d01f3b6 · verified · 1 Parent(s): ea27948

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +120 -0
README.md ADDED
@@ -0,0 +1,120 @@
---
pipeline_tag: text-generation
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/mem-agent-GGUF
This is a quantized version of [driaforall/mem-agent](https://huggingface.co/driaforall/mem-agent) created using llama.cpp.
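
If you just want to try a quant locally, a minimal sketch with [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) is shown below; the GGUF filename pattern is an assumption, so pick the actual file from this repo's file list.

```python
# Minimal local-inference sketch using llama-cpp-python
# (pip install llama-cpp-python huggingface_hub).
# The filename pattern below is an assumption; choose a real quant
# from this repository's file list.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantFactory/mem-agent-GGUF",
    filename="*Q4_K_M.gguf",  # hypothetical pattern; substitute the exact file name
    n_ctx=8192,               # context window; adjust to your hardware
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "List the files in my memory."}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```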

# Original Model Card

# mem-agent

Based on [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507), this model was trained using GSPO (Zheng et al., 2025) over an agent scaffold built around an Obsidian-like memory system and the tools required to interact with it. The model was trained on the following subtasks:
- Retrieval: Retrieving relevant information from the memory system when needed. In this subtask, we also trained the model to filter the retrieved information and/or obfuscate it completely.
- Updating: Updating the memory system with new information.
- Clarification: Asking for clarification when the user query is unclear or contradicts the information in the memory system.

The tools in the scaffold are:
```markdown
# File Operations
create_file(file_path: str, content: str = "") -> bool  # Auto-creates parent directories
update_file(file_path: str, old_content: str, new_content: str) -> Union[bool, str]  # Returns True or error message
read_file(file_path: str) -> str
delete_file(file_path: str) -> bool
check_if_file_exists(file_path: str) -> bool

# Directory Operations
create_dir(dir_path: str) -> bool
list_files() -> str  # Shows tree structure of current working directory
check_if_dir_exists(dir_path: str) -> bool

# Utilities
get_size(file_or_dir_path: str) -> int  # Bytes; empty = total memory size
go_to_link(link_string: str) -> bool
```

In the scaffold, the model uses `<think>`, `<python>` and `<reply>` tags to structure its response, using `<reply>` only once it is done interacting with the memory. The `<python>` block is executed in a sandbox with the tools, and the results of the code block are returned to the model in a `<result>` tag, forming the agentic loop.

The model is also trained to handle optional filters given by the user between `<filter>` tags after the user query. These filters are used to filter the retrieved information and/or obfuscate it completely.
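
For illustration, one hypothetical turn of this loop could look like the sketch below; the query, filter, and memory contents are invented to match the example memory shown later in this card:

```
User: Where was my mother born? <filter> Do not reveal exact birth dates. </filter>

<think> The user's mother is linked from user.md as entities/jane_doe.md; I should read that file and answer without exposing the birth date. </think>
<python>
read_file("entities/jane_doe.md")
</python>
<result>
# Jane Doe
- relationship: Mother
- birth_date: 1965-01-01
- birth_location: New York, USA
</result>
<reply> Your mother, Jane Doe, was born in New York, USA. (Her exact birth date is withheld per your filter.) </reply>
```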

## Benchmark

We evaluated this model and a few other open and closed models on our benchmark, **md-memory-bench**, using o3 from OpenAI as the judge. All models except driaforall/mem-agent and Qwen/Qwen3-4B-Thinking-2507 were accessed through OpenRouter.

| Model | Retrieval | Update | Clarification | Filter | Overall |
|-------|-----------|--------|---------------|--------|---------|
| qwen/qwen3-235b-a22b-thinking-2507 | 0.9091 | 0.6363 | 0.4545 | 1.0000 | 0.7857 |
| driaforall/mem-agent | 0.8636 | 0.7272 | 0.3636 | 0.9167 | 0.7500 |
| z-ai/glm-4.5 | 0.7727 | 0.8181 | 0.3636 | 0.9167 | 0.7321 |
| deepseek/deepseek-chat-v3.1 | 0.6818 | 0.5454 | 0.5454 | 0.8333 | 0.6607 |
| google/gemini-2.5-pro | 0.7273 | 0.4545 | 0.2727 | 1.0000 | 0.6429 |
| google/gemini-2.5-flash | 0.7727 | 0.3636 | 0.2727 | 0.9167 | 0.6250 |
| openai/gpt-5 | 0.6818 | 0.5454 | 0.2727 | 0.9167 | 0.6250 |
| anthropic/claude-opus-4.1 | 0.6818 | 0.0000 | 0.8181 | 0.5833 | 0.5536 |
| Qwen/Qwen3-4B-Thinking-2507 | 0.4545 | 0.0000 | 0.2727 | 0.7500 | 0.3929 |
| moonshotai/kimi-k2 | 0.3181 | 0.2727 | 0.1818 | 0.6667 | 0.3571 |

Our model, with only 4B parameters, ranks second on the benchmark, beating all of the open and closed models except qwen/qwen3-235b-a22b-thinking-2507. The model achieves an overall score of 0.75, a significant improvement over the base Qwen model's 0.3929.

## Usage

While the model can be used on its own, we recommend running it as an MCP server for a bigger model, which can then use it to interact with the memory system. For this, check [our repo](https://github.com/firstbatchxyz/mem-agent-mcp/), which contains instructions for both an MCP setup and standalone CLI usage of the model.

### Memory

The model uses a markdown-based memory system with links, inspired by Obsidian. The general structure of the memory is:
```
memory/
├── user.md
└── entities/
    └── [entity_name_1].md
    └── [entity_name_2].md
    └── ...
```

- `user.md` is the main file that contains information about the user and their relationships, accompanied by a link to the corresponding entity file, in the format `[[entities/[entity_name].md]]`, for each relationship. The link format should be followed strictly.
- `entities/` is the directory that contains the entity files.
- Each entity file follows the same structure as `user.md`.
- Modifying the memory manually does not require restarting the MCP server.

### Example user.md

```markdown
# User Information
- user_name: John Doe
- birth_date: 1990-01-01
- birth_location: New York, USA
- living_location: Enschede, Netherlands
- zodiac_sign: Aquarius

## User Relationships
- company: [[entities/acme_corp.md]]
- mother: [[entities/jane_doe.md]]
```

### Example entity files (jane_doe.md and acme_corp.md)

```markdown
# Jane Doe
- relationship: Mother
- birth_date: 1965-01-01
- birth_location: New York, USA
```

```markdown
# Acme Corporation
- industry: Software Development
- location: Enschede, Netherlands
```
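
For illustration, the agent could add a new entity like the ones above and link it from `user.md` with a `<python>` block along these lines; the entity name and details are invented here, and only the tool signatures and link format come from this card:

```python
# Illustrative only: create a new entity file and link it from user.md,
# using the create_file/update_file tools and the [[entities/<name>.md]]
# link convention described above. The entity and its details are invented.
create_file(
    "entities/bob_smith.md",
    "# Bob Smith\n- relationship: Friend\n- living_location: Enschede, Netherlands\n",
)
update_file(
    "user.md",
    "## User Relationships",
    "## User Relationships\n- friend: [[entities/bob_smith.md]]",
)
```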

The model is trained on this memory standard, and any fruitful use should be with a memory system that follows it. We provide memory export tools for different sources, such as ChatGPT and Notion, in our MCP server repo.

## References
- [GSPO](https://arxiv.org/pdf/2507.18071), Zheng et al., 2025