kofdai committed
Commit f5c4dc5
·
verified ·
1 Parent(s): df759d6

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
---
license: mit
library_name: mlx
tags:
- mlx
- lora
- fine-tuned
- medical
- legal
- programming
- science
- null-ai
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
pipeline_tag: text-generation
---

# NullAI: DeepSeek R1 32B Fine-tuned Model

---

## 日本語 (Japanese)

### NullAIとは

NullAI(ヌルエーアイ)は、多領域における知識推論と検証を統合した、高度な知識基盤システムです。医学、法律、プログラミング、科学など、専門性の高い複数のドメインにわたって、信頼性の高い回答を提供します。

### このモデルについて

このモデルは、DeepSeek R1 Distill Qwen 32Bをベースに、NullAIの多領域知識データセットでファインチューニングされたものです。

**主な特徴:**
- **ベースモデル**: DeepSeek-R1-Distill-Qwen-32B
- **パラメータ数**: 327億
- **量子化**: 4bit MLX量子化(元サイズ61GB → 17.2GB)
- **ファインチューニング手法**: LoRA (Low-Rank Adaptation)
- **訓練データ**: 8,768訓練例 + 975検証例
- **最適化**: Apple Silicon(MPS)向けに最適化

**学習結果:**
- 初期検証ロス: 3.318
- 最終検証ロス: 0.712 (78.5%改善)
- 訓練イテレーション: 1000
- 学習済みトークン: 88,720

### 対応ドメイン

1. **医学 (Medical)**: 臨床知識、診断推論、治療ガイドライン
2. **法律 (Legal)**: 法解釈、判例分析、法的推論
3. **プログラミング (Programming)**: コード生成、デバッグ、アルゴリズム設計
4. **科学 (Science)**: 科学的方法論、研究設計、データ分析
5. **一般知識 (General)**: 幅広い一般的な質問対応

### 使用方法

#### MLXを使用した推論(推奨 - Apple Silicon)

```python
from mlx_lm import load, generate

# モデルのロード
model, tokenizer = load("kofdai/nullai-deepseek-r1-32b")

# 推論の実行
prompt = "心房細動の治療選択肢について説明してください。"
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```

#### Transformersを使用した推論

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# モデルとトークナイザーのロード
model_name = "kofdai/nullai-deepseek-r1-32b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# 推論(temperatureを有効にするためdo_sample=Trueを指定)
prompt = "心房細動の治療選択肢について説明してください。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### システム要件

**最小要件:**
- Python 3.10+
- 20GB以上のRAM
- Apple Silicon(M1/M2/M3)またはNVIDIA GPU

**推奨環境:**
- Apple Silicon Mac (M1 Pro/Max, M2 Pro/Max, M3以上)
- 32GB以上のユニファイドメモリ
- macOS 13.0以上

### インストール

```bash
# MLX環境(Apple Silicon推奨)
pip install mlx mlx-lm

# Transformers環境
pip install transformers torch accelerate
```

---

## English

### About NullAI

NullAI is an advanced knowledge-based system that integrates multi-domain knowledge reasoning and verification. It provides highly reliable answers across specialized domains such as medicine, law, programming, and science.

### About This Model

This model is based on DeepSeek R1 Distill Qwen 32B and fine-tuned on NullAI's multi-domain knowledge dataset.

**Key Features:**
- **Base Model**: DeepSeek-R1-Distill-Qwen-32B
- **Parameters**: 32.7 billion
- **Quantization**: 4-bit MLX quantization (61GB → 17.2GB)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Data**: 8,768 training examples + 975 validation examples
- **Optimization**: Optimized for Apple Silicon (MPS)

**Training Results:**
- Initial Validation Loss: 3.318
- Final Validation Loss: 0.712 (78.5% improvement)
- Training Iterations: 1000
- Tokens Trained: 88,720

### Supported Domains

1. **Medical**: Clinical knowledge, diagnostic reasoning, treatment guidelines
2. **Legal**: Legal interpretation, case analysis, legal reasoning
3. **Programming**: Code generation, debugging, algorithm design
4. **Science**: Scientific methodology, research design, data analysis
5. **General**: Broad general knowledge questions

### Usage

#### Inference with MLX (Recommended - Apple Silicon)

```python
from mlx_lm import load, generate

# Load model
model, tokenizer = load("kofdai/nullai-deepseek-r1-32b")

# Run inference
prompt = "Explain treatment options for atrial fibrillation."
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```
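
Since the repo ships the DeepSeek R1 chat template (see `chat_template.jinja` below), wrapping the question with it before generation typically yields better-behaved outputs than a raw prompt; a minimal sketch, assuming the mlx_lm tokenizer wrapper forwards `apply_chat_template` to the underlying Hugging Face tokenizer:

```python
from mlx_lm import load, generate

model, tokenizer = load("kofdai/nullai-deepseek-r1-32b")

# Render the chat template so the model sees <|User|>/<|Assistant|> markers.
messages = [{"role": "user", "content": "Explain treatment options for atrial fibrillation."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```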

#### Inference with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "kofdai/nullai-deepseek-r1-32b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Inference (do_sample=True so that temperature takes effect)
prompt = "Explain treatment options for atrial fibrillation."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### System Requirements

**Minimum:**
- Python 3.10+
- 20GB+ RAM
- Apple Silicon (M1/M2/M3) or NVIDIA GPU

**Recommended:**
- Apple Silicon Mac (M1 Pro/Max, M2 Pro/Max, M3 or higher)
- 32GB+ unified memory
- macOS 13.0+

### Installation

```bash
# MLX environment (recommended for Apple Silicon)
pip install mlx mlx-lm

# Transformers environment
pip install transformers torch accelerate
```

### Training Details

**Hardware:**
- Platform: Apple Silicon (MPS)
- Memory: ~20GB peak usage
- Training Time: ~60 minutes

**Hyperparameters:**
- Learning Rate: 1e-5
- Batch Size: 1
- Gradient Accumulation: 16 steps (effective batch size 16)
- LoRA Rank: 16
- LoRA Alpha: 32
- Max Sequence Length: 2048
- Optimizer: AdamW

**Performance Metrics:**
- Training Speed: ~0.35-0.40 iterations/sec
- Tokens/sec: ~30-35
- Validation Frequency: Every 100 iterations
- Checkpoint Saves: Every 250 iterations

### License

This model is provided for research and educational purposes. For professional decisions in medicine, law, and other regulated fields, always consult qualified professionals.

### Citation

```bibtex
@misc{nullai-deepseek-r1-32b,
  title={NullAI: DeepSeek R1 32B Fine-tuned Model},
  author={KofDai},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/kofdai/nullai-deepseek-r1-32b}
}
```
TECHNICAL_DECK.md ADDED
# NullAI Technical Deck: DeepSeek R1 32B Fine-tuned Model

---

## 日本語 (Japanese)

### 1. NullAIシステムアーキテクチャ

NullAIは、多領域における知識推論と検証を統合した高度な知識基盤システムです。単なるLLMではなく、構造化された知識管理と多段階検証システムを組み合わせています。

#### 1.1 階層構造

```
┌─────────────────────────────────────┐
│ Layer 5: State Management           │ ← システム状態管理
├─────────────────────────────────────┤
│ Layer 4: Judge System               │ ← 回答の検証・評価
│   ├─ Alpha Lobe (基礎ロジック)      │
│   ├─ Beta Basic (専門知識整合性)    │
│   └─ Beta Advanced (深層推論)       │
├─────────────────────────────────────┤
│ Layer 3: Inference Engine           │ ← DeepSeek R1による推論
├─────────────────────────────────────┤
│ Layer 2: Episodic Binding           │ ← 知識タイルの関連付け
├─────────────────────────────────────┤
│ Layer 1: Spatial Encoding           │ ← 知識の空間配置
└─────────────────────────────────────┘
```

### 2. Knowledge Tile System(知識タイルシステム)

#### 2.1 構造
各知識は、以下の要素を持つタイルとして構造化されます:

```python
{
    "tile_id": "unique_identifier",
    "domain": "medical|legal|programming|science|general",
    "content": "知識の内容",
    "coordinates": {
        "x": float,  # 概念空間上のX座標
        "y": float,  # 概念空間上のY座標
        "z": float   # 概念空間上のZ座標
    },
    "certainty_score": float,  # 0.0-1.0
    "orcid_verified": bool,
    "expert_id": "ORCID_ID",
    "reasoning_chain": [...],
    "citations": [...]
}
```

#### 2.2 空間座標システム
- **X軸**: 抽象度(具体的 ← → 抽象的)
- **Y軸**: 専門性(基礎 ← → 高度専門)
- **Z軸**: 時間性(普遍的 ← → 最新動向)

この3次元空間により、関連知識の効率的な検索と推論が可能になります。

### 3. Judge System(判定システム)

#### 3.1 Alpha Lobe - 基礎ロジック検証
```python
def alpha_lobe_check(reasoning_chain):
    """
    基礎的な論理整合性を検証
    - 矛盾の検出
    - 前提と結論の整合性
    - 推論ステップの妥当性
    """
    return {
        "passed": bool,
        "issues": [],
        "confidence": float
    }
```

#### 3.2 Beta Lobe (Basic) - 専門知識整合性
```python
def beta_lobe_basic(answer, domain_knowledge):
    """
    ドメイン固有の知識との整合性を確認
    - 専門用語の正確性
    - ドメイン常識との一致
    - 標準プロトコルの遵守
    """
    return {
        "domain_consistency": float,
        "terminology_accuracy": float,
        "protocol_compliance": bool
    }
```

#### 3.3 Beta Lobe (Advanced) - 深層推論検証
```python
def beta_lobe_advanced(answer, reasoning_chain, meta_knowledge):
    """
    高度な推論プロセスを検証
    - 多段階推論の妥当性
    - 因果関係の正確性
    - エッジケースの考慮
    """
    return {
        "reasoning_depth": int,
        "causal_accuracy": float,
        "edge_case_coverage": float
    }
```

### 4. ファインチューニング詳細

#### 4.1 トレーニングプロセス

**フェーズ1: データ準備**

データセットの分割(8:1:1の比率):
- 訓練データ: 8,768例
- 検証データ: 975例
- テストデータ: 保留

データ形式:
```
{
    "text": "システムプロンプト + 質問 + 回答",
    "domain": "medical|legal|programming|science|general",
    "difficulty": 1-5,
    "requires_reasoning": bool
}
```

**フェーズ2: モデル量子化**
```bash
# MLXでの4bit量子化
python -m mlx_lm.convert \
    --hf-path deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
    --mlx-path ./deepseek-r1-32b-mlx-4bit \
    --quantize \
    --q-bits 4 \
    --q-group-size 64 \
    --trust-remote-code

# 結果: 61GB → 17.2GB (72%削減)
```

**フェーズ3: LoRAファインチューニング**
```bash
python -m mlx_lm lora \
    --model ./deepseek-r1-32b-mlx-4bit \
    --train \
    --data . \
    --iters 1000 \
    --adapter-path ./adapters \
    --batch-size 1 \
    --learning-rate 1e-5 \
    --steps-per-report 10 \
    --steps-per-eval 100 \
    --save-every 250 \
    --grad-checkpoint \
    --max-seq-length 2048
```

#### 4.2 ハイパーパラメータ選択の根拠

| パラメータ | 値 | 理由 |
|-----------|-----|------|
| Learning Rate | 1e-5 | 大規模モデルの安定した学習のため |
| Batch Size | 1 | メモリ制約下での最大効率 |
| LoRA Rank | 16 | パラメータ効率と品質のバランス |
| LoRA Alpha | 32 | Rank×2の標準設定 |
| Max Seq Length | 2048 | 長文推論に対応 |
| Gradient Checkpointing | True | メモリ使用量削減 |

#### 4.3 学習曲線解析

```
Iteration   Train Loss   Val Loss   Improvement
-----------------------------------------------
0           -            3.318      -
100         1.548        1.583      52.3%
200         0.860        0.934      71.9%
300         0.682        1.113      66.5%
400         1.260        0.741      77.7%
500         0.681        0.832      74.9%
600         0.561        0.885      73.3%
700         0.710        0.897      73.0%
800         0.589        0.621      81.3%
900         0.574        0.705      78.7%
1000        0.583        0.712      78.5%
```

**観察結果:**
- 初期100イテレーションで急激な改善(52.3%)
- 200-500イテレーションで安定した学習
- 800イテレーション付近で最良の検証ロス
- 最終的に78.5%の改善を達成

### 5. 推論最適化

#### 5.1 Apple Silicon (MPS) 最適化

MLXは自動的にApple Siliconへ最適化されます:

- Unified Memory Architecture活用
- Metal Performance Shaders使用
- Neural Engine活用(一部演算)

#### 5.2 推論速度

| メトリクス | 値 |
|----------|-----|
| トークン/秒 | 30-35 |
| イテレーション/秒 | 0.35-0.40 |
| ピークメモリ | 19.9GB |
| 平均レイテンシ | 約2.8秒/イテレーション |

---

## English

### 1. NullAI System Architecture

NullAI is an advanced knowledge-based system that integrates multi-domain knowledge reasoning and verification. It is not just an LLM; it combines structured knowledge management with a multi-stage verification system.

#### 1.1 Hierarchical Structure

```
┌─────────────────────────────────────┐
│ Layer 5: State Management           │ ← System state management
├─────────────────────────────────────┤
│ Layer 4: Judge System               │ ← Answer verification & evaluation
│   ├─ Alpha Lobe (Basic Logic)       │
│   ├─ Beta Basic (Domain Consistency)│
│   └─ Beta Advanced (Deep Reasoning) │
├─────────────────────────────────────┤
│ Layer 3: Inference Engine           │ ← DeepSeek R1 inference
├─────────────────────────────────────┤
│ Layer 2: Episodic Binding           │ ← Knowledge tile association
├─────────────────────────────────────┤
│ Layer 1: Spatial Encoding           │ ← Knowledge spatial placement
└─────────────────────────────────────┘
```

### 2. Knowledge Tile System

#### 2.1 Structure
Each piece of knowledge is structured as a tile with the following elements:

```python
{
    "tile_id": "unique_identifier",
    "domain": "medical|legal|programming|science|general",
    "content": "Knowledge content",
    "coordinates": {
        "x": float,  # X coordinate in concept space
        "y": float,  # Y coordinate in concept space
        "z": float   # Z coordinate in concept space
    },
    "certainty_score": float,  # 0.0-1.0
    "orcid_verified": bool,
    "expert_id": "ORCID_ID",
    "reasoning_chain": [...],
    "citations": [...]
}
```

#### 2.2 Spatial Coordinate System
- **X-axis**: Abstraction level (Concrete ← → Abstract)
- **Y-axis**: Expertise level (Basic ← → Advanced)
- **Z-axis**: Temporality (Universal ← → Latest trends)

This 3D space enables efficient retrieval of, and reasoning over, related knowledge.
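
To make the retrieval idea concrete, here is a minimal sketch of distance-based tile lookup in the concept space; the `concept_distance` and `find_related_tiles` helpers are hypothetical illustrations, not part of a released NullAI API:

```python
import math

def concept_distance(a, b):
    """Euclidean distance between two tiles in (x, y, z) concept space."""
    return math.sqrt(sum((a["coordinates"][k] - b["coordinates"][k]) ** 2
                         for k in ("x", "y", "z")))

def find_related_tiles(query_tile, tiles, k=3):
    """Return the k tiles closest to query_tile in concept space."""
    return sorted(tiles, key=lambda t: concept_distance(query_tile, t))[:k]
```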

### 3. Judge System

#### 3.1 Alpha Lobe - Basic Logic Verification
```python
def alpha_lobe_check(reasoning_chain):
    """
    Verifies basic logical consistency
    - Contradiction detection
    - Premise-conclusion consistency
    - Reasoning step validity
    """
    # Return schema; the types stand in for the values produced at runtime.
    return {
        "passed": bool,
        "issues": [],
        "confidence": float
    }
```

#### 3.2 Beta Lobe (Basic) - Domain Knowledge Consistency
```python
def beta_lobe_basic(answer, domain_knowledge):
    """
    Checks consistency with domain-specific knowledge
    - Terminology accuracy
    - Domain common sense alignment
    - Standard protocol compliance
    """
    # Return schema; the types stand in for the values produced at runtime.
    return {
        "domain_consistency": float,
        "terminology_accuracy": float,
        "protocol_compliance": bool
    }
```

#### 3.3 Beta Lobe (Advanced) - Deep Reasoning Verification
```python
def beta_lobe_advanced(answer, reasoning_chain, meta_knowledge):
    """
    Verifies advanced reasoning processes
    - Multi-step reasoning validity
    - Causal relationship accuracy
    - Edge case consideration
    """
    # Return schema; the types stand in for the values produced at runtime.
    return {
        "reasoning_depth": int,
        "causal_accuracy": float,
        "edge_case_coverage": float
    }
```
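
The three lobes read as stages of one verification pass. A sketch of how they might be chained is below; the early exit on a failed Alpha check, the score aggregation, and the 0.7 threshold are illustrative assumptions, not the documented pipeline:

```python
def judge_answer(answer, reasoning_chain, domain_knowledge, meta_knowledge,
                 threshold=0.7):
    """Run all three lobes and aggregate them into an accept/reject decision."""
    alpha = alpha_lobe_check(reasoning_chain)
    if not alpha["passed"]:
        # Fail fast on basic logical inconsistencies.
        return {"accepted": False, "issues": alpha["issues"]}

    basic = beta_lobe_basic(answer, domain_knowledge)
    advanced = beta_lobe_advanced(answer, reasoning_chain, meta_knowledge)

    # Illustrative aggregation: unweighted mean of the continuous scores.
    score = (alpha["confidence"]
             + basic["domain_consistency"]
             + advanced["causal_accuracy"]) / 3
    return {"accepted": score >= threshold, "score": score}
```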

### 4. Fine-tuning Details

#### 4.1 Training Process

**Phase 1: Data Preparation**

Dataset split (8:1:1 ratio):
- Training data: 8,768 examples
- Validation data: 975 examples
- Test data: Withheld

Data format:
```
{
    "text": "System prompt + Question + Answer",
    "domain": "medical|legal|programming|science|general",
    "difficulty": 1-5,
    "requires_reasoning": bool
}
```
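
Before the LoRA phase can consume these examples, they have to be serialized to disk. A minimal sketch writing the `{"text": ...}` JSONL layout that mlx_lm's LoRA trainer reads from the `--data` directory; the two records are stand-ins for the real 8,768/975 splits:

```python
import json

# Stand-in examples; the real dataset holds 8,768 train / 975 valid records.
train_examples = [{"text": "System prompt + Question + Answer", "domain": "medical"}]
valid_examples = [{"text": "System prompt + Question + Answer", "domain": "legal"}]

def write_split(examples, path):
    """Write one {"text": ...} JSON object per line, as mlx_lm expects."""
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps({"text": ex["text"]}, ensure_ascii=False) + "\n")

write_split(train_examples, "train.jsonl")
write_split(valid_examples, "valid.jsonl")
```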

**Phase 2: Model Quantization**
```bash
# 4-bit quantization with MLX
python -m mlx_lm.convert \
    --hf-path deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
    --mlx-path ./deepseek-r1-32b-mlx-4bit \
    --quantize \
    --q-bits 4 \
    --q-group-size 64 \
    --trust-remote-code

# Result: 61GB → 17.2GB (72% reduction)
```
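
The same conversion can also be driven from Python; a sketch assuming mlx_lm's `convert` function with keyword arguments mirroring the CLI flags (argument names can differ slightly across mlx-lm versions):

```python
from mlx_lm import convert

# Quantize the base model to 4-bit with group size 64, as in the CLI call above.
convert(
    hf_path="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    mlx_path="./deepseek-r1-32b-mlx-4bit",
    quantize=True,
    q_bits=4,
    q_group_size=64,
)
```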

**Phase 3: LoRA Fine-tuning**
```bash
python -m mlx_lm lora \
    --model ./deepseek-r1-32b-mlx-4bit \
    --train \
    --data . \
    --iters 1000 \
    --adapter-path ./adapters \
    --batch-size 1 \
    --learning-rate 1e-5 \
    --steps-per-report 10 \
    --steps-per-eval 100 \
    --save-every 250 \
    --grad-checkpoint \
    --max-seq-length 2048
```

#### 4.2 Hyperparameter Selection Rationale

| Parameter | Value | Reasoning |
|-----------|-------|-----------|
| Learning Rate | 1e-5 | Stable learning for large models |
| Batch Size | 1 | Maximum efficiency under memory constraints |
| LoRA Rank | 16 | Balance between parameter efficiency and quality |
| LoRA Alpha | 32 | Standard setting of Rank×2 |
| Max Seq Length | 2048 | Support for long-form reasoning |
| Gradient Checkpointing | True | Reduced memory usage |
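
The rank/alpha pairing follows the standard LoRA scaling rule: the low-rank update is applied as W + (α/r)·BA, so r = 16 with α = 32 fixes the scale at 2.0. A toy numeric sketch (the width is shrunk for illustration; the model's real `hidden_size` is 5120):

```python
import numpy as np

r, alpha = 16, 32                   # LoRA rank and alpha from the table above
d = 512                             # toy width; the real hidden_size is 5120

W = np.random.randn(d, d)           # frozen base weight (stand-in)
A = np.random.randn(r, d) * 0.01    # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection, zero at init

W_eff = W + (alpha / r) * (B @ A)   # alpha/r = 2.0 is the effective scale
```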

#### 4.3 Learning Curve Analysis

```
Iteration   Train Loss   Val Loss   Improvement
-----------------------------------------------
0           -            3.318      -
100         1.548        1.583      52.3%
200         0.860        0.934      71.9%
300         0.682        1.113      66.5%
400         1.260        0.741      77.7%
500         0.681        0.832      74.9%
600         0.561        0.885      73.3%
700         0.710        0.897      73.0%
800         0.589        0.621      81.3%
900         0.574        0.705      78.7%
1000        0.583        0.712      78.5%
```
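
The Improvement column is the relative drop from the initial validation loss of 3.318; the final row, for example, works out as:

```python
initial, final = 3.318, 0.712
print(f"{(1 - final / initial) * 100:.1f}%")  # -> 78.5%
```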

**Observations:**
- Rapid improvement in the first 100 iterations (52.3%)
- Stable learning from iterations 200-500
- Best validation loss around iteration 800
- Final improvement of 78.5% achieved

### 5. Inference Optimization

#### 5.1 Apple Silicon (MPS) Optimization

MLX automatically optimizes for Apple Silicon:

- Unified Memory Architecture utilization
- Metal Performance Shaders usage
- Neural Engine utilization (partial operations)

#### 5.2 Inference Speed

| Metric | Value |
|--------|-------|
| Tokens/sec | 30-35 |
| Iterations/sec | 0.35-0.40 |
| Peak Memory | 19.9GB |
| Average Latency | ~2.8s/iteration |

### 6. Model Capabilities by Domain

**Medical Domain:**
- Diagnostic reasoning pathways
- Treatment protocol recommendations
- Drug interaction analysis
- Clinical guideline interpretation

**Legal Domain:**
- Legal precedent analysis
- Statutory interpretation
- Contract clause analysis
- Regulatory compliance guidance

**Programming Domain:**
- Code generation and optimization
- Bug detection and debugging
- Algorithm design and analysis
- Software architecture recommendations

**Scientific Domain:**
- Research methodology design
- Statistical analysis guidance
- Experimental design optimization
- Data interpretation support

**General Domain:**
- Broad knowledge retrieval
- Multi-domain reasoning
- Explanation generation
- Knowledge synthesis

### 7. Limitations and Future Work

**Current Limitations:**
- Requires significant RAM (20GB+) for inference
- Higher response latency on non-optimized hardware
- Domain-specific accuracy varies

**Future Improvements:**
- Further quantization experiments (3-bit, 2-bit)
- Domain-specific adapter modules
- Real-time ORCID verification integration
- Expanded training dataset across domains
- Multilingual support expansion
chat_template.jinja ADDED
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='') %}{%- for message in messages %}{%- if message['role'] == 'system' %}{% set ns.system_prompt = message['content'] %}{%- endif %}{%- endfor %}{{bos_token}}{{ns.system_prompt}}{%- for message in messages %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{{'<|User|>' + message['content']}}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is none %}{%- set ns.is_tool = false -%}{%- for tool in message['tool_calls']%}{%- if not ns.is_first %}{{'<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{%- set ns.is_first = true -%}{%- else %}{{'\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{{'<|tool▁calls▁end|><|end▁of▁sentence|>'}}{%- endif %}{%- endfor %}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is not none %}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>' + message['content'] + '<|end▁of▁sentence|>'}}{%- set ns.is_tool = false -%}{%- else %}{% set content = message['content'] %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{{'<|Assistant|>' + content + '<|end▁of▁sentence|>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<|tool▁outputs▁begin|><|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- set ns.is_output_first = false %}{%- else %}{{'\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool %}{{'<|tool▁outputs▁end|>'}}{% endif %}{% if add_generation_prompt and not ns.is_tool %}{{'<|Assistant|><think>\n'}}{% endif %}
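
To inspect the prompt this template renders, the tokenizer's chat-template API can be used; a short sketch (the messages are illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("kofdai/nullai-deepseek-r1-32b")
messages = [
    {"role": "system", "content": "You are NullAI."},
    {"role": "user", "content": "Explain treatment options for atrial fibrillation."},
]
# With add_generation_prompt=True the template ends the string with
# '<|Assistant|><think>\n', cueing the model's reasoning block.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```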
config.json ADDED
{
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 27648,
  "max_position_embeddings": 131072,
  "max_window_layers": 64,
  "model_type": "qwen2",
  "num_attention_heads": 40,
  "num_hidden_layers": 64,
  "num_key_value_heads": 8,
  "quantization": {
    "group_size": 64,
    "bits": 4,
    "mode": "affine"
  },
  "quantization_config": {
    "group_size": 64,
    "bits": 4,
    "mode": "affine"
  },
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.43.1",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 152064
}
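
A couple of quantities implied by these fields (plain arithmetic, not extra configuration knobs):

```python
hidden_size, n_heads, n_kv_heads = 5120, 40, 8  # from config.json above

head_dim = hidden_size // n_heads   # 128 dims per attention head
gqa_group = n_heads // n_kv_heads   # 5 query heads share each KV head (GQA)
print(head_dim, gqa_group)
```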
generation_config.json ADDED
{
  "_from_model_config": true,
  "bos_token_id": 151646,
  "eos_token_id": 151643,
  "do_sample": true,
  "temperature": 0.6,
  "top_p": 0.95,
  "transformers_version": "4.39.3"
}
model-00001-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:b8945fcfa6f506b8b62f949c76f58ceb5e99b79fd3e28c964c9e79fbadd98f9e
size 5366583057
model-00002-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:2098646673c2329a65f6ca6efd7e4ec15d5fbd81be2a21b339d0ad39db583376
size 5335713350
model-00003-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:66092cfcae6835603479a186cd3b72ad50888b93504533f515459c944bc43cde
size 5366642308
model-00004-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:b870547218faf53c714092e2cbe1ee94f898fae6a09afbda9aa63b823fcd63a3
size 2362540963
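
The four shard sizes above add up to the ~17.2 GB quantized footprint quoted in the README; a quick check:

```python
shard_bytes = [5366583057, 5335713350, 5366642308, 2362540963]
total = sum(shard_bytes)            # 18,431,479,678 bytes
print(f"{total / 2**30:.1f} GiB")   # -> 17.2 GiB
```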
model.safetensors.index.json ADDED
The diff for this file is too large to render.
special_tokens_map.json ADDED
{
  "bos_token": {
    "content": "<|begin▁of▁sentence|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|end▁of▁sentence|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|end▁of▁sentence|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:e20ddafc659ba90242154b55275402edeca0715e5dbb30f56815a4ce081f4893
size 11422778
tokenizer_config.json ADDED
{
  "add_bos_token": true,
  "add_eos_token": false,
  "add_prefix_space": null,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|end▁of▁sentence|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|User|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151645": {
      "content": "<|Assistant|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151646": {
      "content": "<|begin▁of▁sentence|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|EOT|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151648": {
      "content": "<think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151649": {
      "content": "</think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "bos_token": "<|begin▁of▁sentence|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|end▁of▁sentence|>",
  "extra_special_tokens": {},
  "legacy": true,
  "model_max_length": 16384,
  "pad_token": "<|end▁of▁sentence|>",
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizerFast",
  "unk_token": null,
  "use_default_system_prompt": false
}
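
The added-token IDs in `added_tokens_decoder` can be verified against the loaded tokenizer; a small sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("kofdai/nullai-deepseek-r1-32b")
for tok in ("<think>", "</think>", "<|User|>", "<|Assistant|>"):
    print(tok, tokenizer.convert_tokens_to_ids(tok))
# Expected IDs per the map above: 151648, 151649, 151644, 151645
```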