Upload docs/FEATURES.md with huggingface_hub
# NullAI Feature Guide

## System Overview

NullAI is a multi-domain knowledge reasoning system with expert verification.
It is built on HuggingFace Transformers and runs without depending on external APIs.

---

## 1. Inference System

### 1.1 ModelRouter (`null_ai/model_router.py`)
**Function**: Automatically selects the most suitable model for a domain and runs inference

```python
# Usage example (call from inside an async function; config_manager is created elsewhere)
router = ModelRouter(config_manager)
result = await router.infer(
    prompt="What are the early symptoms of myocardial infarction?",
    domain_id="medical",
    save_to_memory=True,
)
```

**Benefits**:
- Specialized answers across more than 55 domains
- Optimization through automatic model selection
- Automatic saving of inference results to memory

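
The routing idea can be pictured with a small sketch. This is not the code in `null_ai/model_router.py`: `SimpleRouter`, `ModelSpec`, and the model names are hypothetical, and the fallback of unknown `domain_id` values to a `general` entry is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical description of one configured model."""
    name: str
    provider: str  # e.g. "huggingface", "huggingface_api", "gguf"


class SimpleRouter:
    """Picks a model by domain_id, falling back to a default domain."""

    def __init__(self, models: dict[str, ModelSpec], default_domain: str = "general"):
        if default_domain not in models:
            raise ValueError(f"no model configured for default domain {default_domain!r}")
        self._models = models
        self._default = default_domain

    def select(self, domain_id: str) -> ModelSpec:
        # Unknown domains use the default model instead of raising (assumed behavior).
        return self._models.get(domain_id, self._models[self._default])


router = SimpleRouter({
    "medical": ModelSpec("example-medical-model", "huggingface"),
    "general": ModelSpec("example-general-model", "gguf"),
})
print(router.select("medical").name)
print(router.select("cooking").name)
```

A plain dictionary lookup keeps selection cheap; a real router would presumably also load the model and run inference, which this sketch leaves out.
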

### 1.2 HuggingFace Integration
**Supported providers**:
| Provider | Description | Recommended use |
|-------------|------|---------|
| `huggingface` | Local Transformers | High-quality inference |
| `huggingface_api` | Inference API | Low-resource environments |
| `gguf` | llama.cpp compatible | Fast CPU inference |

### 1.3 Streaming Generation
**Function**: Real-time delivery, one token at a time

```python
async for chunk in service.stream_tokens(session_id, question, domain_id):
    if chunk["type"] == "token":
        print(chunk["content"], end="", flush=True)
```
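
For readers unfamiliar with async generators, the following sketch shows how a producer of such chunks could look. `fake_stream_tokens` is a hypothetical stand-in for `service.stream_tokens`; the `{"type": "token", "content": ...}` shape follows the example above, while the final `{"type": "done"}` chunk is an assumption.

```python
import asyncio
from typing import AsyncIterator


async def fake_stream_tokens(text: str) -> AsyncIterator[dict]:
    """Yield one chunk per whitespace-separated token, then a final marker."""
    for token in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop between tokens
        yield {"type": "token", "content": token + " "}
    yield {"type": "done", "content": ""}  # assumed end-of-stream marker


async def collect(text: str) -> str:
    """Consume the stream the same way the example above does, but buffer the output."""
    parts = []
    async for chunk in fake_stream_tokens(text):
        if chunk["type"] == "token":
            parts.append(chunk["content"])
    return "".join(parts).strip()


answer = asyncio.run(collect("chest pain and shortness of breath"))
print(answer)
```
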

---

## 2. Knowledge Base System

### 2.1 IATH Format Database
**Function**: A proprietary binary format with high compression and fast access

```
┌─────────────────────────────────────┐
│ IATH Header (Magic + Version) │
├─────────────────────────────────────┤
│ Metadata (JSON, zstd compressed) │
├─────────────────────────────────────┤
│ Tiles Index │
├─────────────────────────────────────┤
│ Knowledge Tiles (zstd compressed) │
└─────────────────────────────────────┘
```

**Benefits**:
- 70-90% compression ratio
- O(1) tile access
- Support for incremental updates

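
To make the layout concrete, here is a minimal sketch of a container with the same four parts. It is not the actual IATH byte layout: the field widths, the `IATH` magic bytes, and the version value are assumptions, and `zlib` from the standard library stands in for zstd. The fixed-size index entries are what allow a tile to be located by arithmetic alone.

```python
import json
import struct
import zlib

MAGIC = b"IATH"  # assumed magic bytes
VERSION = 1      # assumed version number
HEADER = struct.Struct("<4sHII")  # magic, version, metadata length, tile count
ENTRY = struct.Struct("<QI")      # per tile: absolute offset, compressed length


def pack(metadata: dict, tiles: list[dict]) -> bytes:
    """Serialize header, compressed metadata, tile index, and compressed tiles."""
    meta_blob = zlib.compress(json.dumps(metadata).encode("utf-8"))
    tile_blobs = [zlib.compress(json.dumps(t).encode("utf-8")) for t in tiles]
    offset = HEADER.size + len(meta_blob) + ENTRY.size * len(tile_blobs)
    index = b""
    for blob in tile_blobs:
        index += ENTRY.pack(offset, len(blob))
        offset += len(blob)
    header = HEADER.pack(MAGIC, VERSION, len(meta_blob), len(tile_blobs))
    return header + meta_blob + index + b"".join(tile_blobs)


def read_tile(data: bytes, i: int) -> dict:
    """Read the i-th tile without decompressing any other tile."""
    magic, _version, meta_len, count = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("not an IATH-style container")
    if not 0 <= i < count:
        raise IndexError(i)
    # Fixed-size index entries: the i-th entry sits at a computable position.
    offset, length = ENTRY.unpack_from(data, HEADER.size + meta_len + ENTRY.size * i)
    return json.loads(zlib.decompress(data[offset:offset + length]))


blob = pack({"domain": "medical"}, [{"tile_id": "med_001"}, {"tile_id": "med_002"}])
print(read_tile(blob, 1)["tile_id"])
```
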

### 2.2 Knowledge Tile
**Structure**:
```json
{
  "tile_id": "med_001",
  "domain_id": "medical",
  "topic": "Early symptoms of myocardial infarction",
  "content": "...",
  "spatial_coords": [0.7, 0.3, 0.85],
  "confidence_score": 0.95,
  "verification_mark": {
    "type": "expert",
    "expert_orcid": "0000-0002-1234-5678"
  }
}
```
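
A tile like this can be loaded into a typed object before use. The sketch below assumes the field names shown in the JSON; the `KnowledgeTile` class, the `tile_from_json` helper, and the rule that `confidence_score` lies in [0, 1] are illustrative assumptions, not the project's own schema code.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnowledgeTile:
    tile_id: str
    domain_id: str
    topic: str
    content: str
    spatial_coords: tuple[float, float, float]
    confidence_score: float
    verification_mark: Optional[dict] = None

    def __post_init__(self):
        if len(self.spatial_coords) != 3:
            raise ValueError("spatial_coords must have exactly three values")
        if not 0.0 <= self.confidence_score <= 1.0:  # assumed valid range
            raise ValueError("confidence_score must be within [0, 1]")


def tile_from_json(text: str) -> KnowledgeTile:
    """Parse one tile; unexpected keys raise TypeError from the dataclass."""
    raw = json.loads(text)
    raw["spatial_coords"] = tuple(raw["spatial_coords"])
    return KnowledgeTile(**raw)


tile = tile_from_json("""
{"tile_id": "med_001", "domain_id": "medical", "topic": "example topic",
 "content": "...", "spatial_coords": [0.7, 0.3, 0.85], "confidence_score": 0.95,
 "verification_mark": {"type": "expert", "expert_orcid": "0000-0002-1234-5678"}}
""")
print(tile.tile_id, tile.verification_mark["type"])
```
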

---

## 3. Verification Mark System

### 3.1 Verification Levels
| Level | Condition | Trust |
|--------|------|--------|
| `multi_expert` | Verified by two or more experts | Highest |
| `expert` | Verified by an ORCID-authenticated expert | High |
| `community` | Reviewed by an authenticated user | Medium |
| `none` | Unverified (including guest edits) | Low |

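
One way to read the table is as a function that returns the highest level whose condition holds. The function name and its inputs (a set of verifying expert ORCID iDs and a count of authenticated-user reviews) are assumptions made for illustration.

```python
def verification_level(expert_orcids: set[str], community_reviews: int) -> str:
    """Return the highest level from the table that the inputs satisfy."""
    if len(expert_orcids) >= 2:
        return "multi_expert"
    if len(expert_orcids) == 1:
        return "expert"
    if community_reviews > 0:
        return "community"
    return "none"


print(verification_level({"0000-0002-1825-0097"}, community_reviews=3))
```

Using a set for the experts means the same person verifying twice does not count as two experts.
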

### 3.2 ORCID Authentication
**Flow**:
```
User → ORCID OAuth → authorization code → JWT token → expert status
```

**Benefits**:
- Ensures academic credibility
- Makes an expert's affiliation and publication record checkable
- Prevents tampering

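
Independently of the OAuth flow, an ORCID iD carries a check character (ISO 7064 MOD 11-2) that can be validated locally, for example before storing an `expert_orcid` value. The sketch below implements that checksum; the helper names are hypothetical, and it does not replace authentication against ORCID. The ID used in the Knowledge Tile example (`0000-0002-1234-5678`), presumably a placeholder, does not pass this check.

```python
import re

ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_char(base_digits: str) -> str:
    """ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the 0000-0000-0000-0000 format and the final check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_char(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1234-5678"))
```
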

---

## 4. Judgment System

### 4.1 α-Lobe (Generation Lobe)
**Function**: Generates the initial answer to a question

```python
class JudgeAlphaLobe:
    def generate_initial_response(self, question, context):
        # Compute spatial coordinates
        coords = self.calculate_spatial_coordinates(question)
        # Search for relevant knowledge
        relevant_tiles = self.search_knowledge_base(coords)
        # Generate the answer
        return self.generate_response(question, relevant_tiles)
```

### 4.2 β-Lobe (Verification Lobe)
**Function**: Verifies the quality of the generated answer

```python
class JudgeBetaLobe:
    def verify_response(self, response, question, context):
        # Factuality check
        factual_score = self.check_factuality(response)
        # Consistency check
        consistency_score = self.check_consistency(response, context)
        # Confidence calculation
        confidence = self.calculate_confidence(factual_score, consistency_score)
        return {"verified": confidence > 0.7, "confidence": confidence}
```

### 4.3 Correction Flow
**Function**: Regenerates the answer when the β-Lobe rejects it

```
α-Lobe generation → β-Lobe verification → rejected → correction prompt → regeneration → re-verification
```
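
The flow above can be sketched as a bounded retry loop. The function name, the wording of the correction prompt, and the `max_retries` limit are assumptions; the verdict dictionary reuses the `verified` / `confidence` keys from the β-Lobe example, and the two stub functions only stand in for the lobes.

```python
from typing import Callable


def answer_with_correction(
    question: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], dict],
    max_retries: int = 2,
) -> dict:
    """Generate, verify, and regenerate until accepted or retries run out."""
    response = generate(question)
    verdict = verify(response, question)
    attempts = 1
    while not verdict["verified"] and attempts <= max_retries:
        # Correction prompt: feed the rejected draft back together with the question.
        prompt = (
            f"{question}\n\nThe previous answer was rejected "
            f"(confidence {verdict['confidence']:.2f}). Revise it:\n{response}"
        )
        response = generate(prompt)
        verdict = verify(response, question)
        attempts += 1
    return {"response": response, "attempts": attempts, **verdict}


def stub_generate(prompt: str) -> str:
    return "revised answer" if "rejected" in prompt else "first draft"


def stub_verify(response: str, question: str) -> dict:
    confidence = 0.9 if response == "revised answer" else 0.4
    return {"verified": confidence > 0.7, "confidence": confidence}


result = answer_with_correction("What is X?", stub_generate, stub_verify)
print(result["response"], result["attempts"])
```

Capping the retries keeps a persistently rejected answer from looping forever; the caller can inspect `verified` to decide what to do with an answer that never passed.
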

---

## 5. Memory System

### 5.1 DendriticMemorySpace
**Function**: Represents knowledge in a cylindrical coordinate system

```
r (radius): abstraction level (0 = concrete, 1 = abstract)
θ (angle):  domain (0-2π)
z (height): time / importance
```

**Benefits**:
- Efficient retrieval of semantically close knowledge
- Associations across domains
- Chronological knowledge management

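
As an illustration of how proximity in this space could be measured, the sketch below converts (r, θ, z) to Cartesian coordinates and uses Euclidean distance, which treats θ = 0 and θ = 2π as the same direction. Whether DendriticMemorySpace uses this metric is not stated here; the metric and the function names are assumptions.

```python
import math

Coord = tuple[float, float, float]  # (r, theta, z)


def to_cartesian(c: Coord) -> tuple[float, float, float]:
    r, theta, z = c
    return (r * math.cos(theta), r * math.sin(theta), z)


def distance(a: Coord, b: Coord) -> float:
    """Euclidean distance between two points given in cylindrical coordinates."""
    return math.dist(to_cartesian(a), to_cartesian(b))


def nearest(query: Coord, tiles: dict[str, Coord], k: int = 1) -> list[str]:
    """Return the ids of the k tiles closest to the query point."""
    return sorted(tiles, key=lambda tid: distance(query, tiles[tid]))[:k]


tiles = {
    "a": (0.5, 0.0, 0.2),
    "b": (0.5, math.pi, 0.2),  # opposite side of the cylinder
    "c": (0.5, 0.1, 0.9),      # same direction as "a", much higher z
}
print(nearest((0.5, 0.05, 0.2), tiles, k=2))
```
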

### 5.2 Automatic Memory Saving
**Conditions**:
- Confidence is at or above the threshold (default 0.6)
- The content is verified
- The knowledge is new or updated

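
Expressed as a predicate, the conditions might look like the sketch below. It assumes all three conditions must hold together, and it detects "new or updated" by comparing a SHA-256 hash of the content with a previously stored hash; both choices, and the function signature, are assumptions rather than documented behavior.

```python
import hashlib


def should_save(
    content: str,
    confidence: float,
    verified: bool,
    stored_hashes: dict[str, str],
    tile_id: str,
    threshold: float = 0.6,
) -> bool:
    """Return True when the content qualifies for automatic saving."""
    if confidence < threshold or not verified:
        return False
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    # A missing hash means a new tile; a different hash means updated content.
    return stored_hashes.get(tile_id) != digest


stored = {"t1": hashlib.sha256(b"old text").hexdigest()}
print(should_save("new text", 0.8, True, stored, "t1"))
print(should_save("old text", 0.8, True, stored, "t1"))
```
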

---

## 6. Continual Learning System (NurseLog System)

### 6.1 Nurse-Log Renewal (Generational Succession)
**Concept**: Just as fallen trees in a forest nourish the next generation, the knowledge of an old model is passed on to a new model

```python
class NurseLogSystem:
    def prepare_succession(self):
        # 1. Extract knowledge from the current model
        knowledge = self.extract_model_knowledge()
        # 2. Build the succession dataset
        dataset = self.create_succession_dataset(knowledge)
        # 3. Fine-tune the new model
        new_model = self.finetune_apprentice(dataset)
        # 4. Switch generations
        self.switch_generation(new_model)
```

### 6.2 Dream Mode
**Function**: Organizes and consolidates knowledge during periods of low load

```python
# Runs automatically every 50 conversations
if conversation_count % 50 == 0:
    await nurse_log_system.enter_dream_mode()
```
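
The every-50-conversations trigger can be sketched as a small counter that invokes a callback. `DreamScheduler` is a hypothetical helper, not part of NurseLog; it increments before checking, so nothing fires at a count of zero, and it does not model the low-load condition.

```python
import asyncio
from typing import Awaitable, Callable


class DreamScheduler:
    def __init__(self, on_dream: Callable[[], Awaitable[None]], interval: int = 50):
        if interval <= 0:
            raise ValueError("interval must be positive")
        self._on_dream = on_dream
        self._interval = interval
        self.conversation_count = 0

    async def record_conversation(self) -> bool:
        """Count one conversation; run the callback on every interval-th one."""
        self.conversation_count += 1
        if self.conversation_count % self._interval == 0:
            await self._on_dream()
            return True
        return False


async def demo() -> list[int]:
    triggered_at: list[int] = []

    async def dream() -> None:
        # Stand-in for nurse_log_system.enter_dream_mode()
        triggered_at.append(scheduler.conversation_count)

    scheduler = DreamScheduler(dream, interval=50)
    for _ in range(120):
        await scheduler.record_conversation()
    return triggered_at


print(asyncio.run(demo()))
```
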

---

## 7. API Endpoints

### 7.1 Inference API
```
POST /api/questions/
WebSocket /api/questions/ws/{session_id}
```

### 7.2 Knowledge Base API
```
GET /api/knowledge/                # list tiles
GET /api/knowledge/{tile_id}       # tile details
PUT /api/knowledge/{tile_id}       # edit (assigns a verification mark)
GET /api/knowledge/stats/summary   # statistics
```

### 7.3 Authentication API
```
POST /api/auth/token               # login
POST /api/auth/signup              # sign-up
GET /api/auth/orcid/authorize      # start ORCID authentication
GET /api/auth/orcid/callback       # ORCID authentication callback
```

### 7.4 System API
```
GET /api/system/status             # system status
GET /api/system/health             # health check
GET /api/system/providers          # supported providers
```

---

## 8. Domain List (63 domains)

### Medicine & Health (8)
medical, cardiology, neurology, oncology, pediatrics, psychiatry, pharmacology, nutrition

### Law (8)
legal, labor_law, corporate_law, intellectual_property, contract_law, tax_law, family_law, criminal_law

### Economics & Finance (6)
economics, finance, accounting, banking, cryptocurrency, real_estate

### Technology (8)
programming, web_development, machine_learning, cybersecurity, cloud_computing, devops, databases, mobile_development

### Science (6)
physics, chemistry, biology, mathematics, environmental_science, astronomy

### Business (6)
business_strategy, marketing, human_resources, project_management, entrepreneurship, supply_chain

### Education (4)
education, language_learning, history, philosophy

### Engineering (6)
mechanical_engineering, electrical_engineering, civil_engineering, architecture, manufacturing, robotics

### Lifestyle (5)
cooking, fitness, travel, gardening, diy

### Arts (5)
music, visual_arts, photography, writing, game_development

### General (1)
general