kofdai committed
Commit f5c4dc5
·
verified ·
1 Parent(s): df759d6

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
---
license: mit
library_name: mlx
tags:
- mlx
- lora
- fine-tuned
- medical
- legal
- programming
- science
- null-ai
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
pipeline_tag: text-generation
---

# NullAI: DeepSeek R1 32B Fine-tuned Model

---

## 日本語 (Japanese)

### NullAIとは

NullAI(ヌルエーアイ)は、多領域における知識推論と検証を統合した、高度な知識基盤システムです。医学、法律、プログラミング、科学など、専門性の高い複数のドメインにわたって、信頼性の高い回答を提供します。

### このモデルについて

このモデルは、DeepSeek R1 Distill Qwen 32Bをベースに、NullAIの多領域知識データセットでファインチューニングされたものです。

**主な特徴:**
- **ベースモデル**: DeepSeek-R1-Distill-Qwen-32B
- **パラメータ数**: 327億
- **量子化**: 4bit MLX量子化(元サイズ61GB → 17.2GB)
- **ファインチューニング手法**: LoRA (Low-Rank Adaptation)
- **訓練データ**: 8,768訓練例 + 975検証例
- **最適化**: Apple Silicon(MPS)向けに最適化

**学習結果:**
- 初期検証ロス: 3.318
- 最終検証ロス: 0.712 (78.5%改善)
- 訓練イテレーション: 1000
- 学習済みトークン: 88,720

### 対応ドメイン

1. **医学 (Medical)**: 臨床知識、診断推論、治療ガイドライン
2. **法律 (Legal)**: 法解釈、判例分析、法的推論
3. **プログラミング (Programming)**: コード生成、デバッグ、アルゴリズム設計
4. **科学 (Science)**: 科学的方法論、研究設計、データ分析
5. **一般知識 (General)**: 幅広い一般的な質問対応

### 使用方法

#### MLXを使用した推論(推奨 - Apple Silicon)

```python
from mlx_lm import load, generate

# モデルのロード
model, tokenizer = load("kofdai/nullai-deepseek-r1-32b")

# 推論の実行
prompt = "心房細動の治療選択肢について説明してください。"
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```

#### Transformersを使用した推論

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# モデルとトークナイザーのロード
model_name = "kofdai/nullai-deepseek-r1-32b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# 推論(temperatureを有効にするためdo_sample=Trueを指定)
prompt = "心房細動の治療選択肢について説明してください。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### システム要件

**最小要件:**
- Python 3.10+
- 20GB以上のRAM
- Apple Silicon(M1/M2/M3)またはNVIDIA GPU

**推奨環境:**
- Apple Silicon Mac (M1 Pro/Max, M2 Pro/Max, M3以上)
- 32GB以上のユニファイドメモリ
- macOS 13.0以上

### インストール

```bash
# MLX環境(Apple Silicon推奨)
pip install mlx mlx-lm

# Transformers環境
pip install transformers torch accelerate
```

---

## English

### About NullAI

NullAI is an advanced knowledge-based system that integrates multi-domain knowledge reasoning and verification. It provides highly reliable answers across specialized domains such as medicine, law, programming, and science.

### About This Model

This model is based on DeepSeek R1 Distill Qwen 32B and fine-tuned on NullAI's multi-domain knowledge dataset.

**Key Features:**
- **Base Model**: DeepSeek-R1-Distill-Qwen-32B
- **Parameters**: 32.7 billion
- **Quantization**: 4-bit MLX quantization (61GB → 17.2GB)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Data**: 8,768 training examples + 975 validation examples
- **Optimization**: Optimized for Apple Silicon (MPS)

**Training Results:**
- Initial Validation Loss: 3.318
- Final Validation Loss: 0.712 (78.5% improvement)
- Training Iterations: 1000
- Tokens Trained: 88,720

### Supported Domains

1. **Medical**: Clinical knowledge, diagnostic reasoning, treatment guidelines
2. **Legal**: Legal interpretation, case analysis, legal reasoning
3. **Programming**: Code generation, debugging, algorithm design
4. **Science**: Scientific methodology, research design, data analysis
5. **General**: Broad general knowledge questions

### Usage

#### Inference with MLX (Recommended - Apple Silicon)

```python
from mlx_lm import load, generate

# Load model
model, tokenizer = load("kofdai/nullai-deepseek-r1-32b")

# Run inference
prompt = "Explain treatment options for atrial fibrillation."
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```
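
Since the repo ships the DeepSeek R1 chat template (see `chat_template.jinja` below), wrapping the question with it before generation typically yields better-behaved outputs than a raw prompt; a minimal sketch, assuming the mlx_lm tokenizer wrapper forwards `apply_chat_template` to the underlying Hugging Face tokenizer:

```python
from mlx_lm import load, generate

model, tokenizer = load("kofdai/nullai-deepseek-r1-32b")

# Render the chat template so the model sees <|User|>/<|Assistant|> markers.
messages = [{"role": "user", "content": "Explain treatment options for atrial fibrillation."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```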

#### Inference with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "kofdai/nullai-deepseek-r1-32b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Inference (do_sample=True so that temperature takes effect)
prompt = "Explain treatment options for atrial fibrillation."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### System Requirements

**Minimum:**
- Python 3.10+
- 20GB+ RAM
- Apple Silicon (M1/M2/M3) or NVIDIA GPU

**Recommended:**
- Apple Silicon Mac (M1 Pro/Max, M2 Pro/Max, M3 or higher)
- 32GB+ unified memory
- macOS 13.0+

### Installation

```bash
# MLX environment (recommended for Apple Silicon)
pip install mlx mlx-lm

# Transformers environment
pip install transformers torch accelerate
```

### Training Details

**Hardware:**
- Platform: Apple Silicon (MPS)
- Memory: ~20GB peak usage
- Training Time: ~60 minutes

**Hyperparameters:**
- Learning Rate: 1e-5
- Batch Size: 1
- Gradient Accumulation: 16 steps (effective batch size 16)
- LoRA Rank: 16
- LoRA Alpha: 32
- Max Sequence Length: 2048
- Optimizer: AdamW

**Performance Metrics:**
- Training Speed: ~0.35-0.40 iterations/sec
- Tokens/sec: ~30-35
- Validation Frequency: Every 100 iterations
- Checkpoint Saves: Every 250 iterations

### License

This model is provided for research and educational purposes. For professional decisions in medicine, law, and other regulated fields, always consult qualified professionals.

### Citation

```bibtex
@misc{nullai-deepseek-r1-32b,
  title={NullAI: DeepSeek R1 32B Fine-tuned Model},
  author={KofDai},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/kofdai/nullai-deepseek-r1-32b}
}
```
TECHNICAL_DECK.md ADDED
# NullAI Technical Deck: DeepSeek R1 32B Fine-tuned Model

---

## 日本語 (Japanese)

### 1. NullAIシステムアーキテクチャ

NullAIは、多領域における知識推論と検証を統合した高度な知識基盤システムです。単なるLLMではなく、構造化された知識管理と多段階検証システムを組み合わせています。

#### 1.1 階層構造

```
┌─────────────────────────────────────┐
│ Layer 5: State Management           │ ← システム状態管理
├─────────────────────────────────────┤
│ Layer 4: Judge System               │ ← 回答の検証・評価
│   ├─ Alpha Lobe (基礎ロジック)      │
│   ├─ Beta Basic (専門知識整合性)    │
│   └─ Beta Advanced (深層推論)       │
├─────────────────────────────────────┤
│ Layer 3: Inference Engine           │ ← DeepSeek R1による推論
├─────────────────────────────────────┤
│ Layer 2: Episodic Binding           │ ← 知識タイルの関連付け
├─────────────────────────────────────┤
│ Layer 1: Spatial Encoding           │ ← 知識の空間配置
└─────────────────────────────────────┘
```

### 2. Knowledge Tile System(知識タイルシステム)

#### 2.1 構造
各知識は、以下の要素を持つタイルとして構造化されます:

```python
{
    "tile_id": "unique_identifier",
    "domain": "medical|legal|programming|science|general",
    "content": "知識の内容",
    "coordinates": {
        "x": float,  # 概念空間上のX座標
        "y": float,  # 概念空間上のY座標
        "z": float   # 概念空間上のZ座標
    },
    "certainty_score": float,  # 0.0-1.0
    "orcid_verified": bool,
    "expert_id": "ORCID_ID",
    "reasoning_chain": [...],
    "citations": [...]
}
```

#### 2.2 空間座標システム
- **X軸**: 抽象度(具体的 ← → 抽象的)
- **Y軸**: 専門性(基礎 ← → 高度専門)
- **Z軸**: 時間性(普遍的 ← → 最新動向)

この3次元空間により、関連知識の効率的な検索と推論が可能になります。

### 3. Judge System(判定システム)

#### 3.1 Alpha Lobe - 基礎ロジック検証
```python
def alpha_lobe_check(reasoning_chain):
    """
    基礎的な論理整合性を検証
    - 矛盾の検出
    - 前提と結論の整合性
    - 推論ステップの妥当性
    """
    return {
        "passed": bool,
        "issues": [],
        "confidence": float
    }
```

#### 3.2 Beta Lobe (Basic) - 専門知識整合性
```python
def beta_lobe_basic(answer, domain_knowledge):
    """
    ドメイン固有の知識との整合性を確認
    - 専門用語の正確性
    - ドメイン常識との一致
    - 標準プロトコルの遵守
    """
    return {
        "domain_consistency": float,
        "terminology_accuracy": float,
        "protocol_compliance": bool
    }
```

#### 3.3 Beta Lobe (Advanced) - 深層推論検証
```python
def beta_lobe_advanced(answer, reasoning_chain, meta_knowledge):
    """
    高度な推論プロセスを検証
    - 多段階推論の妥当性
    - 因果関係の正確性
    - エッジケースの考慮
    """
    return {
        "reasoning_depth": int,
        "causal_accuracy": float,
        "edge_case_coverage": float
    }
```

### 4. ファインチューニング詳細

#### 4.1 トレーニングプロセス

**フェーズ1: データ準備**

データセットの分割(8:1:1の比率):
- 訓練データ: 8,768例
- 検証データ: 975例
- テストデータ: 保留

データ形式:
```
{
    "text": "システムプロンプト + 質問 + 回答",
    "domain": "medical|legal|programming|science|general",
    "difficulty": 1-5,
    "requires_reasoning": bool
}
```

**フェーズ2: モデル量子化**
```bash
# MLXでの4bit量子化
python -m mlx_lm.convert \
    --hf-path deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
    --mlx-path ./deepseek-r1-32b-mlx-4bit \
    --quantize \
    --q-bits 4 \
    --q-group-size 64 \
    --trust-remote-code

# 結果: 61GB → 17.2GB (72%削減)
```

**フェーズ3: LoRAファインチューニング**
```bash
python -m mlx_lm lora \
    --model ./deepseek-r1-32b-mlx-4bit \
    --train \
    --data . \
    --iters 1000 \
    --adapter-path ./adapters \
    --batch-size 1 \
    --learning-rate 1e-5 \
    --steps-per-report 10 \
    --steps-per-eval 100 \
    --save-every 250 \
    --grad-checkpoint \
    --max-seq-length 2048
```

#### 4.2 ハイパーパラメータ選択の根拠

| パラメータ | 値 | 理由 |
|-----------|-----|------|
| Learning Rate | 1e-5 | 大規模モデルの安定した学習のため |
| Batch Size | 1 | メモリ制約下での最大効率 |
| LoRA Rank | 16 | パラメータ効率と品質のバランス |
| LoRA Alpha | 32 | Rank×2の標準設定 |
| Max Seq Length | 2048 | 長文推論に対応 |
| Gradient Checkpointing | True | メモリ使用量削減 |

#### 4.3 学習曲線解析

```
Iteration   Train Loss   Val Loss   Improvement
-----------------------------------------------
0           -            3.318      -
100         1.548        1.583      52.3%
200         0.860        0.934      71.9%
300         0.682        1.113      66.5%
400         1.260        0.741      77.7%
500         0.681        0.832      74.9%
600         0.561        0.885      73.3%
700         0.710        0.897      73.0%
800         0.589        0.621      81.3%
900         0.574        0.705      78.7%
1000        0.583        0.712      78.5%
```

**観察結果:**
- 初期100イテレーションで急激な改善(52.3%)
- 200-500イテレーションで安定した学習
- 800イテレーション付近で最良の検証ロス
- 最終的に78.5%の改善を達成

### 5. 推論最適化

#### 5.1 Apple Silicon (MPS) 最適化

MLXは自動的にApple Siliconへ最適化されます:

- Unified Memory Architecture活用
- Metal Performance Shaders使用
- Neural Engine活用(一部演算)

#### 5.2 推論速度

| メトリクス | 値 |
|----------|-----|
| トークン/秒 | 30-35 |
| イテレーション/秒 | 0.35-0.40 |
| ピークメモリ | 19.9GB |
| 平均レイテンシ | 約2.8秒/イテレーション |

---

## English

### 1. NullAI System Architecture

NullAI is an advanced knowledge-based system that integrates multi-domain knowledge reasoning and verification. It is not just an LLM; it combines structured knowledge management with a multi-stage verification system.

#### 1.1 Hierarchical Structure

```
┌─────────────────────────────────────┐
│ Layer 5: State Management           │ ← System state management
├─────────────────────────────────────┤
│ Layer 4: Judge System               │ ← Answer verification & evaluation
│   ├─ Alpha Lobe (Basic Logic)       │
│   ├─ Beta Basic (Domain Consistency)│
│   └─ Beta Advanced (Deep Reasoning) │
├─────────────────────────────────────┤
│ Layer 3: Inference Engine           │ ← DeepSeek R1 inference
├─────────────────────────────────────┤
│ Layer 2: Episodic Binding           │ ← Knowledge tile association
├─────────────────────────────────────┤
│ Layer 1: Spatial Encoding           │ ← Knowledge spatial placement
└─────────────────────────────────────┘
```

### 2. Knowledge Tile System

#### 2.1 Structure
Each piece of knowledge is structured as a tile with the following elements:

```python
{
    "tile_id": "unique_identifier",
    "domain": "medical|legal|programming|science|general",
    "content": "Knowledge content",
    "coordinates": {
        "x": float,  # X coordinate in concept space
        "y": float,  # Y coordinate in concept space
        "z": float   # Z coordinate in concept space
    },
    "certainty_score": float,  # 0.0-1.0
    "orcid_verified": bool,
    "expert_id": "ORCID_ID",
    "reasoning_chain": [...],
    "citations": [...]
}
```

#### 2.2 Spatial Coordinate System
- **X-axis**: Abstraction level (Concrete ← → Abstract)
- **Y-axis**: Expertise level (Basic ← → Advanced)
- **Z-axis**: Temporality (Universal ← → Latest trends)

This 3D space enables efficient retrieval of, and reasoning over, related knowledge.
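
To make the retrieval idea concrete, here is a minimal sketch of distance-based tile lookup in the concept space; the `concept_distance` and `find_related_tiles` helpers are hypothetical illustrations, not part of a released NullAI API:

```python
import math

def concept_distance(a, b):
    """Euclidean distance between two tiles in (x, y, z) concept space."""
    return math.sqrt(sum((a["coordinates"][k] - b["coordinates"][k]) ** 2
                         for k in ("x", "y", "z")))

def find_related_tiles(query_tile, tiles, k=3):
    """Return the k tiles closest to query_tile in concept space."""
    return sorted(tiles, key=lambda t: concept_distance(query_tile, t))[:k]
```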

### 3. Judge System

#### 3.1 Alpha Lobe - Basic Logic Verification
```python
def alpha_lobe_check(reasoning_chain):
    """
    Verifies basic logical consistency
    - Contradiction detection
    - Premise-conclusion consistency
    - Reasoning step validity
    """
    # Return schema; the types stand in for the values produced at runtime.
    return {
        "passed": bool,
        "issues": [],
        "confidence": float
    }
```

#### 3.2 Beta Lobe (Basic) - Domain Knowledge Consistency
```python
def beta_lobe_basic(answer, domain_knowledge):
    """
    Checks consistency with domain-specific knowledge
    - Terminology accuracy
    - Domain common sense alignment
    - Standard protocol compliance
    """
    # Return schema; the types stand in for the values produced at runtime.
    return {
        "domain_consistency": float,
        "terminology_accuracy": float,
        "protocol_compliance": bool
    }
```

#### 3.3 Beta Lobe (Advanced) - Deep Reasoning Verification
```python
def beta_lobe_advanced(answer, reasoning_chain, meta_knowledge):
    """
    Verifies advanced reasoning processes
    - Multi-step reasoning validity
    - Causal relationship accuracy
    - Edge case consideration
    """
    # Return schema; the types stand in for the values produced at runtime.
    return {
        "reasoning_depth": int,
        "causal_accuracy": float,
        "edge_case_coverage": float
    }
```
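
The three lobes read as stages of one verification pass. A sketch of how they might be chained is below; the early exit on a failed Alpha check, the score aggregation, and the 0.7 threshold are illustrative assumptions, not the documented pipeline:

```python
def judge_answer(answer, reasoning_chain, domain_knowledge, meta_knowledge,
                 threshold=0.7):
    """Run all three lobes and aggregate them into an accept/reject decision."""
    alpha = alpha_lobe_check(reasoning_chain)
    if not alpha["passed"]:
        # Fail fast on basic logical inconsistencies.
        return {"accepted": False, "issues": alpha["issues"]}

    basic = beta_lobe_basic(answer, domain_knowledge)
    advanced = beta_lobe_advanced(answer, reasoning_chain, meta_knowledge)

    # Illustrative aggregation: unweighted mean of the continuous scores.
    score = (alpha["confidence"]
             + basic["domain_consistency"]
             + advanced["causal_accuracy"]) / 3
    return {"accepted": score >= threshold, "score": score}
```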

### 4. Fine-tuning Details

#### 4.1 Training Process

**Phase 1: Data Preparation**

Dataset split (8:1:1 ratio):
- Training data: 8,768 examples
- Validation data: 975 examples
- Test data: Withheld

Data format:
```
{
    "text": "System prompt + Question + Answer",
    "domain": "medical|legal|programming|science|general",
    "difficulty": 1-5,
    "requires_reasoning": bool
}
```
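
Before the LoRA phase can consume these examples, they have to be serialized to disk. A minimal sketch writing the `{"text": ...}` JSONL layout that mlx_lm's LoRA trainer reads from the `--data` directory; the two records are stand-ins for the real 8,768/975 splits:

```python
import json

# Stand-in examples; the real dataset holds 8,768 train / 975 valid records.
train_examples = [{"text": "System prompt + Question + Answer", "domain": "medical"}]
valid_examples = [{"text": "System prompt + Question + Answer", "domain": "legal"}]

def write_split(examples, path):
    """Write one {"text": ...} JSON object per line, as mlx_lm expects."""
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps({"text": ex["text"]}, ensure_ascii=False) + "\n")

write_split(train_examples, "train.jsonl")
write_split(valid_examples, "valid.jsonl")
```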

**Phase 2: Model Quantization**
```bash
# 4-bit quantization with MLX
python -m mlx_lm.convert \
    --hf-path deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
    --mlx-path ./deepseek-r1-32b-mlx-4bit \
    --quantize \
    --q-bits 4 \
    --q-group-size 64 \
    --trust-remote-code

# Result: 61GB → 17.2GB (72% reduction)
```
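
The same conversion can also be driven from Python; a sketch assuming mlx_lm's `convert` function with keyword arguments mirroring the CLI flags (argument names can differ slightly across mlx-lm versions):

```python
from mlx_lm import convert

# Quantize the base model to 4-bit with group size 64, as in the CLI call above.
convert(
    hf_path="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    mlx_path="./deepseek-r1-32b-mlx-4bit",
    quantize=True,
    q_bits=4,
    q_group_size=64,
)
```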

**Phase 3: LoRA Fine-tuning**
```bash
python -m mlx_lm lora \
    --model ./deepseek-r1-32b-mlx-4bit \
    --train \
    --data . \
    --iters 1000 \
    --adapter-path ./adapters \
    --batch-size 1 \
    --learning-rate 1e-5 \
    --steps-per-report 10 \
    --steps-per-eval 100 \
    --save-every 250 \
    --grad-checkpoint \
    --max-seq-length 2048
```

#### 4.2 Hyperparameter Selection Rationale

| Parameter | Value | Reasoning |
|-----------|-------|-----------|
| Learning Rate | 1e-5 | Stable learning for large models |
| Batch Size | 1 | Maximum efficiency under memory constraints |
| LoRA Rank | 16 | Balance between parameter efficiency and quality |
| LoRA Alpha | 32 | Standard setting of Rank×2 |
| Max Seq Length | 2048 | Support for long-form reasoning |
| Gradient Checkpointing | True | Reduced memory usage |
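
The rank/alpha pairing follows the standard LoRA scaling rule: the low-rank update is applied as W + (α/r)·BA, so r = 16 with α = 32 fixes the scale at 2.0. A toy numeric sketch (the width is shrunk for illustration; the model's real `hidden_size` is 5120):

```python
import numpy as np

r, alpha = 16, 32                   # LoRA rank and alpha from the table above
d = 512                             # toy width; the real hidden_size is 5120

W = np.random.randn(d, d)           # frozen base weight (stand-in)
A = np.random.randn(r, d) * 0.01    # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection, zero at init

W_eff = W + (alpha / r) * (B @ A)   # alpha/r = 2.0 is the effective scale
```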

#### 4.3 Learning Curve Analysis

```
Iteration   Train Loss   Val Loss   Improvement
-----------------------------------------------
0           -            3.318      -
100         1.548        1.583      52.3%
200         0.860        0.934      71.9%
300         0.682        1.113      66.5%
400         1.260        0.741      77.7%
500         0.681        0.832      74.9%
600         0.561        0.885      73.3%
700         0.710        0.897      73.0%
800         0.589        0.621      81.3%
900         0.574        0.705      78.7%
1000        0.583        0.712      78.5%
```
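
The Improvement column is the relative drop from the initial validation loss of 3.318; the final row, for example, works out as:

```python
initial, final = 3.318, 0.712
print(f"{(1 - final / initial) * 100:.1f}%")  # -> 78.5%
```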

**Observations:**
- Rapid improvement in the first 100 iterations (52.3%)
- Stable learning from iterations 200-500
- Best validation loss around iteration 800
- Final improvement of 78.5% achieved

### 5. Inference Optimization

#### 5.1 Apple Silicon (MPS) Optimization

MLX automatically optimizes for Apple Silicon:

- Unified Memory Architecture utilization
- Metal Performance Shaders usage
- Neural Engine utilization (partial operations)

#### 5.2 Inference Speed

| Metric | Value |
|--------|-------|
| Tokens/sec | 30-35 |
| Iterations/sec | 0.35-0.40 |
| Peak Memory | 19.9GB |
| Average Latency | ~2.8s/iteration |

### 6. Model Capabilities by Domain

**Medical Domain:**
- Diagnostic reasoning pathways
- Treatment protocol recommendations
- Drug interaction analysis
- Clinical guideline interpretation

**Legal Domain:**
- Legal precedent analysis
- Statutory interpretation
- Contract clause analysis
- Regulatory compliance guidance

**Programming Domain:**
- Code generation and optimization
- Bug detection and debugging
- Algorithm design and analysis
- Software architecture recommendations

**Scientific Domain:**
- Research methodology design
- Statistical analysis guidance
- Experimental design optimization
- Data interpretation support

**General Domain:**
- Broad knowledge retrieval
- Multi-domain reasoning
- Explanation generation
- Knowledge synthesis

### 7. Limitations and Future Work

**Current Limitations:**
- Requires significant RAM (20GB+) for inference
- Higher response latency on non-optimized hardware
- Domain-specific accuracy varies

**Future Improvements:**
- Further quantization experiments (3-bit, 2-bit)
- Domain-specific adapter modules
- Real-time ORCID verification integration
- Expanded training dataset across domains
- Multilingual support expansion
chat_template.jinja ADDED
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='') %}{%- for message in messages %}{%- if message['role'] == 'system' %}{% set ns.system_prompt = message['content'] %}{%- endif %}{%- endfor %}{{bos_token}}{{ns.system_prompt}}{%- for message in messages %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{{'<|User|>' + message['content']}}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is none %}{%- set ns.is_tool = false -%}{%- for tool in message['tool_calls']%}{%- if not ns.is_first %}{{'<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{%- set ns.is_first = true -%}{%- else %}{{'\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{{'<|tool▁calls▁end|><|end▁of▁sentence|>'}}{%- endif %}{%- endfor %}{%- endif %}{%- if message['role'] == 'assistant' and message['content'] is not none %}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>' + message['content'] + '<|end▁of▁sentence|>'}}{%- set ns.is_tool = false -%}{%- else %}{% set content = message['content'] %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{{'<|Assistant|>' + content + '<|end▁of▁sentence|>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<|tool▁outputs▁begin|><|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- set ns.is_output_first = false %}{%- else %}{{'\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool %}{{'<|tool▁outputs▁end|>'}}{% endif %}{% if add_generation_prompt and not ns.is_tool %}{{'<|Assistant|><think>\n'}}{% endif %}
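
To inspect the prompt this template renders, the tokenizer's chat-template API can be used; a short sketch (the messages are illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("kofdai/nullai-deepseek-r1-32b")
messages = [
    {"role": "system", "content": "You are NullAI."},
    {"role": "user", "content": "Explain treatment options for atrial fibrillation."},
]
# With add_generation_prompt=True the template ends the string with
# '<|Assistant|><think>\n', cueing the model's reasoning block.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```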
config.json ADDED
{
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151643,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 27648,
  "max_position_embeddings": 131072,
  "max_window_layers": 64,
  "model_type": "qwen2",
  "num_attention_heads": 40,
  "num_hidden_layers": 64,
  "num_key_value_heads": 8,
  "quantization": {
    "group_size": 64,
    "bits": 4,
    "mode": "affine"
  },
  "quantization_config": {
    "group_size": 64,
    "bits": 4,
    "mode": "affine"
  },
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.43.1",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 152064
}
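
A couple of quantities implied by these fields (plain arithmetic, not extra configuration knobs):

```python
hidden_size, n_heads, n_kv_heads = 5120, 40, 8  # from config.json above

head_dim = hidden_size // n_heads   # 128 dims per attention head
gqa_group = n_heads // n_kv_heads   # 5 query heads share each KV head (GQA)
print(head_dim, gqa_group)
```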
generation_config.json ADDED
{
  "_from_model_config": true,
  "bos_token_id": 151646,
  "eos_token_id": 151643,
  "do_sample": true,
  "temperature": 0.6,
  "top_p": 0.95,
  "transformers_version": "4.39.3"
}
model-00001-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:b8945fcfa6f506b8b62f949c76f58ceb5e99b79fd3e28c964c9e79fbadd98f9e
size 5366583057
model-00002-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:2098646673c2329a65f6ca6efd7e4ec15d5fbd81be2a21b339d0ad39db583376
size 5335713350
model-00003-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:66092cfcae6835603479a186cd3b72ad50888b93504533f515459c944bc43cde
size 5366642308
model-00004-of-00004.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:b870547218faf53c714092e2cbe1ee94f898fae6a09afbda9aa63b823fcd63a3
size 2362540963
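
The four shard sizes above add up to the ~17.2 GB quantized footprint quoted in the README; a quick check:

```python
shard_bytes = [5366583057, 5335713350, 5366642308, 2362540963]
total = sum(shard_bytes)            # 18,431,479,678 bytes
print(f"{total / 2**30:.1f} GiB")   # -> 17.2 GiB
```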
model.safetensors.index.json ADDED
The diff for this file is too large to render.
special_tokens_map.json ADDED
{
  "bos_token": {
    "content": "<|begin▁of▁sentence|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|end▁of▁sentence|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|end▁of▁sentence|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:e20ddafc659ba90242154b55275402edeca0715e5dbb30f56815a4ce081f4893
size 11422778
tokenizer_config.json ADDED
{
  "add_bos_token": true,
  "add_eos_token": false,
  "add_prefix_space": null,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|end▁of▁sentence|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|User|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151645": {
      "content": "<|Assistant|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151646": {
      "content": "<|begin▁of▁sentence|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151647": {
      "content": "<|EOT|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151648": {
      "content": "<think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151649": {
      "content": "</think>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151650": {
      "content": "<|quad_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151651": {
      "content": "<|quad_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151652": {
      "content": "<|vision_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151653": {
      "content": "<|vision_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151654": {
      "content": "<|vision_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151655": {
      "content": "<|image_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151656": {
      "content": "<|video_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151657": {
      "content": "<tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151658": {
      "content": "</tool_call>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151659": {
      "content": "<|fim_prefix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151660": {
      "content": "<|fim_middle|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151661": {
      "content": "<|fim_suffix|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151662": {
      "content": "<|fim_pad|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151663": {
      "content": "<|repo_name|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "151664": {
      "content": "<|file_sep|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "bos_token": "<|begin▁of▁sentence|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|end▁of▁sentence|>",
  "extra_special_tokens": {},
  "legacy": true,
  "model_max_length": 16384,
  "pad_token": "<|end▁of▁sentence|>",
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizerFast",
  "unk_token": null,
  "use_default_system_prompt": false
}
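
The added-token IDs in `added_tokens_decoder` can be verified against the loaded tokenizer; a small sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("kofdai/nullai-deepseek-r1-32b")
for tok in ("<think>", "</think>", "<|User|>", "<|Assistant|>"):
    print(tok, tokenizer.convert_tokens_to_ids(tok))
# Expected IDs per the map above: 151648, 151649, 151644, 151645
```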