upload imatrix log file has cossim per layer
- README.md +2 -2
- logs/imatrix-GLM-4.7-BF16.log +667 -0
README.md CHANGED
@@ -19,12 +19,12 @@ Currently cooking this now!
 
 - [x] download bf16 safetensors https://huggingface.co/zai-org/GLM-4.7
 - [x] use llama.cpp/convert_hf_to_gguf.py to create bf16 GGUF
-- [
+- [x] calculate imatrix and upload to HF first so others can use as desired
 - [ ] cook Q8_0 and test perplexity of BF16 and Q8_0 for baseline data
 - [ ] cook IQ5_K with full q8_0 attn/shexp/first 3 dense layers and test
 - [ ] upload IQ5_K if all looking good
 - [ ] continue with smaller quants
-- [ ]
+- [ ] check if any folks open discussions with desired RAM/VRAM breakpoints
 
 ## `ik_llama.cpp` imatrix Quantizations of zai-org/GLM-4.7
 *NOTE* `ik_llama.cpp` can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc if you want to try it out before downloading my quants.
logs/imatrix-GLM-4.7-BF16.log ADDED
@@ -0,0 +1,667 @@
model=/mnt/data/models/ubergarm/GLM-4.7-GGUF/GLM-160x21B-4.7-BF16-00001-of-00015.gguf

numactl -N ${SOCKET} -m ${SOCKET} \
./build/bin/llama-imatrix \
--model "$model"\
-f ubergarm-imatrix-calibration-corpus-v02.txt \
-o /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat \
--no-fused-moe \
--no-fused-up-gate \
--no-fused-mul-multiadd \
--ctx-size 512 \
-ub 4096 -b 4096 \
--threads 96 \
--threads-batch 128 \
--no-mmap \
--numa numactl \
--verbosity 1 \
--layer-similarity

CPU: using device CPU - 0 MiB free
llama_model_loader: additional 14 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 49 key-value pairs and 1761 tensors from /mnt/data/models/ubergarm/GLM-4.7-GGUF/GLM-160x21B-4.7-BF16-00001-of-00015.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = glm4moe
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 3: general.name str = GLM 4.7
llama_model_loader: - kv 4: general.version str = 4.7
llama_model_loader: - kv 5: general.basename str = GLM
llama_model_loader: - kv 6: general.size_label str = 160x21B
llama_model_loader: - kv 7: general.license str = mit
llama_model_loader: - kv 8: general.tags arr[str,1] = ["text-generation"]
llama_model_loader: - kv 9: general.languages arr[str,2] = ["en", "zh"]
llama_model_loader: - kv 10: glm4moe.block_count u32 = 93
llama_model_loader: - kv 11: glm4moe.context_length u32 = 202752
llama_model_loader: - kv 12: glm4moe.embedding_length u32 = 5120
llama_model_loader: - kv 13: glm4moe.feed_forward_length u32 = 12288
llama_model_loader: - kv 14: glm4moe.attention.head_count u32 = 96
llama_model_loader: - kv 15: glm4moe.attention.head_count_kv u32 = 8
llama_model_loader: - kv 16: glm4moe.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 17: glm4moe.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 18: glm4moe.expert_used_count u32 = 8
llama_model_loader: - kv 19: glm4moe.expert_group_count u32 = 1
llama_model_loader: - kv 20: glm4moe.expert_group_used_count u32 = 1
llama_model_loader: - kv 21: glm4moe.attention.key_length u32 = 128
llama_model_loader: - kv 22: glm4moe.attention.value_length u32 = 128
llama_model_loader: - kv 23: general.file_type u32 = 32
llama_model_loader: - kv 24: glm4moe.rope.dimension_count u32 = 64
llama_model_loader: - kv 25: glm4moe.expert_count u32 = 160
llama_model_loader: - kv 26: glm4moe.expert_feed_forward_length u32 = 1536
llama_model_loader: - kv 27: glm4moe.expert_shared_count u32 = 1
llama_model_loader: - kv 28: glm4moe.leading_dense_block_count u32 = 3
llama_model_loader: - kv 29: glm4moe.expert_gating_func u32 = 2
llama_model_loader: - kv 30: glm4moe.expert_weights_scale f32 = 2.500000
llama_model_loader: - kv 31: glm4moe.expert_weights_norm bool = true
llama_model_loader: - kv 32: glm4moe.nextn_predict_layers u32 = 1
llama_model_loader: - kv 33: general.quantization_version u32 = 2
llama_model_loader: - kv 34: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 35: tokenizer.ggml.pre str = glm4
llama_model_loader: - kv 36: tokenizer.ggml.tokens arr[str,151552] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 37: tokenizer.ggml.token_type arr[i32,151552] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 38: tokenizer.ggml.merges arr[str,318088] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv 39: tokenizer.ggml.eos_token_id u32 = 151329
llama_model_loader: - kv 40: tokenizer.ggml.padding_token_id u32 = 151329
llama_model_loader: - kv 41: tokenizer.ggml.bos_token_id u32 = 151331
llama_model_loader: - kv 42: tokenizer.ggml.eot_token_id u32 = 151336
llama_model_loader: - kv 43: tokenizer.ggml.unknown_token_id u32 = 151329
llama_model_loader: - kv 44: tokenizer.ggml.eom_token_id u32 = 151338
llama_model_loader: - kv 45: tokenizer.chat_template str = [gMASK]<sop>\n{%- if tools -%}\n<|syste...
llama_model_loader: - kv 46: split.no u16 = 0
llama_model_loader: - kv 47: split.count u16 = 15
llama_model_loader: - kv 48: split.tensors.count i32 = 1761
llama_model_loader: - type f32: 835 tensors
llama_model_loader: - type bf16: 926 tensors
load: special_eot_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special_eom_id is not in special_eog_ids - the tokenizer config may be incorrect
load: printing all EOG tokens:
load: - 151329 ('<|endoftext|>')
load: - 151336 ('<|user|>')
load: - 151338 ('<|observation|>')
load: special tokens cache size = 36
load: token to piece cache size = 0.9713 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = glm4moe
llm_load_print_meta: n_ctx_train = 202752
llm_load_print_meta: n_embd = 5120
llm_load_print_meta: n_layer = 93
llm_load_print_meta: n_head = 96
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_rot = 64
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_swa_pattern = 1
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 12
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 12288
llm_load_print_meta: n_expert = 160
llm_load_print_meta: n_expert_used = 8
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 2
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 1000000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 202752
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: model type = 355B.A32B
llm_load_print_meta: model ftype = BF16
llm_load_print_meta: model params = 358.338 B
llm_load_print_meta: model size = 667.598 GiB (16.003 BPW)
llm_load_print_meta: repeating layers = 664.707 GiB (16.003 BPW, 356.786 B parameters)
llm_load_print_meta: general.name = GLM 4.7
print_info: vocab type = BPE
print_info: n_vocab = 151552
print_info: n_merges = 318088
print_info: BOS token = 151331 '[gMASK]'
print_info: EOS token = 151329 '<|endoftext|>'
print_info: EOT token = 151336 '<|user|>'
print_info: EOM token = 151338 '<|observation|>'
print_info: UNK token = 151329 '<|endoftext|>'
print_info: PAD token = 151329 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151347 '<|code_prefix|>'
print_info: FIM SUF token = 151349 '<|code_suffix|>'
print_info: FIM MID token = 151348 '<|code_middle|>'
print_info: EOG token = 151329 '<|endoftext|>'
print_info: EOG token = 151336 '<|user|>'
print_info: EOG token = 151338 '<|observation|>'
print_info: max token length = 1024
llm_load_tensors: ggml ctx size = 0.72 MiB
model has unused tensor blk.92.attn_norm.weight (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.attn_q.weight (size = 125829120 bytes) -- ignoring
model has unused tensor blk.92.attn_k.weight (size = 10485760 bytes) -- ignoring
model has unused tensor blk.92.attn_v.weight (size = 10485760 bytes) -- ignoring
model has unused tensor blk.92.attn_q.bias (size = 49152 bytes) -- ignoring
model has unused tensor blk.92.attn_k.bias (size = 4096 bytes) -- ignoring
model has unused tensor blk.92.attn_v.bias (size = 4096 bytes) -- ignoring
model has unused tensor blk.92.attn_output.weight (size = 125829120 bytes) -- ignoring
model has unused tensor blk.92.attn_q_norm.weight (size = 512 bytes) -- ignoring
model has unused tensor blk.92.attn_k_norm.weight (size = 512 bytes) -- ignoring
model has unused tensor blk.92.post_attention_norm.weight (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.ffn_gate_inp.weight (size = 3276800 bytes) -- ignoring
model has unused tensor blk.92.exp_probs_b.bias (size = 640 bytes) -- ignoring
model has unused tensor blk.92.ffn_gate_exps.weight (size = 2516582400 bytes) -- ignoring
model has unused tensor blk.92.ffn_down_exps.weight (size = 2516582400 bytes) -- ignoring
model has unused tensor blk.92.ffn_up_exps.weight (size = 2516582400 bytes) -- ignoring
model has unused tensor blk.92.ffn_gate_shexp.weight (size = 15728640 bytes) -- ignoring
model has unused tensor blk.92.ffn_down_shexp.weight (size = 15728640 bytes) -- ignoring
model has unused tensor blk.92.ffn_up_shexp.weight (size = 15728640 bytes) -- ignoring
model has unused tensor blk.92.nextn.eh_proj.weight (size = 104857600 bytes) -- ignoring
model has unused tensor blk.92.nextn.embed_tokens.weight (size = 1551892480 bytes) -- ignoring
model has unused tensor blk.92.nextn.enorm.weight (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.nextn.hnorm.weight (size = 20480 bytes) -- ignoring
model has unused tensor blk.92.nextn.shared_head_head.weight (size = 1551892480 bytes) -- ignoring
model has unused tensor blk.92.nextn.shared_head_norm.weight (size = 20480 bytes) -- ignoring
llm_load_tensors: offloading 0 repeating layers to GPU
llm_load_tensors: offloaded 0/94 layers to GPU
llm_load_tensors: CPU buffer size = 673051.91 MiB
....................................................................................................
llama_new_context_with_model: n_ctx = 512
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 1
llama_new_context_with_model: attn_max_b = 0
llama_new_context_with_model: fused_moe = 0
llama_new_context_with_model: grouped er = 0
llama_new_context_with_model: fused_up_gate = 0
llama_new_context_with_model: fused_mmad = 0
llama_new_context_with_model: rope_cache = 0
llama_new_context_with_model: graph_reuse = 0
llama_new_context_with_model: k_cache_hadam = 0
llama_new_context_with_model: split_mode_graph_scheduling = 0
llama_new_context_with_model: ser = -1, 0
llama_new_context_with_model: freq_base = 1000000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CPU KV buffer size = 184.00 MiB
llama_new_context_with_model: KV self size = 184.00 MiB, K (f16): 92.00 MiB, V (f16): 92.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.58 MiB
llama_new_context_with_model: CPU compute buffer size = 306.00 MiB
llama_new_context_with_model: graph nodes = 4634
llama_new_context_with_model: graph splits = 1
XXXXXXXXXXXXXXXXXXXXX Setting only active experts offload

| 195 |
+
system_info: n_threads = 96 (n_threads_batch = 128) / 512 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
|
| 196 |
+
compute_imatrix: tokenizing the input ..
|
| 197 |
+
compute_imatrix: tokenization took 508.555 ms
|
| 198 |
+
compute_imatrix: computing over 814 chunks with batch_size 512
|
| 199 |
+
compute_imatrix: 9.95 seconds per pass - ETA 2 hours 15.02 minutes
|
| 200 |
+
======================================= HAVE_FANCY_SIMD is defined
|
| 201 |
+
[1]17.5129,[2]6.9568,[3]4.5205,[4]3.2674,[5]2.6460,[6]2.2556,[7]2.0217,[8]1.8697,[9]1.8579,
|
| 202 |
+
save_imatrix: entry ' blk.73.ffn_gate_exps.weight' has partial data (98.75%) 2 out of 160 experts are missing data Storing **but be aware**
|
| 203 |
+
save_imatrix: entry ' blk.73.ffn_up_exps.weight' has partial data (98.75%) 2 out of 160 experts are missing data Storing **but be aware**
|
| 204 |
+
save_imatrix: entry ' blk.56.ffn_down_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 205 |
+
save_imatrix: entry ' blk.56.ffn_gate_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 206 |
+
save_imatrix: entry ' blk.56.ffn_up_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 207 |
+
save_imatrix: entry ' blk.48.ffn_gate_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 208 |
+
save_imatrix: entry ' blk.6.ffn_gate_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 209 |
+
save_imatrix: entry ' blk.6.ffn_down_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 210 |
+
save_imatrix: entry ' blk.6.ffn_up_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 211 |
+
save_imatrix: entry ' blk.73.ffn_down_exps.weight' has partial data (98.75%) 2 out of 160 experts are missing data Storing **but be aware**
|
| 212 |
+
save_imatrix: entry ' blk.48.ffn_down_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 213 |
+
save_imatrix: entry ' blk.48.ffn_up_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
|
| 214 |
+
|
| 215 |
+
save_imatrix: stored collected data after 10 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 216 |
+
[10]1.7776,[11]1.8889,[12]1.9858,[13]2.0575,[14]2.1264,[15]2.0255,[16]1.9435,[17]1.8890,[18]1.8325,[19]1.7753,
|
| 217 |
+
save_imatrix: stored collected data after 20 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 218 |
+
[20]1.7407,[21]1.6968,[22]1.6679,[23]1.6358,[24]1.6059,[25]1.5762,[26]1.6561,[27]1.7531,[28]1.8685,[29]1.8406,
|
| 219 |
+
save_imatrix: stored collected data after 30 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 220 |
+
[30]1.8213,[31]1.8366,[32]1.8313,[33]1.9053,[34]1.8840,[35]1.8786,[36]1.8690,[37]1.8635,[38]1.8952,[39]1.9113,
|
| 221 |
+
save_imatrix: stored collected data after 40 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 222 |
+
[40]1.9002,[41]1.9285,[42]1.9373,[43]1.9526,[44]1.9645,[45]1.9712,[46]1.9569,[47]1.9671,[48]1.9661,[49]1.9672,
|
| 223 |
+
save_imatrix: stored collected data after 50 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 224 |
+
[50]1.9573,[51]1.9761,[52]1.9892,[53]1.9766,[54]1.9847,[55]1.9872,[56]1.9924,[57]1.9852,[58]2.0341,[59]2.0846,
|
| 225 |
+
save_imatrix: stored collected data after 60 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 226 |
+
[60]2.1336,[61]2.1480,[62]2.1927,[63]2.2231,[64]2.2163,[65]2.2162,[66]2.2193,[67]2.2055,[68]2.2219,[69]2.2626,
|
| 227 |
+
save_imatrix: stored collected data after 70 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 228 |
+
[70]2.3155,[71]2.3455,[72]2.3846,[73]2.4166,[74]2.4360,[75]2.4648,[76]2.4808,[77]2.5090,[78]2.5058,[79]2.4882,
|
| 229 |
+
save_imatrix: stored collected data after 80 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 230 |
+
[80]2.4855,[81]2.4882,[82]2.5159,[83]2.5581,[84]2.5770,[85]2.5819,[86]2.5855,[87]2.5763,[88]2.5781,[89]2.5664,
|
| 231 |
+
save_imatrix: stored collected data after 90 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 232 |
+
[90]2.5561,[91]2.5523,[92]2.5356,[93]2.5154,[94]2.5440,[95]2.5928,[96]2.6132,[97]2.6158,[98]2.6236,[99]2.6440,
|
| 233 |
+
save_imatrix: stored collected data after 100 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 234 |
+
[100]2.6613,[101]2.6685,[102]2.6709,[103]2.7057,[104]2.7300,[105]2.7232,[106]2.7660,[107]2.8099,[108]2.8406,[109]2.8811,
|
| 235 |
+
save_imatrix: stored collected data after 110 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 236 |
+
[110]2.9123,[111]2.9467,[112]2.9798,[113]2.9740,[114]2.9899,[115]3.0050,[116]3.0137,[117]3.0245,[118]3.0569,[119]3.0943,
|
| 237 |
+
save_imatrix: stored collected data after 120 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 238 |
+
[120]3.1353,[121]3.1299,[122]3.1032,[123]3.0869,[124]3.1069,[125]3.0955,[126]3.0717,[127]3.0709,[128]3.0689,[129]3.0750,
|
| 239 |
+
save_imatrix: stored collected data after 130 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 240 |
+
[130]3.0827,[131]3.0998,[132]3.1160,[133]3.1226,[134]3.1614,[135]3.1798,[136]3.1540,[137]3.1290,[138]3.1061,[139]3.0821,
|
| 241 |
+
save_imatrix: stored collected data after 140 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 242 |
+
[140]3.0908,[141]3.1033,[142]3.1438,[143]3.1734,[144]3.1791,[145]3.2029,[146]3.2291,[147]3.2515,[148]3.2845,[149]3.3145,
|
| 243 |
+
save_imatrix: stored collected data after 150 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 244 |
+
[150]3.3451,[151]3.3640,[152]3.3849,[153]3.4019,[154]3.4113,[155]3.4073,[156]3.4248,[157]3.4352,[158]3.4461,[159]3.4588,
|
| 245 |
+
save_imatrix: stored collected data after 160 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 246 |
+
[160]3.4724,[161]3.4748,[162]3.4794,[163]3.4943,[164]3.4998,[165]3.5079,[166]3.5212,[167]3.5227,[168]3.5252,[169]3.5323,
|
| 247 |
+
save_imatrix: stored collected data after 170 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 248 |
+
[170]3.5418,[171]3.5468,[172]3.5522,[173]3.5591,[174]3.5774,[175]3.5892,[176]3.5948,[177]3.6013,[178]3.6183,[179]3.6066,
|
| 249 |
+
save_imatrix: stored collected data after 180 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 250 |
+
[180]3.6147,[181]3.6290,[182]3.6530,[183]3.6691,[184]3.6754,[185]3.6775,[186]3.6758,[187]3.6737,[188]3.6741,[189]3.6748,
|
| 251 |
+
save_imatrix: stored collected data after 190 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 252 |
+
[190]3.6751,[191]3.6712,[192]3.6937,[193]3.7245,[194]3.7498,[195]3.7776,[196]3.7992,[197]3.8353,[198]3.8457,[199]3.8632,
|
| 253 |
+
save_imatrix: stored collected data after 200 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 254 |
+
[200]3.8543,[201]3.8694,[202]3.8607,[203]3.8371,[204]3.8146,[205]3.8352,[206]3.8499,[207]3.8590,[208]3.8685,[209]3.8886,
|
| 255 |
+
save_imatrix: stored collected data after 210 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 256 |
+
[210]3.9044,[211]3.9209,[212]3.9405,[213]3.9567,[214]3.9581,[215]3.9352,[216]3.9111,[217]3.8873,[218]3.8636,[219]3.8407,
|
| 257 |
+
save_imatrix: stored collected data after 220 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 258 |
+
[220]3.8241,[221]3.8216,[222]3.8120,[223]3.8085,[224]3.7952,[225]3.7765,[226]3.7761,[227]3.7828,[228]3.8044,[229]3.8287,
|
| 259 |
+
save_imatrix: stored collected data after 230 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 260 |
+
[230]3.8397,[231]3.8627,[232]3.8583,[233]3.8832,[234]3.9142,[235]3.9274,[236]3.9423,[237]3.9476,[238]3.9719,[239]4.0009,
|
| 261 |
+
save_imatrix: stored collected data after 240 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 262 |
+
[240]3.9975,[241]4.0078,[242]4.0223,[243]4.0432,[244]4.0635,[245]4.0782,[246]4.0913,[247]4.1018,[248]4.0917,[249]4.1182,
|
| 263 |
+
save_imatrix: stored collected data after 250 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 264 |
+
[250]4.1322,[251]4.1512,[252]4.1620,[253]4.1670,[254]4.1736,[255]4.1769,[256]4.1893,[257]4.1941,[258]4.2055,[259]4.2207,
|
| 265 |
+
save_imatrix: stored collected data after 260 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 266 |
+
[260]4.2305,[261]4.2418,[262]4.2543,[263]4.2700,[264]4.2824,[265]4.2997,[266]4.2846,[267]4.2893,[268]4.2945,[269]4.3088,
|
| 267 |
+
save_imatrix: stored collected data after 270 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 268 |
+
[270]4.3304,[271]4.3455,[272]4.3672,[273]4.3680,[274]4.3670,[275]4.3766,[276]4.3829,[277]4.3999,[278]4.4148,[279]4.4279,
|
| 269 |
+
save_imatrix: stored collected data after 280 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 270 |
+
[280]4.4372,[281]4.4395,[282]4.4538,[283]4.4654,[284]4.4684,[285]4.4848,[286]4.4863,[287]4.4904,[288]4.4993,[289]4.4958,
|
| 271 |
+
save_imatrix: stored collected data after 290 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 272 |
+
[290]4.5076,[291]4.5134,[292]4.5196,[293]4.5372,[294]4.5508,[295]4.5653,[296]4.5830,[297]4.5879,[298]4.6079,[299]4.6212,
|
| 273 |
+
save_imatrix: stored collected data after 300 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 274 |
+
[300]4.6382,[301]4.6506,[302]4.6647,[303]4.6700,[304]4.6895,[305]4.6978,[306]4.7023,[307]4.7108,[308]4.7293,[309]4.7394,
|
| 275 |
+
save_imatrix: stored collected data after 310 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 276 |
+
[310]4.7444,[311]4.7529,[312]4.7618,[313]4.7761,[314]4.7839,[315]4.7935,[316]4.8056,[317]4.8184,[318]4.8331,[319]4.8379,
|
| 277 |
+
save_imatrix: stored collected data after 320 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 278 |
+
[320]4.8422,[321]4.8356,[322]4.8464,[323]4.8297,[324]4.8477,[325]4.8512,[326]4.8283,[327]4.8413,[328]4.8518,[329]4.8579,
|
| 279 |
+
save_imatrix: stored collected data after 330 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 280 |
+
[330]4.8640,[331]4.8633,[332]4.8671,[333]4.8863,[334]4.8834,[335]4.8949,[336]4.9110,[337]4.9204,[338]4.9253,[339]4.9127,
|
| 281 |
+
save_imatrix: stored collected data after 340 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 282 |
+
[340]4.9237,[341]4.9406,[342]4.9567,[343]4.9745,[344]4.9973,[345]5.0267,[346]5.0290,[347]5.0303,[348]5.0331,[349]5.0414,
|
| 283 |
+
save_imatrix: stored collected data after 350 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 284 |
+
[350]5.0554,[351]5.0758,[352]5.0761,[353]5.0728,[354]5.0839,[355]5.0801,[356]5.0811,[357]5.0798,[358]5.0753,[359]5.0797,
|
| 285 |
+
save_imatrix: stored collected data after 360 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 286 |
+
[360]5.0912,[361]5.0878,[362]5.0861,[363]5.0686,[364]5.0505,[365]5.0334,[366]5.0192,[367]4.9996,[368]4.9826,[369]4.9654,
|
| 287 |
+
save_imatrix: stored collected data after 370 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 288 |
+
[370]4.9512,[371]4.9353,[372]4.9188,[373]4.9057,[374]4.8921,[375]4.8736,[376]4.8610,[377]4.8467,[378]4.8301,[379]4.8157,
|
| 289 |
+
save_imatrix: stored collected data after 380 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 290 |
+
[380]4.8141,[381]4.7993,[382]4.7929,[383]4.7967,[384]4.7841,[385]4.7779,[386]4.7666,[387]4.7479,[388]4.7307,[389]4.7229,
|
| 291 |
+
save_imatrix: stored collected data after 390 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 292 |
+
[390]4.7128,[391]4.6979,[392]4.6796,[393]4.6613,[394]4.6594,[395]4.6573,[396]4.6530,[397]4.6422,[398]4.6436,[399]4.6429,
|
| 293 |
+
save_imatrix: stored collected data after 400 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 294 |
+
[400]4.6261,[401]4.6110,[402]4.6038,[403]4.5899,[404]4.5786,[405]4.5680,[406]4.5586,[407]4.5421,[408]4.5262,[409]4.5115,
|
| 295 |
+
save_imatrix: stored collected data after 410 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
|
| 296 |
+
[410]4.4991,[411]4.4877,[412]4.4820,[413]4.4730,[414]4.4690,[415]4.4643,[416]4.4624,[417]4.4571,[418]4.4519,[419]4.4374,
save_imatrix: stored collected data after 420 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[420]4.4230,[421]4.4077,[422]4.3944,[423]4.3803,[424]4.3684,[425]4.3546,[426]4.3397,[427]4.3292,[428]4.3145,[429]4.3076,
save_imatrix: stored collected data after 430 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[430]4.2946,[431]4.2847,[432]4.2735,[433]4.2636,[434]4.2620,[435]4.2610,[436]4.2546,[437]4.2443,[438]4.2379,[439]4.2240,
save_imatrix: stored collected data after 440 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[440]4.2110,[441]4.1987,[442]4.1868,[443]4.1755,[444]4.1721,[445]4.1629,[446]4.1593,[447]4.1535,[448]4.1430,[449]4.1400,
save_imatrix: stored collected data after 450 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[450]4.1326,[451]4.1249,[452]4.1137,[453]4.1065,[454]4.0994,[455]4.0910,[456]4.0787,[457]4.0669,[458]4.0547,[459]4.0430,
save_imatrix: stored collected data after 460 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[460]4.0317,[461]4.0223,[462]4.0129,[463]4.0069,[464]3.9991,[465]3.9951,[466]3.9893,[467]3.9839,[468]3.9785,[469]3.9728,
save_imatrix: stored collected data after 470 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[470]3.9673,[471]3.9618,[472]3.9564,[473]3.9517,[474]3.9461,[475]3.9405,[476]3.9357,[477]3.9303,[478]3.9250,[479]3.9215,
save_imatrix: stored collected data after 480 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[480]3.9109,[481]3.9015,[482]3.8973,[483]3.8903,[484]3.8828,[485]3.8726,[486]3.8630,[487]3.8537,[488]3.8443,[489]3.8383,
save_imatrix: stored collected data after 490 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[490]3.8310,[491]3.8243,[492]3.8204,[493]3.8151,[494]3.8084,[495]3.8004,[496]3.7998,[497]3.7966,[498]3.7917,[499]3.7900,
save_imatrix: stored collected data after 500 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[500]3.7876,[501]3.7866,[502]3.7874,[503]3.7902,[504]3.7887,[505]3.7828,[506]3.7748,[507]3.7788,[508]3.7894,[509]3.7982,
save_imatrix: stored collected data after 510 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[510]3.8064,[511]3.8136,[512]3.8212,[513]3.8257,[514]3.8295,[515]3.8312,[516]3.8390,[517]3.8421,[518]3.8486,[519]3.8575,
save_imatrix: stored collected data after 520 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[520]3.8708,[521]3.8874,[522]3.9011,[523]3.8995,[524]3.9063,[525]3.9102,[526]3.9165,[527]3.9179,[528]3.9201,[529]3.9289,
save_imatrix: stored collected data after 530 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[530]3.9341,[531]3.9355,[532]3.9425,[533]3.9482,[534]3.9554,[535]3.9553,[536]3.9550,[537]3.9558,[538]3.9602,[539]3.9650,
save_imatrix: stored collected data after 540 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[540]3.9698,[541]3.9741,[542]3.9765,[543]3.9788,[544]3.9835,[545]3.9886,[546]3.9973,[547]4.0056,[548]4.0122,[549]4.0208,
save_imatrix: stored collected data after 550 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[550]4.0278,[551]4.0358,[552]4.0422,[553]4.0482,[554]4.0550,[555]4.0610,[556]4.0581,[557]4.0553,[558]4.0519,[559]4.0563,
save_imatrix: stored collected data after 560 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[560]4.0622,[561]4.0664,[562]4.0717,[563]4.0722,[564]4.0769,[565]4.0773,[566]4.0819,[567]4.0827,[568]4.0828,[569]4.0823,
save_imatrix: stored collected data after 570 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[570]4.0830,[571]4.0859,[572]4.0823,[573]4.0797,[574]4.0752,[575]4.0715,[576]4.0642,[577]4.0590,[578]4.0525,[579]4.0453,
save_imatrix: stored collected data after 580 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[580]4.0425,[581]4.0443,[582]4.0423,[583]4.0433,[584]4.0410,[585]4.0407,[586]4.0404,[587]4.0377,[588]4.0320,[589]4.0325,
save_imatrix: stored collected data after 590 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[590]4.0294,[591]4.0221,[592]4.0155,[593]4.0081,[594]4.0022,[595]3.9988,[596]3.9974,[597]3.9952,[598]3.9942,[599]3.9917,
save_imatrix: stored collected data after 600 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[600]3.9871,[601]3.9813,[602]3.9814,[603]3.9815,[604]3.9813,[605]3.9772,[606]3.9751,[607]3.9720,[608]3.9753,[609]3.9744,
save_imatrix: stored collected data after 610 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[610]3.9720,[611]3.9726,[612]3.9723,[613]3.9676,[614]3.9607,[615]3.9530,[616]3.9455,[617]3.9375,[618]3.9301,[619]3.9224,
save_imatrix: stored collected data after 620 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[620]3.9147,[621]3.9061,[622]3.8977,[623]3.8901,[624]3.8827,[625]3.8750,[626]3.8686,[627]3.8608,[628]3.8540,[629]3.8483,
save_imatrix: stored collected data after 630 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[630]3.8414,[631]3.8345,[632]3.8297,[633]3.8220,[634]3.8178,[635]3.8159,[636]3.8125,[637]3.8052,[638]3.7995,[639]3.7934,
save_imatrix: stored collected data after 640 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[640]3.7862,[641]3.7807,[642]3.7742,[643]3.7684,[644]3.7622,[645]3.7555,[646]3.7488,[647]3.7428,[648]3.7423,[649]3.7357,
save_imatrix: stored collected data after 650 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[650]3.7289,[651]3.7222,[652]3.7158,[653]3.7091,[654]3.7022,[655]3.6956,[656]3.6892,[657]3.6834,[658]3.6769,[659]3.6795,
save_imatrix: stored collected data after 660 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[660]3.6798,[661]3.6828,[662]3.6807,[663]3.6746,[664]3.6705,[665]3.6650,[666]3.6584,[667]3.6531,[668]3.6477,[669]3.6427,
save_imatrix: stored collected data after 670 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[670]3.6377,[671]3.6320,[672]3.6260,[673]3.6203,[674]3.6165,[675]3.6115,[676]3.6057,[677]3.6006,[678]3.5948,[679]3.5888,
save_imatrix: stored collected data after 680 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[680]3.5861,[681]3.5802,[682]3.5752,[683]3.5704,[684]3.5649,[685]3.5604,[686]3.5584,[687]3.5571,[688]3.5532,[689]3.5485,
save_imatrix: stored collected data after 690 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[690]3.5423,[691]3.5359,[692]3.5305,[693]3.5249,[694]3.5211,[695]3.5185,[696]3.5168,[697]3.5141,[698]3.5125,[699]3.5101,
save_imatrix: stored collected data after 700 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[700]3.5082,[701]3.5068,[702]3.5052,[703]3.5033,[704]3.5014,[705]3.4998,[706]3.4983,[707]3.4959,[708]3.4946,[709]3.4925,
save_imatrix: stored collected data after 710 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[710]3.4907,[711]3.4886,[712]3.4894,[713]3.4891,[714]3.4893,[715]3.4904,[716]3.4915,[717]3.4923,[718]3.4931,[719]3.4948,
save_imatrix: stored collected data after 720 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[720]3.4969,[721]3.4973,[722]3.4981,[723]3.4991,[724]3.5006,[725]3.5017,[726]3.5034,[727]3.5048,[728]3.5068,[729]3.5068,
save_imatrix: stored collected data after 730 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[730]3.5070,[731]3.5082,[732]3.5111,[733]3.5122,[734]3.5126,[735]3.5127,[736]3.5141,[737]3.5162,[738]3.5169,[739]3.5198,
save_imatrix: stored collected data after 740 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[740]3.5214,[741]3.5233,[742]3.5248,[743]3.5255,[744]3.5255,[745]3.5267,[746]3.5283,[747]3.5298,[748]3.5312,[749]3.5323,
save_imatrix: stored collected data after 750 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[750]3.5335,[751]3.5345,[752]3.5365,[753]3.5398,[754]3.5405,[755]3.5417,[756]3.5434,[757]3.5449,[758]3.5457,[759]3.5472,
save_imatrix: stored collected data after 760 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[760]3.5482,[761]3.5489,[762]3.5507,[763]3.5511,[764]3.5530,[765]3.5540,[766]3.5556,[767]3.5563,[768]3.5573,[769]3.5577,
save_imatrix: stored collected data after 770 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[770]3.5588,[771]3.5610,[772]3.5617,[773]3.5619,[774]3.5626,[775]3.5646,[776]3.5655,[777]3.5679,[778]3.5679,[779]3.5693,
save_imatrix: stored collected data after 780 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[780]3.5708,[781]3.5729,[782]3.5750,[783]3.5778,[784]3.5781,[785]3.5787,[786]3.5794,[787]3.5812,[788]3.5814,[789]3.5837,
save_imatrix: stored collected data after 790 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[790]3.5849,[791]3.5861,[792]3.5863,[793]3.5874,[794]3.5896,[795]3.5911,[796]3.5914,[797]3.5930,[798]3.5942,[799]3.5979,
save_imatrix: stored collected data after 800 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[800]3.5984,[801]3.5983,[802]3.6000,[803]3.6018,[804]3.6027,[805]3.6036,[806]3.6041,[807]3.6050,[808]3.6054,[809]3.6062,
save_imatrix: stored collected data after 810 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
[810]3.6083,[811]3.6108,[812]3.6119,[813]3.6131,[814]3.6137,
save_imatrix: stored collected data after 814 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
Final estimate: PPL = 3.6137 +/- 0.01805
======================== sorted layer importances
0: Layer 0, <cos_sim> = 0.433589
1: Layer 2, <cos_sim> = 0.752289
2: Layer 1, <cos_sim> = 0.764358
3: Layer 3, <cos_sim> = 0.861103
4: Layer 4, <cos_sim> = 0.90387
5: Layer 32, <cos_sim> = 0.905589
6: Layer 6, <cos_sim> = 0.912358
7: Layer 37, <cos_sim> = 0.913118
8: Layer 39, <cos_sim> = 0.913941
9: Layer 31, <cos_sim> = 0.914878
10: Layer 23, <cos_sim> = 0.915726
11: Layer 91, <cos_sim> = 0.915909
12: Layer 41, <cos_sim> = 0.917222
13: Layer 40, <cos_sim> = 0.918507
14: Layer 33, <cos_sim> = 0.918549
15: Layer 29, <cos_sim> = 0.919203
16: Layer 30, <cos_sim> = 0.919353
17: Layer 28, <cos_sim> = 0.921385
18: Layer 38, <cos_sim> = 0.921396
19: Layer 24, <cos_sim> = 0.922245
20: Layer 34, <cos_sim> = 0.922372
21: Layer 22, <cos_sim> = 0.922432
22: Layer 26, <cos_sim> = 0.924714
23: Layer 36, <cos_sim> = 0.924901
24: Layer 14, <cos_sim> = 0.925139
25: Layer 25, <cos_sim> = 0.9268
26: Layer 13, <cos_sim> = 0.92694
27: Layer 35, <cos_sim> = 0.927297
28: Layer 10, <cos_sim> = 0.927834
29: Layer 27, <cos_sim> = 0.928177
30: Layer 11, <cos_sim> = 0.929866
31: Layer 21, <cos_sim> = 0.929894
32: Layer 85, <cos_sim> = 0.93049
33: Layer 7, <cos_sim> = 0.930774
34: Layer 84, <cos_sim> = 0.932103
35: Layer 8, <cos_sim> = 0.933102
36: Layer 9, <cos_sim> = 0.935479
37: Layer 42, <cos_sim> = 0.935862
38: Layer 12, <cos_sim> = 0.936215
39: Layer 5, <cos_sim> = 0.941695
40: Layer 43, <cos_sim> = 0.943382
41: Layer 86, <cos_sim> = 0.947319
42: Layer 15, <cos_sim> = 0.948505
43: Layer 20, <cos_sim> = 0.948549
44: Layer 18, <cos_sim> = 0.951088
45: Layer 44, <cos_sim> = 0.952598
46: Layer 83, <cos_sim> = 0.952599
47: Layer 19, <cos_sim> = 0.952615
48: Layer 45, <cos_sim> = 0.953287
49: Layer 17, <cos_sim> = 0.956447
50: Layer 80, <cos_sim> = 0.957907
51: Layer 16, <cos_sim> = 0.957981
52: Layer 46, <cos_sim> = 0.958118
53: Layer 81, <cos_sim> = 0.959244
54: Layer 87, <cos_sim> = 0.959352
55: Layer 90, <cos_sim> = 0.960285
56: Layer 82, <cos_sim> = 0.961087
57: Layer 47, <cos_sim> = 0.961475
58: Layer 89, <cos_sim> = 0.962276
59: Layer 88, <cos_sim> = 0.963196
60: Layer 79, <cos_sim> = 0.963523
61: Layer 48, <cos_sim> = 0.963567
62: Layer 50, <cos_sim> = 0.964597
63: Layer 49, <cos_sim> = 0.965508
64: Layer 51, <cos_sim> = 0.965609
65: Layer 52, <cos_sim> = 0.967696
66: Layer 54, <cos_sim> = 0.968009
67: Layer 53, <cos_sim> = 0.970224
68: Layer 76, <cos_sim> = 0.970396
69: Layer 78, <cos_sim> = 0.971591
70: Layer 55, <cos_sim> = 0.971771
71: Layer 75, <cos_sim> = 0.973436
72: Layer 77, <cos_sim> = 0.975951
73: Layer 58, <cos_sim> = 0.978094
74: Layer 56, <cos_sim> = 0.978404
75: Layer 57, <cos_sim> = 0.979015
76: Layer 59, <cos_sim> = 0.979639
77: Layer 73, <cos_sim> = 0.980629
78: Layer 67, <cos_sim> = 0.981126
79: Layer 66, <cos_sim> = 0.981658
80: Layer 72, <cos_sim> = 0.981951
81: Layer 65, <cos_sim> = 0.981978
82: Layer 61, <cos_sim> = 0.982014
83: Layer 68, <cos_sim> = 0.982152
84: Layer 74, <cos_sim> = 0.982164
85: Layer 60, <cos_sim> = 0.982302
86: Layer 71, <cos_sim> = 0.982914
87: Layer 63, <cos_sim> = 0.983344
88: Layer 70, <cos_sim> = 0.983749
89: Layer 64, <cos_sim> = 0.984071
90: Layer 69, <cos_sim> = 0.984258
91: Layer 62, <cos_sim> = 0.984467
======================== sorted attention importances
0: Layer 0, <cos_sim> = 0.335289
1: Layer 1, <cos_sim> = 0.552763
2: Layer 2, <cos_sim> = 0.637396
3: Layer 3, <cos_sim> = 0.816339
4: Layer 7, <cos_sim> = 0.824544
5: Layer 13, <cos_sim> = 0.850178
6: Layer 6, <cos_sim> = 0.850298
7: Layer 4, <cos_sim> = 0.851804
8: Layer 9, <cos_sim> = 0.859275
9: Layer 8, <cos_sim> = 0.866695
10: Layer 12, <cos_sim> = 0.874505
11: Layer 15, <cos_sim> = 0.876165
12: Layer 5, <cos_sim> = 0.876507
13: Layer 10, <cos_sim> = 0.87806
14: Layer 11, <cos_sim> = 0.880676
15: Layer 16, <cos_sim> = 0.893902
16: Layer 17, <cos_sim> = 0.899423
17: Layer 21, <cos_sim> = 0.900672
18: Layer 14, <cos_sim> = 0.9032
19: Layer 19, <cos_sim> = 0.909055
20: Layer 20, <cos_sim> = 0.911488
21: Layer 18, <cos_sim> = 0.917251
22: Layer 23, <cos_sim> = 0.919361
23: Layer 22, <cos_sim> = 0.928206
24: Layer 24, <cos_sim> = 0.932381
25: Layer 25, <cos_sim> = 0.936273
26: Layer 32, <cos_sim> = 0.938645
27: Layer 28, <cos_sim> = 0.941543
28: Layer 26, <cos_sim> = 0.942651
29: Layer 33, <cos_sim> = 0.943323
30: Layer 27, <cos_sim> = 0.943763
31: Layer 37, <cos_sim> = 0.944613
32: Layer 31, <cos_sim> = 0.945652
33: Layer 30, <cos_sim> = 0.946387
34: Layer 38, <cos_sim> = 0.948997
35: Layer 39, <cos_sim> = 0.94954
36: Layer 35, <cos_sim> = 0.950607
37: Layer 41, <cos_sim> = 0.951778
38: Layer 34, <cos_sim> = 0.952551
39: Layer 40, <cos_sim> = 0.95284
40: Layer 29, <cos_sim> = 0.952981
41: Layer 42, <cos_sim> = 0.954776
42: Layer 36, <cos_sim> = 0.958211
43: Layer 85, <cos_sim> = 0.963066
44: Layer 43, <cos_sim> = 0.963722
45: Layer 44, <cos_sim> = 0.964977
46: Layer 45, <cos_sim> = 0.966557
47: Layer 46, <cos_sim> = 0.969251
48: Layer 84, <cos_sim> = 0.971393
49: Layer 86, <cos_sim> = 0.971928
50: Layer 51, <cos_sim> = 0.973333
51: Layer 52, <cos_sim> = 0.974347
52: Layer 83, <cos_sim> = 0.974803
53: Layer 50, <cos_sim> = 0.977621
54: Layer 48, <cos_sim> = 0.977849
55: Layer 47, <cos_sim> = 0.97789
56: Layer 81, <cos_sim> = 0.978345
57: Layer 82, <cos_sim> = 0.978486
58: Layer 49, <cos_sim> = 0.978655
59: Layer 80, <cos_sim> = 0.97866
60: Layer 53, <cos_sim> = 0.979166
61: Layer 91, <cos_sim> = 0.98049
62: Layer 58, <cos_sim> = 0.981312
63: Layer 54, <cos_sim> = 0.981736
64: Layer 87, <cos_sim> = 0.982023
65: Layer 79, <cos_sim> = 0.982483
66: Layer 78, <cos_sim> = 0.983622
67: Layer 88, <cos_sim> = 0.983653
68: Layer 90, <cos_sim> = 0.985642
69: Layer 61, <cos_sim> = 0.986197
70: Layer 89, <cos_sim> = 0.986293
71: Layer 68, <cos_sim> = 0.986564
72: Layer 59, <cos_sim> = 0.986572
73: Layer 73, <cos_sim> = 0.98676
74: Layer 71, <cos_sim> = 0.986905
75: Layer 55, <cos_sim> = 0.986992
76: Layer 72, <cos_sim> = 0.987429
77: Layer 76, <cos_sim> = 0.987882
78: Layer 57, <cos_sim> = 0.988337
79: Layer 56, <cos_sim> = 0.988355
80: Layer 77, <cos_sim> = 0.98847
81: Layer 67, <cos_sim> = 0.988501
82: Layer 65, <cos_sim> = 0.98852
83: Layer 70, <cos_sim> = 0.988926
84: Layer 74, <cos_sim> = 0.988971
85: Layer 64, <cos_sim> = 0.988973
86: Layer 63, <cos_sim> = 0.989051
87: Layer 66, <cos_sim> = 0.989456
88: Layer 60, <cos_sim> = 0.989791
89: Layer 69, <cos_sim> = 0.99087
90: Layer 75, <cos_sim> = 0.991036
91: Layer 62, <cos_sim> = 0.991942
======================== sorted ffn importances
0: Layer 0, <cos_sim> = 0.584305
1: Layer 1, <cos_sim> = 0.599857
2: Layer 2, <cos_sim> = 0.734115
3: Layer 6, <cos_sim> = 0.807659
4: Layer 3, <cos_sim> = 0.823881
5: Layer 8, <cos_sim> = 0.855655
6: Layer 11, <cos_sim> = 0.855686
7: Layer 4, <cos_sim> = 0.858089
8: Layer 14, <cos_sim> = 0.858127
9: Layer 12, <cos_sim> = 0.860683
10: Layer 5, <cos_sim> = 0.864949
11: Layer 7, <cos_sim> = 0.867606
12: Layer 9, <cos_sim> = 0.883365
13: Layer 10, <cos_sim> = 0.884968
14: Layer 15, <cos_sim> = 0.885658
15: Layer 16, <cos_sim> = 0.887954
16: Layer 20, <cos_sim> = 0.892895
17: Layer 18, <cos_sim> = 0.90077
18: Layer 19, <cos_sim> = 0.901247
19: Layer 13, <cos_sim> = 0.902837
20: Layer 24, <cos_sim> = 0.914728
21: Layer 22, <cos_sim> = 0.91663
22: Layer 25, <cos_sim> = 0.91686
23: Layer 17, <cos_sim> = 0.91982
24: Layer 26, <cos_sim> = 0.920503
25: Layer 23, <cos_sim> = 0.921116
26: Layer 27, <cos_sim> = 0.924545
27: Layer 29, <cos_sim> = 0.92818
28: Layer 32, <cos_sim> = 0.931219
29: Layer 21, <cos_sim> = 0.931957
30: Layer 31, <cos_sim> = 0.931987
31: Layer 28, <cos_sim> = 0.933451
32: Layer 30, <cos_sim> = 0.934623
33: Layer 34, <cos_sim> = 0.935862
34: Layer 37, <cos_sim> = 0.93849
35: Layer 36, <cos_sim> = 0.939261
36: Layer 33, <cos_sim> = 0.94047
37: Layer 39, <cos_sim> = 0.942833
38: Layer 40, <cos_sim> = 0.943535
39: Layer 35, <cos_sim> = 0.943962
40: Layer 41, <cos_sim> = 0.944572
41: Layer 91, <cos_sim> = 0.944611
42: Layer 38, <cos_sim> = 0.94701
43: Layer 43, <cos_sim> = 0.951876
44: Layer 42, <cos_sim> = 0.953462
45: Layer 44, <cos_sim> = 0.954221
46: Layer 45, <cos_sim> = 0.954828
47: Layer 84, <cos_sim> = 0.960194
48: Layer 46, <cos_sim> = 0.962422
49: Layer 47, <cos_sim> = 0.963472
50: Layer 50, <cos_sim> = 0.963841
51: Layer 48, <cos_sim> = 0.964882
52: Layer 51, <cos_sim> = 0.96498
53: Layer 49, <cos_sim> = 0.965125
54: Layer 85, <cos_sim> = 0.965745
55: Layer 90, <cos_sim> = 0.966198
56: Layer 52, <cos_sim> = 0.968709
57: Layer 89, <cos_sim> = 0.969302
58: Layer 86, <cos_sim> = 0.970209
59: Layer 79, <cos_sim> = 0.971392
60: Layer 80, <cos_sim> = 0.97181
61: Layer 83, <cos_sim> = 0.971817
62: Layer 53, <cos_sim> = 0.972442
63: Layer 81, <cos_sim> = 0.972559
64: Layer 87, <cos_sim> = 0.973106
65: Layer 78, <cos_sim> = 0.973454
66: Layer 57, <cos_sim> = 0.973742
67: Layer 77, <cos_sim> = 0.97382
68: Layer 82, <cos_sim> = 0.974303
69: Layer 55, <cos_sim> = 0.974649
70: Layer 54, <cos_sim> = 0.974867
71: Layer 76, <cos_sim> = 0.975321
72: Layer 75, <cos_sim> = 0.975472
73: Layer 88, <cos_sim> = 0.975633
74: Layer 58, <cos_sim> = 0.976417
75: Layer 56, <cos_sim> = 0.976436
76: Layer 60, <cos_sim> = 0.976607
77: Layer 73, <cos_sim> = 0.977296
78: Layer 72, <cos_sim> = 0.977447
79: Layer 65, <cos_sim> = 0.977744
80: Layer 67, <cos_sim> = 0.977822
81: Layer 70, <cos_sim> = 0.977891
82: Layer 59, <cos_sim> = 0.978032
83: Layer 71, <cos_sim> = 0.978203
84: Layer 69, <cos_sim> = 0.97839
85: Layer 64, <cos_sim> = 0.978551
86: Layer 66, <cos_sim> = 0.978619
87: Layer 63, <cos_sim> = 0.97875
88: Layer 74, <cos_sim> = 0.979117
89: Layer 62, <cos_sim> = 0.979471
90: Layer 68, <cos_sim> = 0.979636
91: Layer 61, <cos_sim> = 0.980855
llama_print_timings: load time = 195855.60 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_print_timings: prompt eval time = 7668240.55 ms / 416768 tokens ( 18.40 ms per token, 54.35 tokens per second)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_print_timings: total time = 7872041.86 ms / 416769 tokens