ubergarm committed on
Commit
b7f8856
·
1 Parent(s): 5fe167e

upload imatrix log file with cossim per layer

Files changed (2):
  1. README.md +2 -2
  2. logs/imatrix-GLM-4.7-BF16.log +667 -0
README.md CHANGED
@@ -19,12 +19,12 @@ Currently cooking this now!
 
  - [x] download bf16 safetensors https://huggingface.co/zai-org/GLM-4.7
  - [x] use llama.cpp/convert_hf_to_gguf.py to create bf16 GGUF
- - [ ] calculate imatrix and upload to HF first so others can use as desired
+ - [x] calculate imatrix and upload to HF first so others can use as desired
  - [ ] cook Q8_0 and test perplexity of BF16 and Q8_0 for baseline data
  - [ ] cook IQ5_K with full q8_0 attn/shexp/first 3 dense layers and test
  - [ ] upload IQ5_K if all looking good
  - [ ] continue with smaller quants
- - [ ] chek if any folks open discussions with desired RAM/VRAM breakpoints
+ - [ ] check if any folks open discussions with desired RAM/VRAM breakpoints
 
  ## `ik_llama.cpp` imatrix Quantizations of zai-org/GLM-4.7
  *NOTE* `ik_llama.cpp` can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc if you want to try it out before downloading my quants.
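
For anyone grabbing just the imatrix from this commit, the idea is to feed it to `llama-quantize` when cooking the quants in the checklist above. A minimal sketch of the planned "IQ5_K with full q8_0 attn/shexp/first 3 dense layers" step, assuming `ik_llama.cpp`'s `--custom-q` regex=type rules; the tensor regexes and output path here are illustrative, not the final recipe:

```bash
#!/usr/bin/env bash
# Hypothetical recipe sketch, not the published recipe. Tensor name patterns
# match what the log below reports (attn_*, ffn_*_shexp, ffn_*_exps, and the
# first 3 dense ffn layers blk.0-blk.2).
custom="
# All attention tensors at q8_0
blk\..*\.attn_.*=q8_0
# Shared expert at q8_0
blk\..*\.ffn_.*_shexp\.weight=q8_0
# First 3 dense ffn layers at q8_0
blk\.[0-2]\.ffn_(gate|down|up)\.weight=q8_0
# Routed experts at iq5_k
blk\..*\.ffn_.*_exps\.weight=iq5_k
"
# Strip comment lines and join the rules into one comma-separated string
custom=$(echo "$custom" | grep -v '^#' | sed -Ez 's:\n+:,:g;s:^,::;s:,$::')

./build/bin/llama-quantize \
    --custom-q "$custom" \
    --imatrix /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat \
    /mnt/data/models/ubergarm/GLM-4.7-GGUF/GLM-160x21B-4.7-BF16-00001-of-00015.gguf \
    /mnt/data/models/ubergarm/GLM-4.7-GGUF/GLM-4.7-IQ5_K.gguf \
    IQ5_K \
    96
```

Anything not covered by a `--custom-q` rule falls back to the default mix for the target type given as the positional argument.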
logs/imatrix-GLM-4.7-BF16.log ADDED
@@ -0,0 +1,667 @@
+ model=/mnt/data/models/ubergarm/GLM-4.7-GGUF/GLM-160x21B-4.7-BF16-00001-of-00015.gguf
+
+ numactl -N ${SOCKET} -m ${SOCKET} \
+ ./build/bin/llama-imatrix \
+ --model "$model" \
+ -f ubergarm-imatrix-calibration-corpus-v02.txt \
+ -o /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat \
+ --no-fused-moe \
+ --no-fused-up-gate \
+ --no-fused-mul-multiadd \
+ --ctx-size 512 \
+ -ub 4096 -b 4096 \
+ --threads 96 \
+ --threads-batch 128 \
+ --no-mmap \
+ --numa numactl \
+ --verbosity 1 \
+ --layer-similarity
+
+ CPU: using device CPU - 0 MiB free
+ llama_model_loader: additional 14 GGUFs metadata loaded.
+ llama_model_loader: loaded meta data with 49 key-value pairs and 1761 tensors from /mnt/data/models/ubergarm/GLM-4.7-GGUF/GLM-160x21B-4.7-BF16-00001-of-00015.gguf (version GGUF V3 (latest))
+ llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
+ llama_model_loader: - kv 0: general.architecture str = glm4moe
+ llama_model_loader: - kv 1: general.type str = model
+ llama_model_loader: - kv 2: general.sampling.temp f32 = 1.000000
+ llama_model_loader: - kv 3: general.name str = GLM 4.7
+ llama_model_loader: - kv 4: general.version str = 4.7
+ llama_model_loader: - kv 5: general.basename str = GLM
+ llama_model_loader: - kv 6: general.size_label str = 160x21B
+ llama_model_loader: - kv 7: general.license str = mit
+ llama_model_loader: - kv 8: general.tags arr[str,1] = ["text-generation"]
+ llama_model_loader: - kv 9: general.languages arr[str,2] = ["en", "zh"]
+ llama_model_loader: - kv 10: glm4moe.block_count u32 = 93
+ llama_model_loader: - kv 11: glm4moe.context_length u32 = 202752
+ llama_model_loader: - kv 12: glm4moe.embedding_length u32 = 5120
+ llama_model_loader: - kv 13: glm4moe.feed_forward_length u32 = 12288
+ llama_model_loader: - kv 14: glm4moe.attention.head_count u32 = 96
+ llama_model_loader: - kv 15: glm4moe.attention.head_count_kv u32 = 8
+ llama_model_loader: - kv 16: glm4moe.rope.freq_base f32 = 1000000.000000
+ llama_model_loader: - kv 17: glm4moe.attention.layer_norm_rms_epsilon f32 = 0.000010
+ llama_model_loader: - kv 18: glm4moe.expert_used_count u32 = 8
+ llama_model_loader: - kv 19: glm4moe.expert_group_count u32 = 1
+ llama_model_loader: - kv 20: glm4moe.expert_group_used_count u32 = 1
+ llama_model_loader: - kv 21: glm4moe.attention.key_length u32 = 128
+ llama_model_loader: - kv 22: glm4moe.attention.value_length u32 = 128
+ llama_model_loader: - kv 23: general.file_type u32 = 32
+ llama_model_loader: - kv 24: glm4moe.rope.dimension_count u32 = 64
+ llama_model_loader: - kv 25: glm4moe.expert_count u32 = 160
+ llama_model_loader: - kv 26: glm4moe.expert_feed_forward_length u32 = 1536
+ llama_model_loader: - kv 27: glm4moe.expert_shared_count u32 = 1
+ llama_model_loader: - kv 28: glm4moe.leading_dense_block_count u32 = 3
+ llama_model_loader: - kv 29: glm4moe.expert_gating_func u32 = 2
+ llama_model_loader: - kv 30: glm4moe.expert_weights_scale f32 = 2.500000
+ llama_model_loader: - kv 31: glm4moe.expert_weights_norm bool = true
+ llama_model_loader: - kv 32: glm4moe.nextn_predict_layers u32 = 1
+ llama_model_loader: - kv 33: general.quantization_version u32 = 2
+ llama_model_loader: - kv 34: tokenizer.ggml.model str = gpt2
+ llama_model_loader: - kv 35: tokenizer.ggml.pre str = glm4
+ llama_model_loader: - kv 36: tokenizer.ggml.tokens arr[str,151552] = ["!", "\"", "#", "$", "%", "&", "'", ...
+ llama_model_loader: - kv 37: tokenizer.ggml.token_type arr[i32,151552] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
+ llama_model_loader: - kv 38: tokenizer.ggml.merges arr[str,318088] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
+ llama_model_loader: - kv 39: tokenizer.ggml.eos_token_id u32 = 151329
+ llama_model_loader: - kv 40: tokenizer.ggml.padding_token_id u32 = 151329
+ llama_model_loader: - kv 41: tokenizer.ggml.bos_token_id u32 = 151331
+ llama_model_loader: - kv 42: tokenizer.ggml.eot_token_id u32 = 151336
+ llama_model_loader: - kv 43: tokenizer.ggml.unknown_token_id u32 = 151329
+ llama_model_loader: - kv 44: tokenizer.ggml.eom_token_id u32 = 151338
+ llama_model_loader: - kv 45: tokenizer.chat_template str = [gMASK]<sop>\n{%- if tools -%}\n<|syste...
+ llama_model_loader: - kv 46: split.no u16 = 0
+ llama_model_loader: - kv 47: split.count u16 = 15
+ llama_model_loader: - kv 48: split.tensors.count i32 = 1761
+ llama_model_loader: - type f32: 835 tensors
+ llama_model_loader: - type bf16: 926 tensors
+ load: special_eot_id is not in special_eog_ids - the tokenizer config may be incorrect
+ load: special_eom_id is not in special_eog_ids - the tokenizer config may be incorrect
+ load: printing all EOG tokens:
+ load: - 151329 ('<|endoftext|>')
+ load: - 151336 ('<|user|>')
+ load: - 151338 ('<|observation|>')
+ load: special tokens cache size = 36
+ load: token to piece cache size = 0.9713 MB
+ llm_load_print_meta: format = GGUF V3 (latest)
+ llm_load_print_meta: arch = glm4moe
+ llm_load_print_meta: n_ctx_train = 202752
+ llm_load_print_meta: n_embd = 5120
+ llm_load_print_meta: n_layer = 93
+ llm_load_print_meta: n_head = 96
+ llm_load_print_meta: n_head_kv = 8
+ llm_load_print_meta: n_rot = 64
+ llm_load_print_meta: n_swa = 0
+ llm_load_print_meta: n_swa_pattern = 1
+ llm_load_print_meta: n_embd_head_k = 128
+ llm_load_print_meta: n_embd_head_v = 128
+ llm_load_print_meta: n_gqa = 12
+ llm_load_print_meta: n_embd_k_gqa = 1024
+ llm_load_print_meta: n_embd_v_gqa = 1024
+ llm_load_print_meta: f_norm_eps = 0.0e+00
+ llm_load_print_meta: f_norm_rms_eps = 1.0e-05
+ llm_load_print_meta: f_clamp_kqv = 0.0e+00
+ llm_load_print_meta: f_max_alibi_bias = 0.0e+00
+ llm_load_print_meta: f_logit_scale = 0.0e+00
+ llm_load_print_meta: n_ff = 12288
+ llm_load_print_meta: n_expert = 160
+ llm_load_print_meta: n_expert_used = 8
+ llm_load_print_meta: causal attn = 1
+ llm_load_print_meta: pooling type = 0
+ llm_load_print_meta: rope type = 2
+ llm_load_print_meta: rope scaling = linear
+ llm_load_print_meta: freq_base_train = 1000000.0
+ llm_load_print_meta: freq_scale_train = 1
+ llm_load_print_meta: n_ctx_orig_yarn = 202752
+ llm_load_print_meta: rope_finetuned = unknown
+ llm_load_print_meta: ssm_d_conv = 0
+ llm_load_print_meta: ssm_d_inner = 0
+ llm_load_print_meta: ssm_d_state = 0
+ llm_load_print_meta: ssm_dt_rank = 0
+ llm_load_print_meta: model type = 355B.A32B
+ llm_load_print_meta: model ftype = BF16
+ llm_load_print_meta: model params = 358.338 B
+ llm_load_print_meta: model size = 667.598 GiB (16.003 BPW)
+ llm_load_print_meta: repeating layers = 664.707 GiB (16.003 BPW, 356.786 B parameters)
+ llm_load_print_meta: general.name = GLM 4.7
+ print_info: vocab type = BPE
+ print_info: n_vocab = 151552
+ print_info: n_merges = 318088
+ print_info: BOS token = 151331 '[gMASK]'
+ print_info: EOS token = 151329 '<|endoftext|>'
+ print_info: EOT token = 151336 '<|user|>'
+ print_info: EOM token = 151338 '<|observation|>'
+ print_info: UNK token = 151329 '<|endoftext|>'
+ print_info: PAD token = 151329 '<|endoftext|>'
+ print_info: LF token = 198 'Ċ'
+ print_info: FIM PRE token = 151347 '<|code_prefix|>'
+ print_info: FIM SUF token = 151349 '<|code_suffix|>'
+ print_info: FIM MID token = 151348 '<|code_middle|>'
+ print_info: EOG token = 151329 '<|endoftext|>'
+ print_info: EOG token = 151336 '<|user|>'
+ print_info: EOG token = 151338 '<|observation|>'
+ print_info: max token length = 1024
+ llm_load_tensors: ggml ctx size = 0.72 MiB
+ model has unused tensor blk.92.attn_norm.weight (size = 20480 bytes) -- ignoring
+ model has unused tensor blk.92.attn_q.weight (size = 125829120 bytes) -- ignoring
+ model has unused tensor blk.92.attn_k.weight (size = 10485760 bytes) -- ignoring
+ model has unused tensor blk.92.attn_v.weight (size = 10485760 bytes) -- ignoring
+ model has unused tensor blk.92.attn_q.bias (size = 49152 bytes) -- ignoring
+ model has unused tensor blk.92.attn_k.bias (size = 4096 bytes) -- ignoring
+ model has unused tensor blk.92.attn_v.bias (size = 4096 bytes) -- ignoring
+ model has unused tensor blk.92.attn_output.weight (size = 125829120 bytes) -- ignoring
+ model has unused tensor blk.92.attn_q_norm.weight (size = 512 bytes) -- ignoring
+ model has unused tensor blk.92.attn_k_norm.weight (size = 512 bytes) -- ignoring
+ model has unused tensor blk.92.post_attention_norm.weight (size = 20480 bytes) -- ignoring
+ model has unused tensor blk.92.ffn_gate_inp.weight (size = 3276800 bytes) -- ignoring
+ model has unused tensor blk.92.exp_probs_b.bias (size = 640 bytes) -- ignoring
+ model has unused tensor blk.92.ffn_gate_exps.weight (size = 2516582400 bytes) -- ignoring
+ model has unused tensor blk.92.ffn_down_exps.weight (size = 2516582400 bytes) -- ignoring
+ model has unused tensor blk.92.ffn_up_exps.weight (size = 2516582400 bytes) -- ignoring
+ model has unused tensor blk.92.ffn_gate_shexp.weight (size = 15728640 bytes) -- ignoring
+ model has unused tensor blk.92.ffn_down_shexp.weight (size = 15728640 bytes) -- ignoring
+ model has unused tensor blk.92.ffn_up_shexp.weight (size = 15728640 bytes) -- ignoring
+ model has unused tensor blk.92.nextn.eh_proj.weight (size = 104857600 bytes) -- ignoring
+ model has unused tensor blk.92.nextn.embed_tokens.weight (size = 1551892480 bytes) -- ignoring
+ model has unused tensor blk.92.nextn.enorm.weight (size = 20480 bytes) -- ignoring
+ model has unused tensor blk.92.nextn.hnorm.weight (size = 20480 bytes) -- ignoring
+ model has unused tensor blk.92.nextn.shared_head_head.weight (size = 1551892480 bytes) -- ignoring
+ model has unused tensor blk.92.nextn.shared_head_norm.weight (size = 20480 bytes) -- ignoring
+ llm_load_tensors: offloading 0 repeating layers to GPU
+ llm_load_tensors: offloaded 0/94 layers to GPU
+ llm_load_tensors: CPU buffer size = 673051.91 MiB
+ ....................................................................................................
+ llama_new_context_with_model: n_ctx = 512
+ llama_new_context_with_model: n_batch = 512
+ llama_new_context_with_model: n_ubatch = 512
+ llama_new_context_with_model: flash_attn = 1
+ llama_new_context_with_model: attn_max_b = 0
+ llama_new_context_with_model: fused_moe = 0
+ llama_new_context_with_model: grouped er = 0
+ llama_new_context_with_model: fused_up_gate = 0
+ llama_new_context_with_model: fused_mmad = 0
+ llama_new_context_with_model: rope_cache = 0
+ llama_new_context_with_model: graph_reuse = 0
+ llama_new_context_with_model: k_cache_hadam = 0
+ llama_new_context_with_model: split_mode_graph_scheduling = 0
+ llama_new_context_with_model: ser = -1, 0
+ llama_new_context_with_model: freq_base = 1000000.0
+ llama_new_context_with_model: freq_scale = 1
+ llama_kv_cache_init: CPU KV buffer size = 184.00 MiB
+ llama_new_context_with_model: KV self size = 184.00 MiB, K (f16): 92.00 MiB, V (f16): 92.00 MiB
+ llama_new_context_with_model: CPU output buffer size = 0.58 MiB
+ llama_new_context_with_model: CPU compute buffer size = 306.00 MiB
+ llama_new_context_with_model: graph nodes = 4634
+ llama_new_context_with_model: graph splits = 1
+ XXXXXXXXXXXXXXXXXXXXX Setting only active experts offload
+
+ system_info: n_threads = 96 (n_threads_batch = 128) / 512 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
+ compute_imatrix: tokenizing the input ..
+ compute_imatrix: tokenization took 508.555 ms
+ compute_imatrix: computing over 814 chunks with batch_size 512
+ compute_imatrix: 9.95 seconds per pass - ETA 2 hours 15.02 minutes
+ ======================================= HAVE_FANCY_SIMD is defined
+ [1]17.5129,[2]6.9568,[3]4.5205,[4]3.2674,[5]2.6460,[6]2.2556,[7]2.0217,[8]1.8697,[9]1.8579,
+ save_imatrix: entry ' blk.73.ffn_gate_exps.weight' has partial data (98.75%) 2 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.73.ffn_up_exps.weight' has partial data (98.75%) 2 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.56.ffn_down_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.56.ffn_gate_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.56.ffn_up_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.48.ffn_gate_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.6.ffn_gate_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.6.ffn_down_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.6.ffn_up_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.73.ffn_down_exps.weight' has partial data (98.75%) 2 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.48.ffn_down_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+ save_imatrix: entry ' blk.48.ffn_up_exps.weight' has partial data (99.38%) 1 out of 160 experts are missing data Storing **but be aware**
+
+ save_imatrix: stored collected data after 10 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [10]1.7776,[11]1.8889,[12]1.9858,[13]2.0575,[14]2.1264,[15]2.0255,[16]1.9435,[17]1.8890,[18]1.8325,[19]1.7753,
+ save_imatrix: stored collected data after 20 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [20]1.7407,[21]1.6968,[22]1.6679,[23]1.6358,[24]1.6059,[25]1.5762,[26]1.6561,[27]1.7531,[28]1.8685,[29]1.8406,
+ save_imatrix: stored collected data after 30 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [30]1.8213,[31]1.8366,[32]1.8313,[33]1.9053,[34]1.8840,[35]1.8786,[36]1.8690,[37]1.8635,[38]1.8952,[39]1.9113,
+ save_imatrix: stored collected data after 40 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [40]1.9002,[41]1.9285,[42]1.9373,[43]1.9526,[44]1.9645,[45]1.9712,[46]1.9569,[47]1.9671,[48]1.9661,[49]1.9672,
+ save_imatrix: stored collected data after 50 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [50]1.9573,[51]1.9761,[52]1.9892,[53]1.9766,[54]1.9847,[55]1.9872,[56]1.9924,[57]1.9852,[58]2.0341,[59]2.0846,
+ save_imatrix: stored collected data after 60 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [60]2.1336,[61]2.1480,[62]2.1927,[63]2.2231,[64]2.2163,[65]2.2162,[66]2.2193,[67]2.2055,[68]2.2219,[69]2.2626,
+ save_imatrix: stored collected data after 70 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [70]2.3155,[71]2.3455,[72]2.3846,[73]2.4166,[74]2.4360,[75]2.4648,[76]2.4808,[77]2.5090,[78]2.5058,[79]2.4882,
+ save_imatrix: stored collected data after 80 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [80]2.4855,[81]2.4882,[82]2.5159,[83]2.5581,[84]2.5770,[85]2.5819,[86]2.5855,[87]2.5763,[88]2.5781,[89]2.5664,
+ save_imatrix: stored collected data after 90 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [90]2.5561,[91]2.5523,[92]2.5356,[93]2.5154,[94]2.5440,[95]2.5928,[96]2.6132,[97]2.6158,[98]2.6236,[99]2.6440,
+ save_imatrix: stored collected data after 100 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [100]2.6613,[101]2.6685,[102]2.6709,[103]2.7057,[104]2.7300,[105]2.7232,[106]2.7660,[107]2.8099,[108]2.8406,[109]2.8811,
+ save_imatrix: stored collected data after 110 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [110]2.9123,[111]2.9467,[112]2.9798,[113]2.9740,[114]2.9899,[115]3.0050,[116]3.0137,[117]3.0245,[118]3.0569,[119]3.0943,
+ save_imatrix: stored collected data after 120 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [120]3.1353,[121]3.1299,[122]3.1032,[123]3.0869,[124]3.1069,[125]3.0955,[126]3.0717,[127]3.0709,[128]3.0689,[129]3.0750,
+ save_imatrix: stored collected data after 130 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [130]3.0827,[131]3.0998,[132]3.1160,[133]3.1226,[134]3.1614,[135]3.1798,[136]3.1540,[137]3.1290,[138]3.1061,[139]3.0821,
+ save_imatrix: stored collected data after 140 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [140]3.0908,[141]3.1033,[142]3.1438,[143]3.1734,[144]3.1791,[145]3.2029,[146]3.2291,[147]3.2515,[148]3.2845,[149]3.3145,
+ save_imatrix: stored collected data after 150 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [150]3.3451,[151]3.3640,[152]3.3849,[153]3.4019,[154]3.4113,[155]3.4073,[156]3.4248,[157]3.4352,[158]3.4461,[159]3.4588,
+ save_imatrix: stored collected data after 160 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [160]3.4724,[161]3.4748,[162]3.4794,[163]3.4943,[164]3.4998,[165]3.5079,[166]3.5212,[167]3.5227,[168]3.5252,[169]3.5323,
+ save_imatrix: stored collected data after 170 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [170]3.5418,[171]3.5468,[172]3.5522,[173]3.5591,[174]3.5774,[175]3.5892,[176]3.5948,[177]3.6013,[178]3.6183,[179]3.6066,
+ save_imatrix: stored collected data after 180 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [180]3.6147,[181]3.6290,[182]3.6530,[183]3.6691,[184]3.6754,[185]3.6775,[186]3.6758,[187]3.6737,[188]3.6741,[189]3.6748,
+ save_imatrix: stored collected data after 190 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [190]3.6751,[191]3.6712,[192]3.6937,[193]3.7245,[194]3.7498,[195]3.7776,[196]3.7992,[197]3.8353,[198]3.8457,[199]3.8632,
+ save_imatrix: stored collected data after 200 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [200]3.8543,[201]3.8694,[202]3.8607,[203]3.8371,[204]3.8146,[205]3.8352,[206]3.8499,[207]3.8590,[208]3.8685,[209]3.8886,
+ save_imatrix: stored collected data after 210 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [210]3.9044,[211]3.9209,[212]3.9405,[213]3.9567,[214]3.9581,[215]3.9352,[216]3.9111,[217]3.8873,[218]3.8636,[219]3.8407,
+ save_imatrix: stored collected data after 220 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [220]3.8241,[221]3.8216,[222]3.8120,[223]3.8085,[224]3.7952,[225]3.7765,[226]3.7761,[227]3.7828,[228]3.8044,[229]3.8287,
+ save_imatrix: stored collected data after 230 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [230]3.8397,[231]3.8627,[232]3.8583,[233]3.8832,[234]3.9142,[235]3.9274,[236]3.9423,[237]3.9476,[238]3.9719,[239]4.0009,
+ save_imatrix: stored collected data after 240 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [240]3.9975,[241]4.0078,[242]4.0223,[243]4.0432,[244]4.0635,[245]4.0782,[246]4.0913,[247]4.1018,[248]4.0917,[249]4.1182,
+ save_imatrix: stored collected data after 250 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [250]4.1322,[251]4.1512,[252]4.1620,[253]4.1670,[254]4.1736,[255]4.1769,[256]4.1893,[257]4.1941,[258]4.2055,[259]4.2207,
+ save_imatrix: stored collected data after 260 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [260]4.2305,[261]4.2418,[262]4.2543,[263]4.2700,[264]4.2824,[265]4.2997,[266]4.2846,[267]4.2893,[268]4.2945,[269]4.3088,
+ save_imatrix: stored collected data after 270 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [270]4.3304,[271]4.3455,[272]4.3672,[273]4.3680,[274]4.3670,[275]4.3766,[276]4.3829,[277]4.3999,[278]4.4148,[279]4.4279,
+ save_imatrix: stored collected data after 280 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [280]4.4372,[281]4.4395,[282]4.4538,[283]4.4654,[284]4.4684,[285]4.4848,[286]4.4863,[287]4.4904,[288]4.4993,[289]4.4958,
+ save_imatrix: stored collected data after 290 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [290]4.5076,[291]4.5134,[292]4.5196,[293]4.5372,[294]4.5508,[295]4.5653,[296]4.5830,[297]4.5879,[298]4.6079,[299]4.6212,
+ save_imatrix: stored collected data after 300 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [300]4.6382,[301]4.6506,[302]4.6647,[303]4.6700,[304]4.6895,[305]4.6978,[306]4.7023,[307]4.7108,[308]4.7293,[309]4.7394,
+ save_imatrix: stored collected data after 310 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [310]4.7444,[311]4.7529,[312]4.7618,[313]4.7761,[314]4.7839,[315]4.7935,[316]4.8056,[317]4.8184,[318]4.8331,[319]4.8379,
+ save_imatrix: stored collected data after 320 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [320]4.8422,[321]4.8356,[322]4.8464,[323]4.8297,[324]4.8477,[325]4.8512,[326]4.8283,[327]4.8413,[328]4.8518,[329]4.8579,
+ save_imatrix: stored collected data after 330 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [330]4.8640,[331]4.8633,[332]4.8671,[333]4.8863,[334]4.8834,[335]4.8949,[336]4.9110,[337]4.9204,[338]4.9253,[339]4.9127,
+ save_imatrix: stored collected data after 340 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [340]4.9237,[341]4.9406,[342]4.9567,[343]4.9745,[344]4.9973,[345]5.0267,[346]5.0290,[347]5.0303,[348]5.0331,[349]5.0414,
+ save_imatrix: stored collected data after 350 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [350]5.0554,[351]5.0758,[352]5.0761,[353]5.0728,[354]5.0839,[355]5.0801,[356]5.0811,[357]5.0798,[358]5.0753,[359]5.0797,
+ save_imatrix: stored collected data after 360 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [360]5.0912,[361]5.0878,[362]5.0861,[363]5.0686,[364]5.0505,[365]5.0334,[366]5.0192,[367]4.9996,[368]4.9826,[369]4.9654,
+ save_imatrix: stored collected data after 370 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [370]4.9512,[371]4.9353,[372]4.9188,[373]4.9057,[374]4.8921,[375]4.8736,[376]4.8610,[377]4.8467,[378]4.8301,[379]4.8157,
+ save_imatrix: stored collected data after 380 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [380]4.8141,[381]4.7993,[382]4.7929,[383]4.7967,[384]4.7841,[385]4.7779,[386]4.7666,[387]4.7479,[388]4.7307,[389]4.7229,
+ save_imatrix: stored collected data after 390 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [390]4.7128,[391]4.6979,[392]4.6796,[393]4.6613,[394]4.6594,[395]4.6573,[396]4.6530,[397]4.6422,[398]4.6436,[399]4.6429,
+ save_imatrix: stored collected data after 400 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [400]4.6261,[401]4.6110,[402]4.6038,[403]4.5899,[404]4.5786,[405]4.5680,[406]4.5586,[407]4.5421,[408]4.5262,[409]4.5115,
+ save_imatrix: stored collected data after 410 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [410]4.4991,[411]4.4877,[412]4.4820,[413]4.4730,[414]4.4690,[415]4.4643,[416]4.4624,[417]4.4571,[418]4.4519,[419]4.4374,
+ save_imatrix: stored collected data after 420 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [420]4.4230,[421]4.4077,[422]4.3944,[423]4.3803,[424]4.3684,[425]4.3546,[426]4.3397,[427]4.3292,[428]4.3145,[429]4.3076,
+ save_imatrix: stored collected data after 430 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [430]4.2946,[431]4.2847,[432]4.2735,[433]4.2636,[434]4.2620,[435]4.2610,[436]4.2546,[437]4.2443,[438]4.2379,[439]4.2240,
+ save_imatrix: stored collected data after 440 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [440]4.2110,[441]4.1987,[442]4.1868,[443]4.1755,[444]4.1721,[445]4.1629,[446]4.1593,[447]4.1535,[448]4.1430,[449]4.1400,
+ save_imatrix: stored collected data after 450 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [450]4.1326,[451]4.1249,[452]4.1137,[453]4.1065,[454]4.0994,[455]4.0910,[456]4.0787,[457]4.0669,[458]4.0547,[459]4.0430,
+ save_imatrix: stored collected data after 460 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [460]4.0317,[461]4.0223,[462]4.0129,[463]4.0069,[464]3.9991,[465]3.9951,[466]3.9893,[467]3.9839,[468]3.9785,[469]3.9728,
+ save_imatrix: stored collected data after 470 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [470]3.9673,[471]3.9618,[472]3.9564,[473]3.9517,[474]3.9461,[475]3.9405,[476]3.9357,[477]3.9303,[478]3.9250,[479]3.9215,
+ save_imatrix: stored collected data after 480 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [480]3.9109,[481]3.9015,[482]3.8973,[483]3.8903,[484]3.8828,[485]3.8726,[486]3.8630,[487]3.8537,[488]3.8443,[489]3.8383,
+ save_imatrix: stored collected data after 490 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [490]3.8310,[491]3.8243,[492]3.8204,[493]3.8151,[494]3.8084,[495]3.8004,[496]3.7998,[497]3.7966,[498]3.7917,[499]3.7900,
+ save_imatrix: stored collected data after 500 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [500]3.7876,[501]3.7866,[502]3.7874,[503]3.7902,[504]3.7887,[505]3.7828,[506]3.7748,[507]3.7788,[508]3.7894,[509]3.7982,
+ save_imatrix: stored collected data after 510 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [510]3.8064,[511]3.8136,[512]3.8212,[513]3.8257,[514]3.8295,[515]3.8312,[516]3.8390,[517]3.8421,[518]3.8486,[519]3.8575,
+ save_imatrix: stored collected data after 520 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [520]3.8708,[521]3.8874,[522]3.9011,[523]3.8995,[524]3.9063,[525]3.9102,[526]3.9165,[527]3.9179,[528]3.9201,[529]3.9289,
+ save_imatrix: stored collected data after 530 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [530]3.9341,[531]3.9355,[532]3.9425,[533]3.9482,[534]3.9554,[535]3.9553,[536]3.9550,[537]3.9558,[538]3.9602,[539]3.9650,
+ save_imatrix: stored collected data after 540 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [540]3.9698,[541]3.9741,[542]3.9765,[543]3.9788,[544]3.9835,[545]3.9886,[546]3.9973,[547]4.0056,[548]4.0122,[549]4.0208,
+ save_imatrix: stored collected data after 550 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [550]4.0278,[551]4.0358,[552]4.0422,[553]4.0482,[554]4.0550,[555]4.0610,[556]4.0581,[557]4.0553,[558]4.0519,[559]4.0563,
+ save_imatrix: stored collected data after 560 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [560]4.0622,[561]4.0664,[562]4.0717,[563]4.0722,[564]4.0769,[565]4.0773,[566]4.0819,[567]4.0827,[568]4.0828,[569]4.0823,
+ save_imatrix: stored collected data after 570 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [570]4.0830,[571]4.0859,[572]4.0823,[573]4.0797,[574]4.0752,[575]4.0715,[576]4.0642,[577]4.0590,[578]4.0525,[579]4.0453,
+ save_imatrix: stored collected data after 580 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [580]4.0425,[581]4.0443,[582]4.0423,[583]4.0433,[584]4.0410,[585]4.0407,[586]4.0404,[587]4.0377,[588]4.0320,[589]4.0325,
+ save_imatrix: stored collected data after 590 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [590]4.0294,[591]4.0221,[592]4.0155,[593]4.0081,[594]4.0022,[595]3.9988,[596]3.9974,[597]3.9952,[598]3.9942,[599]3.9917,
+ save_imatrix: stored collected data after 600 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [600]3.9871,[601]3.9813,[602]3.9814,[603]3.9815,[604]3.9813,[605]3.9772,[606]3.9751,[607]3.9720,[608]3.9753,[609]3.9744,
+ save_imatrix: stored collected data after 610 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [610]3.9720,[611]3.9726,[612]3.9723,[613]3.9676,[614]3.9607,[615]3.9530,[616]3.9455,[617]3.9375,[618]3.9301,[619]3.9224,
+ save_imatrix: stored collected data after 620 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [620]3.9147,[621]3.9061,[622]3.8977,[623]3.8901,[624]3.8827,[625]3.8750,[626]3.8686,[627]3.8608,[628]3.8540,[629]3.8483,
+ save_imatrix: stored collected data after 630 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [630]3.8414,[631]3.8345,[632]3.8297,[633]3.8220,[634]3.8178,[635]3.8159,[636]3.8125,[637]3.8052,[638]3.7995,[639]3.7934,
+ save_imatrix: stored collected data after 640 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [640]3.7862,[641]3.7807,[642]3.7742,[643]3.7684,[644]3.7622,[645]3.7555,[646]3.7488,[647]3.7428,[648]3.7423,[649]3.7357,
+ save_imatrix: stored collected data after 650 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [650]3.7289,[651]3.7222,[652]3.7158,[653]3.7091,[654]3.7022,[655]3.6956,[656]3.6892,[657]3.6834,[658]3.6769,[659]3.6795,
+ save_imatrix: stored collected data after 660 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [660]3.6798,[661]3.6828,[662]3.6807,[663]3.6746,[664]3.6705,[665]3.6650,[666]3.6584,[667]3.6531,[668]3.6477,[669]3.6427,
+ save_imatrix: stored collected data after 670 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [670]3.6377,[671]3.6320,[672]3.6260,[673]3.6203,[674]3.6165,[675]3.6115,[676]3.6057,[677]3.6006,[678]3.5948,[679]3.5888,
+ save_imatrix: stored collected data after 680 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [680]3.5861,[681]3.5802,[682]3.5752,[683]3.5704,[684]3.5649,[685]3.5604,[686]3.5584,[687]3.5571,[688]3.5532,[689]3.5485,
+ save_imatrix: stored collected data after 690 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [690]3.5423,[691]3.5359,[692]3.5305,[693]3.5249,[694]3.5211,[695]3.5185,[696]3.5168,[697]3.5141,[698]3.5125,[699]3.5101,
+ save_imatrix: stored collected data after 700 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [700]3.5082,[701]3.5068,[702]3.5052,[703]3.5033,[704]3.5014,[705]3.4998,[706]3.4983,[707]3.4959,[708]3.4946,[709]3.4925,
+ save_imatrix: stored collected data after 710 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [710]3.4907,[711]3.4886,[712]3.4894,[713]3.4891,[714]3.4893,[715]3.4904,[716]3.4915,[717]3.4923,[718]3.4931,[719]3.4948,
+ save_imatrix: stored collected data after 720 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [720]3.4969,[721]3.4973,[722]3.4981,[723]3.4991,[724]3.5006,[725]3.5017,[726]3.5034,[727]3.5048,[728]3.5068,[729]3.5068,
+ save_imatrix: stored collected data after 730 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [730]3.5070,[731]3.5082,[732]3.5111,[733]3.5122,[734]3.5126,[735]3.5127,[736]3.5141,[737]3.5162,[738]3.5169,[739]3.5198,
+ save_imatrix: stored collected data after 740 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [740]3.5214,[741]3.5233,[742]3.5248,[743]3.5255,[744]3.5255,[745]3.5267,[746]3.5283,[747]3.5298,[748]3.5312,[749]3.5323,
+ save_imatrix: stored collected data after 750 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [750]3.5335,[751]3.5345,[752]3.5365,[753]3.5398,[754]3.5405,[755]3.5417,[756]3.5434,[757]3.5449,[758]3.5457,[759]3.5472,
+ save_imatrix: stored collected data after 760 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [760]3.5482,[761]3.5489,[762]3.5507,[763]3.5511,[764]3.5530,[765]3.5540,[766]3.5556,[767]3.5563,[768]3.5573,[769]3.5577,
+ save_imatrix: stored collected data after 770 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [770]3.5588,[771]3.5610,[772]3.5617,[773]3.5619,[774]3.5626,[775]3.5646,[776]3.5655,[777]3.5679,[778]3.5679,[779]3.5693,
+ save_imatrix: stored collected data after 780 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [780]3.5708,[781]3.5729,[782]3.5750,[783]3.5778,[784]3.5781,[785]3.5787,[786]3.5794,[787]3.5812,[788]3.5814,[789]3.5837,
+ save_imatrix: stored collected data after 790 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [790]3.5849,[791]3.5861,[792]3.5863,[793]3.5874,[794]3.5896,[795]3.5911,[796]3.5914,[797]3.5930,[798]3.5942,[799]3.5979,
+ save_imatrix: stored collected data after 800 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [800]3.5984,[801]3.5983,[802]3.6000,[803]3.6018,[804]3.6027,[805]3.6036,[806]3.6041,[807]3.6050,[808]3.6054,[809]3.6062,
+ save_imatrix: stored collected data after 810 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+ [810]3.6083,[811]3.6108,[812]3.6119,[813]3.6131,[814]3.6137,
+ save_imatrix: stored collected data after 814 chunks in /mnt/data/models/ubergarm/GLM-4.7-GGUF/imatrix-GLM-4.7-BF16.dat
+
+ Final estimate: PPL = 3.6137 +/- 0.01805
+
+ ======================== sorted layer importances
+ 0: Layer 0, <cos_sim> = 0.433589
+ 1: Layer 2, <cos_sim> = 0.752289
+ 2: Layer 1, <cos_sim> = 0.764358
+ 3: Layer 3, <cos_sim> = 0.861103
+ 4: Layer 4, <cos_sim> = 0.90387
+ 5: Layer 32, <cos_sim> = 0.905589
+ 6: Layer 6, <cos_sim> = 0.912358
+ 7: Layer 37, <cos_sim> = 0.913118
+ 8: Layer 39, <cos_sim> = 0.913941
+ 9: Layer 31, <cos_sim> = 0.914878
+ 10: Layer 23, <cos_sim> = 0.915726
+ 11: Layer 91, <cos_sim> = 0.915909
+ 12: Layer 41, <cos_sim> = 0.917222
+ 13: Layer 40, <cos_sim> = 0.918507
+ 14: Layer 33, <cos_sim> = 0.918549
+ 15: Layer 29, <cos_sim> = 0.919203
+ 16: Layer 30, <cos_sim> = 0.919353
+ 17: Layer 28, <cos_sim> = 0.921385
+ 18: Layer 38, <cos_sim> = 0.921396
+ 19: Layer 24, <cos_sim> = 0.922245
+ 20: Layer 34, <cos_sim> = 0.922372
+ 21: Layer 22, <cos_sim> = 0.922432
+ 22: Layer 26, <cos_sim> = 0.924714
+ 23: Layer 36, <cos_sim> = 0.924901
+ 24: Layer 14, <cos_sim> = 0.925139
+ 25: Layer 25, <cos_sim> = 0.9268
+ 26: Layer 13, <cos_sim> = 0.92694
+ 27: Layer 35, <cos_sim> = 0.927297
+ 28: Layer 10, <cos_sim> = 0.927834
+ 29: Layer 27, <cos_sim> = 0.928177
+ 30: Layer 11, <cos_sim> = 0.929866
+ 31: Layer 21, <cos_sim> = 0.929894
+ 32: Layer 85, <cos_sim> = 0.93049
+ 33: Layer 7, <cos_sim> = 0.930774
+ 34: Layer 84, <cos_sim> = 0.932103
+ 35: Layer 8, <cos_sim> = 0.933102
+ 36: Layer 9, <cos_sim> = 0.935479
+ 37: Layer 42, <cos_sim> = 0.935862
+ 38: Layer 12, <cos_sim> = 0.936215
+ 39: Layer 5, <cos_sim> = 0.941695
+ 40: Layer 43, <cos_sim> = 0.943382
+ 41: Layer 86, <cos_sim> = 0.947319
+ 42: Layer 15, <cos_sim> = 0.948505
+ 43: Layer 20, <cos_sim> = 0.948549
+ 44: Layer 18, <cos_sim> = 0.951088
+ 45: Layer 44, <cos_sim> = 0.952598
+ 46: Layer 83, <cos_sim> = 0.952599
+ 47: Layer 19, <cos_sim> = 0.952615
+ 48: Layer 45, <cos_sim> = 0.953287
+ 49: Layer 17, <cos_sim> = 0.956447
+ 50: Layer 80, <cos_sim> = 0.957907
+ 51: Layer 16, <cos_sim> = 0.957981
+ 52: Layer 46, <cos_sim> = 0.958118
+ 53: Layer 81, <cos_sim> = 0.959244
+ 54: Layer 87, <cos_sim> = 0.959352
+ 55: Layer 90, <cos_sim> = 0.960285
+ 56: Layer 82, <cos_sim> = 0.961087
+ 57: Layer 47, <cos_sim> = 0.961475
+ 58: Layer 89, <cos_sim> = 0.962276
+ 59: Layer 88, <cos_sim> = 0.963196
+ 60: Layer 79, <cos_sim> = 0.963523
+ 61: Layer 48, <cos_sim> = 0.963567
+ 62: Layer 50, <cos_sim> = 0.964597
+ 63: Layer 49, <cos_sim> = 0.965508
+ 64: Layer 51, <cos_sim> = 0.965609
+ 65: Layer 52, <cos_sim> = 0.967696
+ 66: Layer 54, <cos_sim> = 0.968009
+ 67: Layer 53, <cos_sim> = 0.970224
+ 68: Layer 76, <cos_sim> = 0.970396
+ 69: Layer 78, <cos_sim> = 0.971591
+ 70: Layer 55, <cos_sim> = 0.971771
+ 71: Layer 75, <cos_sim> = 0.973436
+ 72: Layer 77, <cos_sim> = 0.975951
+ 73: Layer 58, <cos_sim> = 0.978094
+ 74: Layer 56, <cos_sim> = 0.978404
+ 75: Layer 57, <cos_sim> = 0.979015
+ 76: Layer 59, <cos_sim> = 0.979639
+ 77: Layer 73, <cos_sim> = 0.980629
+ 78: Layer 67, <cos_sim> = 0.981126
+ 79: Layer 66, <cos_sim> = 0.981658
+ 80: Layer 72, <cos_sim> = 0.981951
+ 81: Layer 65, <cos_sim> = 0.981978
+ 82: Layer 61, <cos_sim> = 0.982014
+ 83: Layer 68, <cos_sim> = 0.982152
+ 84: Layer 74, <cos_sim> = 0.982164
+ 85: Layer 60, <cos_sim> = 0.982302
+ 86: Layer 71, <cos_sim> = 0.982914
+ 87: Layer 63, <cos_sim> = 0.983344
+ 88: Layer 70, <cos_sim> = 0.983749
+ 89: Layer 64, <cos_sim> = 0.984071
+ 90: Layer 69, <cos_sim> = 0.984258
+ 91: Layer 62, <cos_sim> = 0.984467
+
+ ======================== sorted attention importances
+ 0: Layer 0, <cos_sim> = 0.335289
+ 1: Layer 1, <cos_sim> = 0.552763
+ 2: Layer 2, <cos_sim> = 0.637396
+ 3: Layer 3, <cos_sim> = 0.816339
+ 4: Layer 7, <cos_sim> = 0.824544
+ 5: Layer 13, <cos_sim> = 0.850178
+ 6: Layer 6, <cos_sim> = 0.850298
+ 7: Layer 4, <cos_sim> = 0.851804
+ 8: Layer 9, <cos_sim> = 0.859275
+ 9: Layer 8, <cos_sim> = 0.866695
+ 10: Layer 12, <cos_sim> = 0.874505
+ 11: Layer 15, <cos_sim> = 0.876165
+ 12: Layer 5, <cos_sim> = 0.876507
+ 13: Layer 10, <cos_sim> = 0.87806
+ 14: Layer 11, <cos_sim> = 0.880676
+ 15: Layer 16, <cos_sim> = 0.893902
+ 16: Layer 17, <cos_sim> = 0.899423
+ 17: Layer 21, <cos_sim> = 0.900672
+ 18: Layer 14, <cos_sim> = 0.9032
+ 19: Layer 19, <cos_sim> = 0.909055
+ 20: Layer 20, <cos_sim> = 0.911488
+ 21: Layer 18, <cos_sim> = 0.917251
+ 22: Layer 23, <cos_sim> = 0.919361
+ 23: Layer 22, <cos_sim> = 0.928206
+ 24: Layer 24, <cos_sim> = 0.932381
+ 25: Layer 25, <cos_sim> = 0.936273
+ 26: Layer 32, <cos_sim> = 0.938645
+ 27: Layer 28, <cos_sim> = 0.941543
+ 28: Layer 26, <cos_sim> = 0.942651
+ 29: Layer 33, <cos_sim> = 0.943323
+ 30: Layer 27, <cos_sim> = 0.943763
+ 31: Layer 37, <cos_sim> = 0.944613
+ 32: Layer 31, <cos_sim> = 0.945652
+ 33: Layer 30, <cos_sim> = 0.946387
+ 34: Layer 38, <cos_sim> = 0.948997
+ 35: Layer 39, <cos_sim> = 0.94954
+ 36: Layer 35, <cos_sim> = 0.950607
+ 37: Layer 41, <cos_sim> = 0.951778
+ 38: Layer 34, <cos_sim> = 0.952551
+ 39: Layer 40, <cos_sim> = 0.95284
+ 40: Layer 29, <cos_sim> = 0.952981
+ 41: Layer 42, <cos_sim> = 0.954776
+ 42: Layer 36, <cos_sim> = 0.958211
+ 43: Layer 85, <cos_sim> = 0.963066
+ 44: Layer 43, <cos_sim> = 0.963722
+ 45: Layer 44, <cos_sim> = 0.964977
+ 46: Layer 45, <cos_sim> = 0.966557
+ 47: Layer 46, <cos_sim> = 0.969251
+ 48: Layer 84, <cos_sim> = 0.971393
+ 49: Layer 86, <cos_sim> = 0.971928
+ 50: Layer 51, <cos_sim> = 0.973333
+ 51: Layer 52, <cos_sim> = 0.974347
+ 52: Layer 83, <cos_sim> = 0.974803
+ 53: Layer 50, <cos_sim> = 0.977621
+ 54: Layer 48, <cos_sim> = 0.977849
+ 55: Layer 47, <cos_sim> = 0.97789
+ 56: Layer 81, <cos_sim> = 0.978345
+ 57: Layer 82, <cos_sim> = 0.978486
+ 58: Layer 49, <cos_sim> = 0.978655
+ 59: Layer 80, <cos_sim> = 0.97866
+ 60: Layer 53, <cos_sim> = 0.979166
+ 61: Layer 91, <cos_sim> = 0.98049
+ 62: Layer 58, <cos_sim> = 0.981312
+ 63: Layer 54, <cos_sim> = 0.981736
+ 64: Layer 87, <cos_sim> = 0.982023
+ 65: Layer 79, <cos_sim> = 0.982483
+ 66: Layer 78, <cos_sim> = 0.983622
+ 67: Layer 88, <cos_sim> = 0.983653
+ 68: Layer 90, <cos_sim> = 0.985642
+ 69: Layer 61, <cos_sim> = 0.986197
+ 70: Layer 89, <cos_sim> = 0.986293
+ 71: Layer 68, <cos_sim> = 0.986564
+ 72: Layer 59, <cos_sim> = 0.986572
+ 73: Layer 73, <cos_sim> = 0.98676
+ 74: Layer 71, <cos_sim> = 0.986905
+ 75: Layer 55, <cos_sim> = 0.986992
+ 76: Layer 72, <cos_sim> = 0.987429
+ 77: Layer 76, <cos_sim> = 0.987882
+ 78: Layer 57, <cos_sim> = 0.988337
+ 79: Layer 56, <cos_sim> = 0.988355
+ 80: Layer 77, <cos_sim> = 0.98847
+ 81: Layer 67, <cos_sim> = 0.988501
+ 82: Layer 65, <cos_sim> = 0.98852
+ 83: Layer 70, <cos_sim> = 0.988926
+ 84: Layer 74, <cos_sim> = 0.988971
+ 85: Layer 64, <cos_sim> = 0.988973
+ 86: Layer 63, <cos_sim> = 0.989051
+ 87: Layer 66, <cos_sim> = 0.989456
+ 88: Layer 60, <cos_sim> = 0.989791
+ 89: Layer 69, <cos_sim> = 0.99087
+ 90: Layer 75, <cos_sim> = 0.991036
+ 91: Layer 62, <cos_sim> = 0.991942
+
+ ======================== sorted ffn importances
+ 0: Layer 0, <cos_sim> = 0.584305
+ 1: Layer 1, <cos_sim> = 0.599857
+ 2: Layer 2, <cos_sim> = 0.734115
+ 3: Layer 6, <cos_sim> = 0.807659
+ 4: Layer 3, <cos_sim> = 0.823881
+ 5: Layer 8, <cos_sim> = 0.855655
+ 6: Layer 11, <cos_sim> = 0.855686
+ 7: Layer 4, <cos_sim> = 0.858089
+ 8: Layer 14, <cos_sim> = 0.858127
+ 9: Layer 12, <cos_sim> = 0.860683
+ 10: Layer 5, <cos_sim> = 0.864949
+ 11: Layer 7, <cos_sim> = 0.867606
+ 12: Layer 9, <cos_sim> = 0.883365
+ 13: Layer 10, <cos_sim> = 0.884968
+ 14: Layer 15, <cos_sim> = 0.885658
+ 15: Layer 16, <cos_sim> = 0.887954
+ 16: Layer 20, <cos_sim> = 0.892895
+ 17: Layer 18, <cos_sim> = 0.90077
+ 18: Layer 19, <cos_sim> = 0.901247
+ 19: Layer 13, <cos_sim> = 0.902837
+ 20: Layer 24, <cos_sim> = 0.914728
+ 21: Layer 22, <cos_sim> = 0.91663
+ 22: Layer 25, <cos_sim> = 0.91686
+ 23: Layer 17, <cos_sim> = 0.91982
+ 24: Layer 26, <cos_sim> = 0.920503
+ 25: Layer 23, <cos_sim> = 0.921116
+ 26: Layer 27, <cos_sim> = 0.924545
+ 27: Layer 29, <cos_sim> = 0.92818
+ 28: Layer 32, <cos_sim> = 0.931219
+ 29: Layer 21, <cos_sim> = 0.931957
+ 30: Layer 31, <cos_sim> = 0.931987
+ 31: Layer 28, <cos_sim> = 0.933451
+ 32: Layer 30, <cos_sim> = 0.934623
+ 33: Layer 34, <cos_sim> = 0.935862
+ 34: Layer 37, <cos_sim> = 0.93849
+ 35: Layer 36, <cos_sim> = 0.939261
+ 36: Layer 33, <cos_sim> = 0.94047
+ 37: Layer 39, <cos_sim> = 0.942833
+ 38: Layer 40, <cos_sim> = 0.943535
+ 39: Layer 35, <cos_sim> = 0.943962
+ 40: Layer 41, <cos_sim> = 0.944572
+ 41: Layer 91, <cos_sim> = 0.944611
+ 42: Layer 38, <cos_sim> = 0.94701
+ 43: Layer 43, <cos_sim> = 0.951876
+ 44: Layer 42, <cos_sim> = 0.953462
+ 45: Layer 44, <cos_sim> = 0.954221
+ 46: Layer 45, <cos_sim> = 0.954828
+ 47: Layer 84, <cos_sim> = 0.960194
+ 48: Layer 46, <cos_sim> = 0.962422
+ 49: Layer 47, <cos_sim> = 0.963472
+ 50: Layer 50, <cos_sim> = 0.963841
+ 51: Layer 48, <cos_sim> = 0.964882
+ 52: Layer 51, <cos_sim> = 0.96498
+ 53: Layer 49, <cos_sim> = 0.965125
+ 54: Layer 85, <cos_sim> = 0.965745
+ 55: Layer 90, <cos_sim> = 0.966198
+ 56: Layer 52, <cos_sim> = 0.968709
+ 57: Layer 89, <cos_sim> = 0.969302
+ 58: Layer 86, <cos_sim> = 0.970209
+ 59: Layer 79, <cos_sim> = 0.971392
+ 60: Layer 80, <cos_sim> = 0.97181
+ 61: Layer 83, <cos_sim> = 0.971817
+ 62: Layer 53, <cos_sim> = 0.972442
+ 63: Layer 81, <cos_sim> = 0.972559
+ 64: Layer 87, <cos_sim> = 0.973106
+ 65: Layer 78, <cos_sim> = 0.973454
+ 66: Layer 57, <cos_sim> = 0.973742
+ 67: Layer 77, <cos_sim> = 0.97382
+ 68: Layer 82, <cos_sim> = 0.974303
+ 69: Layer 55, <cos_sim> = 0.974649
+ 70: Layer 54, <cos_sim> = 0.974867
+ 71: Layer 76, <cos_sim> = 0.975321
+ 72: Layer 75, <cos_sim> = 0.975472
+ 73: Layer 88, <cos_sim> = 0.975633
+ 74: Layer 58, <cos_sim> = 0.976417
+ 75: Layer 56, <cos_sim> = 0.976436
+ 76: Layer 60, <cos_sim> = 0.976607
+ 77: Layer 73, <cos_sim> = 0.977296
+ 78: Layer 72, <cos_sim> = 0.977447
+ 79: Layer 65, <cos_sim> = 0.977744
+ 80: Layer 67, <cos_sim> = 0.977822
+ 81: Layer 70, <cos_sim> = 0.977891
+ 82: Layer 59, <cos_sim> = 0.978032
+ 83: Layer 71, <cos_sim> = 0.978203
+ 84: Layer 69, <cos_sim> = 0.97839
+ 85: Layer 64, <cos_sim> = 0.978551
+ 86: Layer 66, <cos_sim> = 0.978619
+ 87: Layer 63, <cos_sim> = 0.97875
+ 88: Layer 74, <cos_sim> = 0.979117
+ 89: Layer 62, <cos_sim> = 0.979471
+ 90: Layer 68, <cos_sim> = 0.979636
+ 91: Layer 61, <cos_sim> = 0.980855
+
+ llama_print_timings: load time = 195855.60 ms
+ llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_print_timings: prompt eval time = 7668240.55 ms / 416768 tokens ( 18.40 ms per token, 54.35 tokens per second)
+ llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_print_timings: total time = 7872041.86 ms / 416769 tokens
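
The three `--layer-similarity` tables above are the per-layer cossim data the commit message refers to. A lower `<cos_sim>` between a layer's input and output activations suggests that layer transforms the hidden state more, i.e. it is a better candidate for higher-bit quantization. To pull the ranking back out of the uploaded log, a small sketch against the log format shown above (path per this repo):

```bash
# Print the ten most quantization-sensitive layers (lowest <cos_sim> first)
# from the "sorted layer importances" table in the imatrix log.
awk '/sorted layer importances/ {f=1; next} f && !NF {exit} f' \
    logs/imatrix-GLM-4.7-BF16.log | head -n 10
```

Layer 0 and the other early layers rank lowest across all three tables, which lines up with the checklist plan to keep the first three dense layers (plus attn/shexp) at q8_0.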