maximuspowers committed
Commit 13b5551 · verified
1 Parent(s): 5ddc377

Upload weight-space autoencoder (encoder + decoder) and configuration

Files changed (5)
  1. README.md +42 -0
  2. config.yaml +116 -0
  3. decoder.pt +3 -0
  4. encoder.pt +3 -0
  5. tokenizer_config.json +9 -0
README.md ADDED
@@ -0,0 +1,42 @@
+ ---
+ tags:
+ - weight-space-learning
+ - neural-network-autoencoder
+ - autoencoder
+ - transformer
+ datasets:
+ - maximuspowers/muat-fourier-5
+ ---
+
+ # Weight-Space Autoencoder (Transformer)
+
+ This model is a weight-space autoencoder trained on neural network weight signatures.
+ It includes both an encoder (compresses weights into latent representations) and a decoder (reconstructs weights from latent codes).
+
+ ## Model Description
+
+ - **Architecture**: Transformer encoder-decoder
+ - **Training Dataset**: maximuspowers/muat-fourier-5
+ - **Input Mode**: signature
+ - **Latent Dimension**: 128
+
+ ## Tokenization
+
+ - **Chunk Size**: 1 weight value per token
+ - **Max Tokens**: 64
+ - **Metadata**: True
+
+ ## Training Config
+
+ - **Loss Functions**: reconstruction, contrastive, functional
+ - **Optimizer**: adamw
+ - **Learning Rate**: 0.0001
+ - **Batch Size**: 32
+
+ ## Performance Metrics (Test Set)
+
+ - **MSE**: 0.105820
+ - **MAE**: 0.208260
+ - **RMSE**: 0.325300
+ - **Cosine Similarity**: 0.9560
+ - **R² Score**: 0.9830
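The evaluation code is not part of this upload; as a reference, here is a minimal NumPy sketch of the standard definitions behind the metrics reported above (MSE, MAE, RMSE, cosine similarity, R²), computed over flattened weight vectors. The function name is illustrative, not from the repo:

```python
import numpy as np

def weight_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Standard regression metrics over flattened weight vectors."""
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(mse))
    cos = float(np.dot(y_true, y_pred)
                / (np.linalg.norm(y_true) * np.linalg.norm(y_pred)))
    # R² = 1 - SS_res / SS_tot
    r2 = 1.0 - float(np.sum(err ** 2)) / float(np.sum((y_true - y_true.mean()) ** 2))
    return {"mse": mse, "mae": mae, "rmse": rmse,
            "cosine_similarity": cos, "r2_score": r2}

# a perfect reconstruction scores mse = 0, cosine = 1, r2 = 1
w = np.array([0.1, -0.2, 0.3, 0.4])
print(weight_metrics(w, w))
```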
config.yaml ADDED
@@ -0,0 +1,116 @@
+ architecture:
+   latent_dim: 128
+   transformer:
+     decoder:
+       activation: gelu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     encoder:
+       activation: gelu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     pooling: mean
+     positional_encoding: learned
+   type: transformer
+ dataloader:
+   num_workers: 0
+   pin_memory: true
+ dataset:
+   hf_dataset: maximuspowers/muat-fourier-5
+   input_mode: signature
+   max_dimensions:
+     max_hidden_layers: 6
+     max_neurons_per_layer: 8
+     max_sequence_length: 5
+   neuron_profile:
+     features_per_neuron: 5
+     methods:
+     - fourier
+   random_seed: 42
+   test_split: 0.1
+   train_split: 0.8
+   val_split: 0.1
+ device:
+   type: auto
+ evaluation:
+   metrics:
+   - mse
+   - mae
+   - rmse
+   - cosine_similarity
+   - relative_error
+   - r2_score
+   per_layer_metrics: false
+ hub:
+   enabled: true
+   private: false
+   push_logs: true
+   push_metrics: true
+   push_model: true
+   repo_id: maximuspowers/weight-autoencoder-mlp-v1
+   token: <REDACTED>
+ logging:
+   checkpoint:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     save_best_only: true
+   tensorboard:
+     auto_launch: true
+     enabled: true
+     log_interval: 10
+     port: 6006
+   visualizations:
+     enabled: true
+     log_interval: 1
+     num_image_samples: 4
+   verbose: true
+ loss:
+   contrastive:
+     enabled: true
+     projection_head:
+       hidden_dim: 64
+       input_dim: 128
+       output_dim: 32
+     temperature: 0.1
+     weight: 0.4
+   functional:
+     benchmark_path: /configs/autoencoder/benchmark_dataset.json
+     enabled: true
+     test_samples: null
+     weight: 0.4
+   reconstruction:
+     enabled: true
+     type: mse
+     weight: 0.2
+ run_dir: /Users/max/Desktop/muat/model_zoo/runs/train-encoder-decoder_config_2025-12-17_19-33-32
+ run_log_cleanup: false
+ tokenization:
+   chunk_size: 1
+   granularity: neuron
+   include_metadata: true
+   max_tokens: 64
+ training:
+   batch_size: 32
+   early_stopping:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     patience: 15
+   epochs: 250
+   gradient_accumulation_steps: 4
+   learning_rate: 0.0001
+   lr_scheduler:
+     enabled: true
+     factor: 0.5
+     min_lr: 1.0e-06
+     patience: 5
+   max_grad_norm: 1.0
+   optimizer: adamw
+   weight_decay: 0.0001
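Per the `loss` section above, the contrastive term uses a projection head (128 → 64 → 32) and temperature 0.1, and the three terms are summed with weights 0.2 / 0.4 / 0.4. The training code is not included in this upload, so the following is only a minimal NumPy sketch of an NT-Xent-style contrastive term plus the weighted sum; function names and details are illustrative:

```python
import numpy as np

def nt_xent(z1, z2, temperature=0.1):
    """NT-Xent-style contrastive loss for two views of a batch of latents."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    z = np.concatenate([z1, z2], axis=0)     # (2N, d)
    sim = z @ z.T / temperature              # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)           # mask self-similarity
    n = len(z1)
    # each latent's positive is its other view
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), targets].mean())

def total_loss(recon, contrastive, functional):
    # weights from config.yaml: reconstruction 0.2, contrastive 0.4, functional 0.4
    return 0.2 * recon + 0.4 * contrastive + 0.4 * functional

z = np.eye(3, 5)      # three orthogonal unit latents
print(nt_xent(z, z))  # near 0: each latent matches its own pair
```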
decoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a7e1b2bed452a4562d4f0e6fb7e47a75e917bfbf6a68f660bdfc3194fabfdca
+ size 101365774
encoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4ebcb2592d5bb6ef3f7806da61037cc769ad5f29534c6dbdb683228624a2db38
+ size 76106790
tokenizer_config.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "chunk_size": 1,
+   "max_tokens": 64,
+   "include_metadata": true,
+   "metadata_features": 5,
+   "token_dim": 14,
+   "granularity": "neuron",
+   "max_neuron_data_size": 9
+ }
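The tokenizer code itself is not among the uploaded files, only this config. Assuming `token_dim` = `max_neuron_data_size` + `metadata_features` (9 + 5 = 14), one plausible neuron-granularity tokenizer would pad each neuron's values to 9, append its 5 metadata features, and pad or truncate the sequence to `max_tokens` = 64. A sketch under that assumption (function name hypothetical):

```python
import numpy as np

CFG = {"max_tokens": 64, "metadata_features": 5,
       "token_dim": 14, "max_neuron_data_size": 9}

def tokenize_neurons(neurons, metadata, cfg=CFG):
    """One token per neuron: values padded to max_neuron_data_size,
    then metadata_features appended -> token_dim = 9 + 5 = 14."""
    tokens = []
    for data, meta in zip(neurons, metadata):
        data = np.pad(np.asarray(data, dtype=np.float32),
                      (0, cfg["max_neuron_data_size"] - len(data)))
        tokens.append(np.concatenate([data, np.asarray(meta, np.float32)]))
    tokens = np.stack(tokens)[: cfg["max_tokens"]]          # truncate
    pad = np.zeros((cfg["max_tokens"] - len(tokens),
                    cfg["token_dim"]), np.float32)          # pad sequence
    return np.concatenate([tokens, pad])

# two neurons, each with 3 incoming weights and 5 metadata features
seq = tokenize_neurons([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
                       [[0, 0, 1, 0, 0], [0, 1, 0, 0, 0]])
print(seq.shape)  # (64, 14)
```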