weiweiz1/Llama-3.2-1B-Instruct-NVFP4

8-bit precision

compressed-tensors

Llama-3.2-1B-Instruct-NVFP4 / quantization_config.json

Upload folder using huggingface_hub

bb1f85e verified 5 months ago

301 Bytes

{
  "bits": 4,
  "group_size": 16,
  "sym": true,
  "data_type": "nv_fp4",
  "seqlen": 512,
  "batch_size": 4,
  "iters": 20,
  "autoround_version": "0.6.0",
  "quant_method": "auto-round",
  "packing_format": "nv_fp",
  "scale_format": ["e8m0"],
  "scale_calculation_mode": ["even"]
}
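
One way to read the config above is as a storage budget: each weight takes `bits` bits, and every group of `group_size` weights shares one scale. The sketch below parses the same JSON and estimates the average bits per weight. The helper `effective_bits_per_weight` and the `SCALE_BITS` table are hypothetical names introduced here for illustration and are not part of auto-round. The estimate assumes one 8-bit E8M0 scale per group and ignores any per-tensor scale, metadata, or layers left unquantized.

```python
import json

# Config text copied from quantization_config.json above.
CONFIG_TEXT = """
{
  "bits": 4,
  "group_size": 16,
  "sym": true,
  "data_type": "nv_fp4",
  "seqlen": 512,
  "batch_size": 4,
  "iters": 20,
  "autoround_version": "0.6.0",
  "quant_method": "auto-round",
  "packing_format": "nv_fp",
  "scale_format": ["e8m0"],
  "scale_calculation_mode": ["even"]
}
"""

# Assumption: storage width in bits of each scale format.
# E8M0 is an exponent-only format with 8 exponent bits and no mantissa.
SCALE_BITS = {"e8m0": 8}


def effective_bits_per_weight(config: dict) -> float:
    """Estimate average storage per weight: element bits plus one
    scale amortized over each group of `group_size` weights."""
    scale_format = config["scale_format"]
    # The file stores scale_format as a one-element list; accept a bare string too.
    if isinstance(scale_format, list):
        scale_format = scale_format[0]
    return config["bits"] + SCALE_BITS[scale_format] / config["group_size"]


config = json.loads(CONFIG_TEXT)
print(effective_bits_per_weight(config))  # 4 + 8/16 = 4.5
```

Under these assumptions the small group size of 16 costs half a bit per weight in scale overhead; a larger group would amortize the scale over more weights at the price of coarser scaling.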