
Qwen3 6.4M Model with Falcon-H1-0.5B-Instruct Tokenizer

Model Description

This is a 6.4M-parameter model built on the Qwen3 architecture and paired with the Falcon-H1-0.5B-Instruct tokenizer (32K vocabulary).

  • Architecture: Qwen3 (Grouped Query Attention, RMS Normalization, Q/K Normalization, RoPE)
  • Tokenizer: Falcon-H1-0.5B-Instruct (32K vocab)
  • Parameters: 6,393,440
  • Precision: BF16
  • Format: SafeTensors
  • Vocabulary Size: 32768
  • Use Case: Lightweight model with hybrid sliding window
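
As a quick sanity check, the stated parameter count can be verified after loading. A minimal sketch, assuming the weights live at the local path used throughout this card:

from transformers import Qwen3ForCausalLM

model = Qwen3ForCausalLM.from_pretrained("./workspace/6.4m-falcon-tokenizer")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")  # expected: 6,393,440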

Configuration

  • vocab_size: 32768
  • hidden_size: 96
  • num_attention_heads: 8
  • num_key_value_heads: 4
  • num_hidden_layers: 8
  • intermediate_size: 384
  • head_dim: 128
  • max_position_embeddings: 8192
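
These values map directly onto a transformers Qwen3Config, so a model of the same shape can be instantiated from scratch. A minimal sketch; unlisted fields (rope_theta, norm epsilon, etc.) are assumed to keep their library defaults:

from transformers import Qwen3Config, Qwen3ForCausalLM

config = Qwen3Config(
    vocab_size=32768,
    hidden_size=96,
    num_attention_heads=8,
    num_key_value_heads=4,   # grouped query attention: 2 query heads share each KV head
    num_hidden_layers=8,
    intermediate_size=384,
    head_dim=128,            # set explicitly; not derived as hidden_size // num_attention_heads
    max_position_embeddings=8192,
)
model = Qwen3ForCausalLM(config)  # randomly initialized, like the shipped checkpoint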

Special Tokens

  • BOS: <|begin_of_text|> (id: 17)
  • EOS: <|end_of_text|> (id: 11)
  • PAD: <|pad|> (id: 32767)
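
The IDs can be confirmed directly against the tokenizer (a short sketch, using the same local path as the usage example below):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./workspace/6.4m-falcon-tokenizer")
print(tokenizer.bos_token, tokenizer.bos_token_id)  # <|begin_of_text|> 17
print(tokenizer.eos_token, tokenizer.eos_token_id)  # <|end_of_text|> 11
print(tokenizer.pad_token, tokenizer.pad_token_id)  # <|pad|> 32767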

Usage

import torch
from transformers import Qwen3ForCausalLM, AutoTokenizer

model = Qwen3ForCausalLM.from_pretrained("./workspace/6.4m-falcon-tokenizer")
tokenizer = AutoTokenizer.from_pretrained("./workspace/6.4m-falcon-tokenizer")

# Generate text
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Batch processing (start small)
tokenizer.padding_side = "left"  # decoder-only models should be left-padded for generation
texts = ["Hello", "How are you", "Good morning"]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))

Important Notes

  • The model uses the Qwen3 architecture with the Falcon tokenizer (32K vocabulary)
  • All token IDs must be < 32768; out-of-range IDs cause indexing/CUDA errors
  • Start with small batch sizes (1-4) and increase gradually
  • Use proper padding to prevent dimension mismatches
  • The model ships with random weights and must be trained or fine-tuned before it produces meaningful text; see the training sketch after this list
  • Compatible with Qwen3 APIs but uses the Falcon vocabulary
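
Because the checkpoint is randomly initialized, it needs training before it generates anything useful. Below is a minimal causal-LM training sketch, assuming a toy in-memory list of strings stands in for a real corpus; the transformers Trainer would work equally well:

import torch
from transformers import Qwen3ForCausalLM, AutoTokenizer

model = Qwen3ForCausalLM.from_pretrained("./workspace/6.4m-falcon-tokenizer")
tokenizer = AutoTokenizer.from_pretrained("./workspace/6.4m-falcon-tokenizer")

texts = ["Example sentence one.", "Example sentence two."]  # toy corpus (assumption)
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
assert int(batch["input_ids"].max()) < 32768  # all IDs must fit the 32K vocab

labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # ignore padding positions in the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
model.train()
for step in range(10):  # a few steps only, to illustrate the loop
    loss = model(**batch, labels=labels).loss  # built-in causal-LM shift + cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(step, loss.item())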