Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 3 days ago • 39
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 6 days ago • 223
MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion parameter reasoning models • 10 items • Updated 15 days ago • 27
Article huggingface_hub v1.0: Five Years of Building the Foundation of Open Machine Learning +2 Oct 27 • 69
Jan-v2-VL Collection Jan-v2-VL: an 8B VLM focused on reliable, many-step task execution. • 6 items • Updated 24 days ago • 36
Article Optimizing Mixture-of-Experts Training: A Cost-Effective, Two-Sided Approach Sep 30 • 3