view article Article Nemotron 3 Nano \- A new Standard for Efficient, Open, and Intelligent Agentic Models 2 days ago • 73
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published 24 days ago • 264
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning Paper • 2510.25992 • Published Oct 29 • 45
view article Article Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs +7 Apr 29 • 43
view article Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Feb 11 • 92
view article Article From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate +2 Jun 13, 2024 • 61