Progression

VibeThinker is a 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO

Jun 23, 2026 14:18
arxiv.org