"it took Claude Fable 2.5 hours to write a fused megakernel which delivers a >18x speed-up over a PyTorch baseline now please recall that: - Fable is not the full Mythos model - Anthropic can spend much more than just 2.5h and ~550k tokens on this - they probably have better harnes…" — Lisan al Gaib

Claude Fable 5 [max] wrote the first genuine (and fastest) megakernel ever submitted to KernelBench-Mega.

It was tested on: Kimi-Linear W4A16 batch-1 decode for RTX PRO 6000 Blackwell. Every prior model "won" it with a multi-kernel Triton pipeline that fails our — Elliot Arledge

Source: https://x.com/elliotarledge/status/2072814573753975266

…harnesses. Anthropic is definitely doing some sweet autoresearch internally. Especially architecture research bros are probably so happy at Anthropic. Imagine vibe-testing a new arch / tweak some arch and wanting to test it in a semi-optimized way. Just let 10T Mythos cook for a day. — Lisan al Gaib

Source: https://x.com/scaling01/status/2072829688569860098