"Fable 5 is back and we’ve got results for the re-released version on APEX-SWE. While it did not perform as well as its earlier version from June, the model still significantly outperforms Opus 4.8. Fable 5 (June): 65.5% Pass@1 Fable 5 (July): 54.8% Pass@1 Opus 4.8: 45.3% Pass@1 This..." — Mercor

Fable 5 is back and we’ve got results for the re-released version on APEX-SWE.

While it did not perform as well as its earlier version from June, the model still significantly outperforms Opus 4.8.

Fable 5 (June): 65.5% Pass@1
Fable 5 (July): 54.8% Pass@1
Opus 4.8: 45.3% Pass@1

This re-release scored 10.7 points below the original Fable 5, but it still beat Opus 4.8 by 9.5 points. APEX-SWE evaluates AI models across two areas of software engineering work: Integration and Observability.
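The point gaps follow directly from the three Pass@1 figures in the post. A minimal Python sketch of that arithmetic is below; the `scores` dictionary and the `point_gap` helper are illustrative names, not anything published by Mercor, and the numbers are only the ones quoted here.

```python
# Pass@1 scores on APEX-SWE as quoted in the post (in percent).
scores = {
    "Fable 5 (June)": 65.5,
    "Fable 5 (July)": 54.8,
    "Opus 4.8": 45.3,
}


def point_gap(a: str, b: str) -> float:
    """Percentage-point difference between two quoted scores (a minus b)."""
    # Round to one decimal to hide floating-point noise in the subtraction.
    return round(scores[a] - scores[b], 1)


# Hypothetical variable names, chosen for this illustration only.
drop_vs_june = point_gap("Fable 5 (June)", "Fable 5 (July)")
lead_vs_opus = point_gap("Fable 5 (July)", "Opus 4.8")

print(f"July re-release vs June release: -{drop_vs_june} points")
print(f"July re-release vs Opus 4.8: +{lead_vs_opus} points")
```

These are differences in percentage points, not relative changes: the re-release is lower than the June score by 10.7 points, which is a larger share of the June score than 10.7% would suggest.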

Here is how the Fable 5 re-release performed in both areas compared to the June release.

Integration
Fable 5 (June): 61.33%
Fable 5 (July): 59.33%

Observability — Mercor

Source: https://x.com/mercor_ai/status/2073080728074727485