GPT-5 was the first model that occasionally produced code I could push without any changes.
Claude still tends to add "fluff" around the solution and over-engineer. It's not that the code doesn't work, it's just that it's ugly.
Interesting, I have consistently found that Codex does much better code reviews than Claude. Claude will occasionally find real issues, but will frequently bikeshed things I don't care about. Codex always finds things that I actually care about and that clearly need fixing.
I’d agree with you until Opus 4.5.
Eh, Sonnet 4.5 was better at Rust for me.
My only gripe is I wish they'd publish Codex CLI updates to Homebrew at the same time as npm :)