Comparing the Latest Coding Models
A breakdown of the strengths and weaknesses of Opus 4.6, Codex 5.3, Sonnet 4.6, and Composer 1.5 for different development tasks.
Choosing the "best" AI model depends entirely on the task: Opus 4.6 dominates frontend creativity, Codex 5.3 is unmatched for massive refactors and backend reliability, and Sonnet 4.6 remains the best daily driver. Cursor's Composer 1.5 is the cheap option I keep reaching for on mobile UI/UX.
I've been rotating through Opus 4.6, Codex 5.3, Sonnet 4.6, and Cursor's Composer 1.5 on real production work, so these are impressions from shipping code, not leaderboard scores. Here's where each one actually earns a slot in my workflow.
Opus 4.6 is what I open for frontend. It builds websites that don't look like they were assembled from a component catalog: spacing, typography, and modern patterns land without me fighting the model. Where it falls down is stamina. In long-context sessions and multi-step deploy runs, it loses the thread.
Codex 5.3 is the one I trust for ugly, long-running work. It's noticeably better than Opus 4.6 at holding complex state across a big change and actually getting things deployed. Massive refactors, gnarly infrastructure migrations—that's Codex. I don't reach for it first for UI polish; I reach for it when the job is "make production not lie."
Sonnet 4.6 is my default for everyday engineering: business logic, PR review, the stuff that doesn't need a flagship model. Haiku handles the cheap chores—commit messages, running tests—where latency matters more than brilliance.
Composer 1.5 in Cursor has been the surprise. It costs a fraction of the big models, and on iOS and Android UI work it's at least as good as Opus and Codex for me—sometimes better, because it's fast enough that I'll actually iterate. "Feels fast" sounds silly until you're tuning layout on a phone build for the fifth time in an hour.
Match the model to the phase you're in: Opus for screens, Codex for refactors and infra, Sonnet for the daily grind, Composer when mobile UI needs speed more than swagger.
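If you want that mapping written down somewhere your tooling can read it, a minimal sketch might look like the following. The task categories, the `pick_model` helper, and the Sonnet fallback are assumptions made up for illustration, and the strings are the display names used in this post, not real API model identifiers.

```python
# Hypothetical routing table. Keys are made-up task categories; values are
# display names from this post, NOT real API model identifiers.
ROUTES = {
    "frontend": "Opus 4.6",        # screens, layout, typography
    "refactor": "Codex 5.3",       # large, long-running changes
    "infra": "Codex 5.3",          # migrations and deploys
    "daily": "Sonnet 4.6",         # business logic, PR review
    "mobile_ui": "Composer 1.5",   # fast iteration on iOS/Android UI
    "chore": "Haiku",              # commit messages, running tests
}

# Assumption: anything unrecognized falls back to the daily driver.
DEFAULT_MODEL = "Sonnet 4.6"


def pick_model(task_kind: str) -> str:
    """Return the model label for a task category, or the default."""
    return ROUTES.get(task_kind.strip().lower(), DEFAULT_MODEL)


if __name__ == "__main__":
    for kind in ("frontend", "infra", "mobile_ui", "something else"):
        print(f"{kind}: {pick_model(kind)}")
```

A plain dictionary keeps the choice easy to change: when your own experience disagrees with mine, you edit one line instead of a habit.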