Comparing the Latest Coding Models
A breakdown of the strengths and weaknesses of Opus 4.6, Codex 5.3, Sonnet 4.6, and Composer 1.5 for different development tasks.
The landscape of AI coding models is constantly shifting. As a staff engineer, I spend a lot of time evaluating which tools actually deliver results in a production environment. I have been using several of the latest coding models recently and want to share my thoughts on their strengths and weaknesses. The models I am looking at are Opus 4.6, Codex 5.3, Sonnet 4.6, and Composer 1.5.
Opus 4.6 has been a standout for frontend work. It is surprisingly good at building websites and crafting designs that look like the work of a human designer. It seems to understand aesthetics and modern web patterns intuitively. However, it can struggle with extremely long contexts or multi-step deployment operations, where it sometimes loses track of the overarching goal.
When it comes to long-running tasks, Codex 5.3 is incredibly impressive. It is noticeably better than Opus 4.6 at keeping track of complex state over time and making sure that things actually work and deploy correctly. If I need a model to grind through a massive refactor or a tricky infrastructure migration, Codex 5.3 is my go-to. It is less creative with UI, but in my experience it is unmatched for backend reliability.
For daily tasks and normal development workflows, Sonnet 4.6 does a great job. It strikes a good balance between speed and capability. I typically rely on Sonnet for writing standard business logic or reviewing pull requests. I still use the smaller Haiku model for fast, low-complexity things like generating commit messages and running tests.
Finally, Composer 1.5 inside Cursor is blowing me away right now. It operates at a fraction of the cost of the bigger models. Despite the price difference, its ability to deliver UI and UX changes on iOS and Android apps is at least as good as, if not better than, that of Opus 4.6 and Codex 5.3. It feels incredibly fast and fits neatly into the mobile development workflow.
Choosing the right model is really about matching the tool to the specific phase of development you are in.
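To make that matching concrete, here is a minimal sketch of how I might write the preferences above down as a lookup table. The task categories and the `pick_model` helper are my own hypothetical names, and the strings are display labels taken from this post, not real API model identifiers.

```python
from enum import Enum


class Task(Enum):
    """Hypothetical task categories, loosely following the sections above."""
    FRONTEND_DESIGN = "frontend_design"
    LONG_RUNNING = "long_running"
    DAILY_DEV = "daily_dev"
    QUICK_CHORE = "quick_chore"
    MOBILE_UI = "mobile_ui"


# Display labels only (assumption: these are not real API model IDs).
MODEL_FOR_TASK = {
    Task.FRONTEND_DESIGN: "Opus 4.6",   # websites and visual design
    Task.LONG_RUNNING: "Codex 5.3",     # big refactors, infra migrations
    Task.DAILY_DEV: "Sonnet 4.6",       # business logic, PR review
    Task.QUICK_CHORE: "Haiku",          # commit messages, running tests
    Task.MOBILE_UI: "Composer 1.5",     # iOS and Android UI/UX changes
}


def pick_model(task: Task) -> str:
    """Return the model label I would reach for first for this kind of task."""
    return MODEL_FOR_TASK[task]
```

For example, `pick_model(Task.LONG_RUNNING)` returns `"Codex 5.3"`. The table reflects my own preferences rather than a benchmark, so treat it as a starting point to adjust.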