kasey_junk 8 hours ago [-]
I think it really depends on how fully formed your AI workflows are. I have a very opinionated set of skills and agent files, and a harness for running prompts against both for code production.
I do head-to-head comparisons with this setup pretty regularly, and what I've found is that there isn't much difference in outcomes between the two frontier labs at equivalent model settings. It's hard to get statistically significant results on my budget and eval ability, but my anecdotal feeling is that there's as much variation within a single lab's outputs as between the labs.
Given that setup, I use Codex much more than Claude because it's more reliable.
But I believe it’s easier to go from nothing to decent with Claude.
For other stuff I use Claude.
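The harness described above, running the same prompts against both labs and comparing outcomes, could be sketched roughly like this. This is a hypothetical illustration, not the commenter's actual setup: each "model" is abstracted as a plain callable, and the stubs stand in for real Claude/Codex API calls.

```python
# Hypothetical head-to-head harness sketch. Each model is just a
# callable prompt -> completion; in real use these callables would
# wrap the Anthropic and OpenAI SDKs.
from typing import Callable, Dict, List

def run_head_to_head(
    prompts: List[str],
    models: Dict[str, Callable[[str], str]],
) -> Dict[str, List[str]]:
    """Run every prompt against every model, collecting outputs side by side."""
    results: Dict[str, List[str]] = {name: [] for name in models}
    for prompt in prompts:
        for name, ask in models.items():
            results[name].append(ask(prompt))
    return results

# Stub models standing in for the two frontier labs' APIs.
models = {
    "claude": lambda p: f"claude says: {p}",
    "codex": lambda p: f"codex says: {p}",
}
outputs = run_head_to_head(["write a binary search"], models)
# outputs["claude"] and outputs["codex"] now hold paired completions
# for the same prompt, ready for manual or automated eval.
```

The point of keeping models as callables is that swapping labs, or model settings within a lab, doesn't change the comparison loop at all.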
01jonny01 8 hours ago [-]
Claude is good for producing one-shot polished apps, but you will quickly burn through your allowance.
ChatGPT needs more prompting to get what you want, but it's nearly impossible to reach your limit.
Do you mean Claude Code? If so, that's what I use(d) primarily for development, and Claude Desktop for general chats. My issue with Opus was that every time I started a new task in Plan mode, it'd use 50k-100k tokens, and that'd be about 20% of the session limit. A bit of back and forth and it's done for most of the work day. Just not feasible at all. The tasks I wanted it to perform were fairly small and contained: "Look at these three files @@@ and add xxx to @file. DON'T read any other files. If you need more context, ask me." That worked sometimes but not always, and it still burned a lot of tokens.
pcael 22 hours ago [-]
Yes I meant Claude Code client.
Indeed, Opus is a token eater; I usually use Sonnet because of that.
khaledh 19 hours ago [-]
I use both at the same time:
- Claude Opus for general discussion, design, reviews, etc.
- Codex GPT-5.4 High for task breakdown and implementation.
I often feed their responses to each other (manual copy/paste) to validate and improve the design and/or implementation. The outcome has been better than using either one alone.
This workflow keeps Claude's usage in check (it doesn't eat as many tokens) and leverages Codex's generous usage limits. Although sometimes I run into Codex's weekly limit and need to purchase additional credits: 1,000 credits for $40, which lasts another 4-5 days (usually overlapping with my weekly refresh, so not all the credits are used up).
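The manual copy/paste loop described here, one model drafts, the other critiques, the first revises, could be scripted as a small sketch. This is purely illustrative and not the commenter's actual workflow; the `drafter`/`reviewer` callables are hypothetical stand-ins for real Claude and Codex calls.

```python
# Hypothetical sketch of the cross-review loop: feed one model's draft
# to the other as a review prompt, then fold the critique back in.
# The drafter/reviewer callables are stand-ins for real API calls.
from typing import Callable

def cross_review(
    task: str,
    drafter: Callable[[str], str],
    reviewer: Callable[[str], str],
) -> str:
    """One round of draft -> critique -> revision across two models."""
    draft = drafter(f"Design: {task}")
    critique = reviewer(f"Review this design and list problems:\n{draft}")
    return drafter(
        f"Revise the design below using this critique.\n"
        f"Design:\n{draft}\nCritique:\n{critique}"
    )

# Stubs standing in for the real models.
claude = lambda p: f"[claude] {p[:20]}"
codex = lambda p: f"[codex] {p[:20]}"
final = cross_review("token-efficient cache", claude, codex)
```

Keeping the expensive model as the drafter and the cheaper (or higher-limit) one as the reviewer matches the token-budget point above: the revision round costs one extra Claude call, not a whole new planning session.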