Managing context goes a long way, too. I clear context for every new task and keep the local context files up to date with key info to get the LLM on target quickly.
Aggressively recreating your context is still the best way to get good results from these tools, too, so it has a secondary benefit.
The other skill is knowing exactly when to roll up your sleeves and do it the old-fashioned way: which things these tools are actually useful for, and which things they aren't.
My take is that the overlap is strongest with engineering management. If you can learn how to manage a team of human engineers well, that translates to managing a team of agents well.
None of that knowledge will become useless; only the skill of working around the current limitations of agents will.
If I want to start a new task, I /clear and then tell it to re-read the CLAUDE.md document where I put all of the quick context: a description of the project, key goals, where to find key code, reminders about which tools to use, and so on. I aggressively update this file whenever I notice something it keeps forgetting or looking up. I know some people have the LLM update their context file, but I just do it myself, with seemingly better results.
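For concreteness, a primer like that might look something like this (an illustrative skeleton; the project details are made up):

    # CLAUDE.md  (illustrative example, not a real project)
    ## Project
    CLI tool that syncs invoices from a payment API into a local SQLite database.
    ## Key goals
    - Keep the sync idempotent; never insert the same invoice twice.
    ## Where to find key code
    - Sync logic: src/sync.py
    - Schema and migrations: src/schema.sql
    ## Reminders
    - Run the tests with pytest -q before calling a task done.
    - Don't reformat files unrelated to the current task.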
Using /compact burns through a lot of your usage quota and retains a lot of things you may not need. Giving it directions like “starting a new task doing ____, only keep necessary context for that” can help, but hitting /clear and having it re-read a short context primer is faster and uses less quota.
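In practice, the first message after /clear can be as short as something along the lines of "Read CLAUDE.md, then let's work on ____"; the primer carries the rest of the setup.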
I do wish that ChatGPT had a toggle next to each project file; as it is, you have to delete and re-upload files, or create separate projects, to get different combinations of files.
So if you look at the total cost of running the benchmark, it's surprisingly similar to other models -- the higher price per token is offset by the significantly smaller number of tokens required to complete a task.
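As a purely illustrative calculation (made-up numbers, not any model's real pricing): at $30 per million output tokens, a task that finishes in 20k tokens costs $0.60, while at $10 per million tokens but 70k tokens to finish the same task, it costs $0.70. The per-token premium gets swallowed by the shorter output.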
See "Cost to Run Artificial Analysis Index" and "Intelligence vs Output Tokens" here
https://artificialanalysis.ai/
...With the obligatory caveat that benchmarks are largely irrelevant to real-world tasks, and you need to test the model on your actual task to see how well it does!
I consistently run into limits with CC (Opus 4.5) -- but even though Codex seems to spend significantly more tokens, its quota limit appears to be much higher?