There can be diminishing returns, but every time I've used Claude Code on a real project I've found myself repeating certain things over and over, and interrupting tool usage, until I put them in the Claude notes file.
You shouldn’t try to put everything in there all the time, but putting key info in there has been very high ROI for me.
Disclaimer: I’m a casual user, not a hardcore vibe coder. Claude seems much more capable when you follow the happy path of common projects, but gets constantly turned around when you try to use new frameworks and tools and such.
I like to write my CLAUDE.md directly, with just a couple paragraphs describing the codebase at a high level, and then I add details as I see the model making mistakes.
I like the sound of this but what technique do you use to maintain consistency across both views? Do you have a post-modification script which will strip comments and extraneous empty space after code has been modified?
I first "discovered" it because I repeatedly found LLM comments poisoned my code base over time and linited it's upper end of ability.
It's easy to try: just drop the comments around a problem and see the difference. I was previously doing that and then manually updating the original.
1. Run the SOT (source of truth) through a processor that strips comments and extra whitespace (see the sketch after this list). Publish the result to a feature branch.
2. Point Claude at the feature branch and prompt for whatever changes you need. The work runs against the minimalist feature branch, and the new code gets committed with comments and readable spacing.
3. Verify code changes meet expectations.
4. Diff the changes against the minimal version, and merge only that new code into the SOT.
Repeat.
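
For concreteness, here is a minimal sketch of what the step-1 processor could look like, assuming a Python codebase. The names are illustrative, not the commenter's actual scripts; it uses the standard tokenize module so that '#' characters inside strings are left alone.

    # Minimal sketch of the step-1 "processor", assuming a Python codebase.
    # Uses the stdlib tokenize module so '#' inside strings is left alone.
    import io
    import sys
    import tokenize

    def strip_comments(source: str) -> str:
        tokens = tokenize.generate_tokens(io.StringIO(source).readline)
        kept = [tok for tok in tokens if tok.type != tokenize.COMMENT]
        code = tokenize.untokenize(kept)
        # Drop lines that are now blank, plus any trailing whitespace.
        lines = [ln.rstrip() for ln in code.splitlines() if ln.strip()]
        return "\n".join(lines) + "\n"

    if __name__ == "__main__":
        # Usage: python strip_comments.py file1.py file2.py ...
        for path in sys.argv[1:]:
            with open(path) as f:
                stripped = strip_comments(f.read())
            with open(path, "w") as f:
                f.write(stripped)

Run it over the working copy on the feature branch before pointing Claude at it; the diff in step 4 then only carries the new, commented code back into the SOT.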
1. Run into a problem you and the AI can't solve.
2. Drop all comments.
3. Restart the debug/design session.
4. Solve it and save the results.
5. Revert the code to the commented version and port the update in.
(A sketch of steps 2 and 5 follows below.)
If that still doesn't work, add step 2.5: drop all unrelated code from the context.
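
If you want to script that cycle, here is a hedged sketch, assuming git, a committed working tree, and the strip_comments() helper from the earlier sketch saved as strip_comments.py; the function names are hypothetical.

    # Hypothetical driver for the drop-comments cycle above (steps 2 and 5).
    import pathlib
    import subprocess

    from strip_comments import strip_comments  # the earlier sketch

    def strip_tree(root: str = ".") -> None:
        """Step 2: strip comments from every tracked .py file in place."""
        out = subprocess.run(
            ["git", "ls-files", "*.py"],
            capture_output=True, text=True, check=True, cwd=root,
        ).stdout
        for name in out.splitlines():
            path = pathlib.Path(root, name)
            path.write_text(strip_comments(path.read_text()))

    def restore_tree(root: str = ".") -> None:
        """Step 5: discard the stripped view and return to the commented sources."""
        subprocess.run(["git", "checkout", "--", "."], cwd=root, check=True)

Typical session: strip_tree(), run the debug/design session, save the fix out of the conversation, restore_tree(), then port the fix into the commented code so the two views stay consistent.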
The more data you load into the context, the more you dilute attention.
I'm skeptical this is a valid generalization over what was directly observed. [1] We would learn more if they wrote a more detailed account of their observations. [2]
I'd like to draw a parallel to another area of study possibly unfamiliar to many of us. Anthropology faced similar issues until Geertz's 1970s reform emphasized "thick description" [3] meaning detailed contextual observations instead of thin generalization.
[1]: I would not draw this generalization. I've found that adding guidelines (on the order of 10k tokens) to my CLAUDE.md has been beneficial across all my conversations. At the same time, I have not constructed anything close to a study of variations of my approach. And the underlying models are a moving target. I will admit that some of my guidelines were added to address issues I saw over a year ago and may be nothing more than vestigial appendages nowadays. This is why I'm reluctant to generalize.
[2]: What kind of "hard problems"? What is meant by "more" exactly? (Going from 250 to 500 tokens? 1000 to 2000? 2500 to 5000? &c) How much overlap exists between the CLAUDE.md content items? How much ambiguity? How much contradiction?
Even now, if I am working on REALLY hard problems, I will still manually copy and paste code sections out for discussion and algorithm design. Depends on complexity.
This is why I still believe OpenAI's o1-pro was the best model I've ever seen. The amount of compute you could throw at a problem was absurd.
How do you practically achieve this? Honest question. Thanks
1. Turn off 2. Code 3. Turn on 4. Commit
I also delete all LLM comments; they 100% poison your codebase.
> 1. Turn off 2. Code 3. Turn on 4. Commit
What does it mean "turn off" / "turn on"?
Do you have a script to strip comments?
Okay, after the comments were stripped, does this become the common base for 3-way merge?
After modification of the code stripped of the comments, do you apply 3-way merge to reconcile the changes and the comments?
This seems like a lot of work. What is the benefit? I mean a demonstrable benefit.
How does it compare to instructing through AGENTS.md to ignore all comments?
> 1. Turn off 2. Code 3. Turn on 4. Commit
So can you describe your "turn off" / "turn on" process in practical terms?
Asking simply because saying "Custom scripts" is similar to saying "magic".
What did your comparison process look like? It feels intuitively accurate and validates my anecdotal impression but I'd love to hear the rigor behind your conclusions!
It's also easy to notice that LLMs create garbage comments that get worse over time. I started deleting all comments manually, alongside manual snippet selection, to get maximum performance.
Then I started routinely deleting all comments before a big problem-solving session. I was doing it enough to build some automation.
Maybe high quality human comments improve ability? Hard to test in a hybrid code base.
Seen from a human perspective, the comments are there to speed up understanding of the code.
I have found that more context, comments, and info damage quality on hard problems.
For a long time now, I've actually kept two views of my code.
1. The raw code, with no empty space or comments.
2. The code with comments.
I never give the second to my LLM. The more context you give, the lower its upper end of quality becomes. This is just a habit I've picked up using LLMs every day, hours a day, since GPT-3.5; it allows me to reach farther into extreme complexity.
I suppose I don't know what most people are using LLMs for, but the higher the complexity of your work, the less noise you should inject into it. It's tempting to add massive amounts of context, but I've routinely found that fails at the higher levels of coding complexity and uniqueness. It was more apparent in earlier models; newer ones will handle tons of context, but you just won't be able to get those upper ends of quality.
The compute-to-information ratio is all that matters. Compute is capped.