(Though now that I think of it, I might start interrupting people with “SUMMARIZING CONVERSATION HISTORY!” whenever they begin to bore me. Then I can change the subject.)
There are various hacks these tools take to cram more crap into a fixed-size bucket, but it’s still fundamentally different than how a person thinks.
Do you understand yourself what you just said? File is a way to organize data in memory of a computer by definition. When you write instructions to LLM, they persistently modify your prompts making LLM „remember“ certain stuff like coding conventions or explanations of your architectural choices.
> particularly if I have to do it
You have to communicate with LLM about the code. You either do it persistently (must remember) or contextually (should know only in context of a current session). So word „particularly“ is out of place here. You choose one way or another instead of bring able to just tell that some information is important or unimportant long-term. This communication would happen with humans too. LLMs have different interface for it, more explicit (giving the perception of more effort, when it is in fact the same; and let’s not forget that LLM is able to decide itself on whether to remember something or not).
> and in any case, it consumes context
So what? Generalization is an effective way to compress information. Because of it persistent instructions consume only a tiny fraction of context, but they reduce the need for LLM to go into full analysis of your code.
> but it’s still fundamentally different than how a person thinks.
Again, so what? Nobody can keep in short-term memory the entire code base. It should not be the expectation to have this ability neither it should not be considered a major disadvantage not to have it. Yes, we use our „context windows“ differently in a thinking process. What matters is what information we pack there and what we make of it.
Long term memory is its training data.
I've yet had the "forgets everything" to be a limiting factor. In fact, when using Aider, I aggressively ensure it forgets everything several times per session.
To me, it's a feature, not a drawback.
I've certainly had coworkers who I've had to tell "Look, will you forget about X? That use case, while it look similar, is actually quite different in assumptions, etc. Stop invoking your experiences there!"