- Skills are just prompt conventions; the exact form may change but the substance is reasonable. MCP, eh, it’s pretty bad, I can see it vanishing.
The agent loop architectural pattern (and that’s the relevant bit) is going to continue to matter. There will be new patterns for sure, but tool calling plus while loop (which is all an “agent” is) is powerful and highly general.
- Yes, that sentence is simply untrue for, at the very least, BrE. For example: https://www.the-independent.com/news/uk/home-news/chart-show... (2015)
- LLMs speed you up more if you have an appropriate theory in greenfield tasks (and if you do the work of writing your scaffold yourself).
Brownfield tasks are harder for the LLM at least in part because it’s harder to retroactively explain regular structure in a way the LLM understands and can serialize into eg CLAUDE.md.
- Previously https://www.hackerneue.com/item?id=29921137
- If you’re using ChatGPT directly for work then I believe that you are doing it so profoundly wrong, at this point, that you’re going to make really incorrect assumptions.
As we have all observed, the models get things wrong, and if you’re wrong 5% of the time, then ten edits in you’re at 60-40. So you need to run them in a loop where they’re constantly sanity checking themselves—-linting, styling, typing and testing. In other words; calling tools in a loop. Agents are so much better than any other approach it’s comical precisely because they’re scaffolding to let models self-correct.
This is likely somewhat domain-specific; I can’t imagine the models are that great at domains they haven’t seen much code in, so they probably suck at HFT infrastructure for example, though they are decent at reading docs by this point. There’s also a lot of skill in setting up the right documentation, testing structure, interfaces, etc etc etc to make the agents more reliable and productive (fringe benefit; your LLM-wielding colleagues actually write docs now, even if they’re full of em-dashes and emoji). You also need to be willing to let it write a bunch of code, look at it, work out why it’s structurally deficient, throw it away, and build the structure you want to guide it - but the typing is essentially free, so that’s tractable. Don’t view it as bad code, view it as a useful null result.
But if you’re not using Claude Code or Codex or Roo or relatives, you’re living in an entirely different world to the people who have gone messianic about these things.
- > As much as I love the aesthetic, I'm developing a fear that they'll soon spin off into a startup with some kind of paid model, and that government websites will regress.
gov.uk got started, in part, because the 2009 financial meltdown left a lot of good startup designers and engineers with not enough to do (and made civil service jobs more attractive for a bit!)
- NNs have absolutely revolutionized systems biology (itself a John Hopfield joint, and the AlphaFold team are reasonably likely to get a Nobel for medicine and physiology, possibly as soon as 'this year') and are becoming relevant in all kinds of weird parts of solid-state physics (trained functionals for DFT, eg https://www.nature.com/articles/s41598-020-64619-8).
The idea that academic disciplines are in any way isolated from each other is nonsense. Machine learning is computer science; it's also information theory; that means it's thermodynamics, which means it's physics. (Or, rather, it can be understood properly through all of these lenses).
John Hopfield himself has written about this; he views his work as physics because _it is performed from the viewpoint of a physicist_. Disciplines are subjective, not objective, phenomena.
- > Hopfield networks and Boltzmann machines
Think of this as a Nobel prize for systems physics – essentially "creative application of statistical mechanics" – and it makes a lot more sense why you'd pick these two.
(I am a mineral physicist who now works in machine learning, and I absolutely think of the entire field as applied statistical mechanics; is that correct? Yes and no: it's a valid metaphor.)
The POS software's on GitHub: https://github.com/sde1000/quicktill