- jes5199: Cursor makes it easier to watch what the model is doing and to make edits at the same time. I find it useful at work, where I need to be able to justify every change in a code review. It’s also great for getting a feel for what the models are capable of - like, using Cursor for a few months made it easier to use Claude Code effectively
- ride the BART
- I think this might be the way forward, Claude is great at project managing.
I’m already telling Claude to ask Codex for a code review on PRs. Another fun pattern I found: you can give the web version of Codex an open-ended task like “make this method faster”, hit the “4x” button, and end up with four different pull requests attacking the problem in different ways. Then ask Claude to read the open PRs and make a 5th one that combines the approaches. This way Codex does the hard thinking but Claude does the glue
- I find that ChatGPT’s Codex reviews - which can also be set up to happen automatically on all PRs - seem smarter than Copilot’s, and make fewer mistakes. But these things change fast, maybe Copilot caught up and I didn’t notice
- can it read code review comments? I've been finding that having claude write code but letting codex review PRs is a productive workflow, claude code is capable of reading the feedback left in comments and is pretty good at following the advice.
- yesss, and OpenAI tried this first when they were going to do a “GPT store”. But REST APIs tend to be complicated because they’re supporting apps. MCP, when it works, is very simple functions
in practice it seems like command line tools work better than either of those approaches
- increasingly, the automated systems have access to the original ticket or bug report, and maybe even the conversation while implementation is happening. They can record the “why”
- I think that’s a tooling problem. Maybe we do end up running a lot more versions of things in the future. If we believe that code has gotten cheaper, it should be easier to do so.
- I love it when I have a tool that’s “done” but the software I work on in my career is never, ever done. It’s almost like there’s two different things we call “software”. There are tools like, idk, “curl” where you can use an old version and be happy. And there are interactive organizations in the world, like, eg, Hacker News, which mutate as the community’s needs change
- for now yes absolutely. but I’m already hearing rumblings that some people are having luck letting multiple agents edit the same directory simultaneously instead of putting changes through PR merge hell. It just needs coordination tools, see https://github.com/Dicklesworthstone/mcp_agent_mail as one (possibly insane) prototype
for example it’s not out of the question that we could end up with tooling that does truly continuous testing and integration, automatically finding known-good deployments among a continuously edited multiplayer codebase
we’d have to spend a lot more energy on specifications and acceptance testing, rather than review, but I think that’s inevitable - code review can’t keep up with how fast code gets written now
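To make the “automatically finding known-good deployments” idea a bit more concrete, here’s a toy sketch. Everything in it is made up for illustration: `Snapshot`, `latest_known_good`, and the `BROKEN` marker check are hypothetical stand-ins, and a real system would run an actual acceptance suite against each state of the shared codebase.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass
class Snapshot:
    """One point-in-time state of a continuously edited codebase (hypothetical)."""
    revision: int
    files: Dict[str, str]  # path -> file contents


def latest_known_good(
    snapshots: Iterable[Snapshot],
    passes: Callable[[Snapshot], bool],
) -> Optional[Snapshot]:
    """Return the most recent snapshot that passes the acceptance check."""
    good = None
    for snap in snapshots:
        if passes(snap):
            good = snap  # remember the newest passing state seen so far
    return good


def acceptance_test(snap: Snapshot) -> bool:
    # Stand-in check (an assumption): a snapshot "passes" when no file
    # contains the marker text BROKEN. Real checks would be spec-driven tests.
    return all("BROKEN" not in text for text in snap.files.values())


history = [
    Snapshot(1, {"app.py": "print('hi')"}),
    Snapshot(2, {"app.py": "print('hi') BROKEN"}),
    Snapshot(3, {"app.py": "print('hello')"}),
    Snapshot(4, {"app.py": "BROKEN"}),
]

best = latest_known_good(history, acceptance_test)
print(best.revision)  # 3: the newest state that still passes
```

The point of the sketch is that the deploy target is chosen by the acceptance check, not by a human approving a merge, which is why the specs and acceptance tests end up carrying the weight.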
- I could imagine that in ten years git will feel strangely slow and ceremonial. Why not just continuously work and continuously deploy live-edited software?
- I just remembered, in those days, there was an alias called `ubygems` so you could pass `-rubygems` (ie, `-r` with `ubygems` as the argument) on the command line as if it was a first-class feature
it's so typical of ruby culture "haha, what if I do this silly thing" and then that gets shipped to production
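For anyone who hasn’t seen the trick: it works because a short option can have its argument glued directly onto it. A rough illustration using Python’s `getopt` (this is not Ruby’s actual option parser, and `required_libraries` is a made-up helper; the `"r:"` spec just mimics an `-r` flag that takes a library name):

```python
import getopt


def required_libraries(argv):
    """Return the library names requested via -r, in order."""
    # "r:" declares a short option -r that takes an argument. getopt accepts
    # the argument either attached (-rubygems) or separate (-r ubygems).
    options, _rest = getopt.getopt(argv, "r:")
    return [value for flag, value in options if flag == "-r"]


print(required_libraries(["-rubygems"]))               # ['ubygems']
print(required_libraries(["-r", "json", "script.rb"]))  # ['json']
```

So `-rubygems` parses as “require `ubygems`”, and the alias made that name resolve to the real thing.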
- it’s an america thing
- huh okay, so, prediction: similar to how interpreted code eventually was given JIT so that it could be as fast as compiled code, eventually the LLMs will build libs of disposable helper functions as they work, which will look a lot like “writing code”. but we’ll stop thinking about it that way
- yeah why would anyone want to run code on a website
- no, emulated vim motions annoy me because they’re always slightly wrong, especially for the interesting stuff like multi-line selections. Or, worse, I’ve got stuff in my muscle memory from the `surround.vim` plugin that isn’t emulated. Easier to just open vim in a terminal if I want to do surgery on a file.
- I used vim for decades and the sensation is that I can hold the code in my hands and manipulate it directly. Other editors feel like I’m wearing thick gloves, can’t move correctly.
but it’s like learning to play the piano, it only feels natural after years of practice
nowadays I’m faster with Cursor so it doesn’t matter as much
- huh, my current ranking is:
1. claude code CLI, generally works, great tool use
2. codex on the web, feels REALLY smart, but can’t use tools
3. codex CLI, still smarter than claude but less situational awareness
4. codex via iphone app, buggier than the web app
5. claude code on the web, worst of all worlds
- they build them but they’re mostly not running them, utilization numbers keep falling. It’s either a central-planning failure or some kind of hedge
- ohh so basically an ESP32 can be a keyboard… can _two_ ESP32s be two hands of a single keyboard?