In other words, you can’t vibe code in an environment where evaluating “does this code work” is an existential question. This is the case where 7k LOC/day becomes terrifying.
Until we get much better at automatically proving the correctness of programs, we will need review.
For the record, there are examples where human code review and design guidelines can lead to very low-bug code. NASA published their internal guidelines for producing safety-critical code[1]. The problem is that the development cost of software when using such processes is too high for most companies, and most companies don't actually produce safety-critical software.
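For a flavor of what those guidelines look like in practice, here is a tiny sketch in the spirit of two of the Power of 10 rules (a fixed upper bound on every loop, assertions on inputs); sum_readings and kMaxReadings are made-up names for illustration:

    #include <cassert>
    #include <cstddef>

    // Rule 2: every loop has a statically known upper bound.
    // Rule 5: assert what the caller is required to guarantee.
    constexpr std::size_t kMaxReadings = 64;

    int sum_readings(const int* readings, std::size_t count) {
        assert(readings != nullptr);
        assert(count <= kMaxReadings);  // bound is checked, not assumed

        int total = 0;
        for (std::size_t i = 0; i < count && i < kMaxReadings; ++i) {
            total += readings[i];
        }
        return total;
    }

Applying that level of discipline to every function in a codebase is exactly where the development cost comes from.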
My experience with the vast majority of LLM code submitted to projects I maintain is that it has subtle bugs that I managed to find through fairly cursory human review. The Copilot code review feature on GitHub also tends to miss actual bugs and report nonexistent bugs, making it worse than useless. So in my view, the death of the benefits of human code review has been wildly exaggerated.
[1]: https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Dev...
You're right: human review and thorough design are a poor approximation of proving assumptions about your code. Yes, bugs still exist. No, you won't be able to prove the correctness of your code.
However, I can pretty confidently assume that malloc will work when I call it. I can pretty confidently assume that my thoroughly tested linked list will work when I call it. I can pretty confidently assume that following RAII will avoid most memory leaks.
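To make the RAII point concrete, here is a minimal sketch of the kind of wrapper being relied on (a file handle rather than raw memory; File and write_header are made-up names for illustration):

    #include <cstdio>
    #include <stdexcept>

    // RAII: the destructor releases the resource, so every exit path
    // out of the scope (early return, exception) closes the file.
    class File {
    public:
        File(const char* path, const char* mode)
            : handle_(std::fopen(path, mode)) {
            if (!handle_) throw std::runtime_error("fopen failed");
        }
        ~File() { std::fclose(handle_); }

        File(const File&) = delete;             // keep ownership unique
        File& operator=(const File&) = delete;

        std::FILE* get() const { return handle_; }

    private:
        std::FILE* handle_;
    };

    void write_header(const char* path) {
        File f(path, "w");                      // acquired here
        std::fputs("# generated\n", f.get());
    }                                           // released here, no fclose to forget

Once that invariant is written and reviewed once, every caller gets it for free; that's the kind of compounding guarantee I mean below.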
Not all software needs meticulous careful human review. But I believe that the compounding cost of abstractions being lost and invariants being given up can be massive. I don't see any way to attempt to maintain those other than human review or proven correctness.
"Not all software needs meticulous careful human review" is exactly the point. The question of exactly what software needs that kind of review is one whose answer I expect to change over the next 5-10 years. We are already at the point where it's so easy to produce small but highly non-trivial one-off applications that one needn't examine the code at all -- I completely agree with the above poster that we're rapidly discovering new examples of software development where output-verification is all you need, just like right now you don't hand-inspect the machine code generated by your compiler. The question is how far that will be able to go, and I don't think anybody really knows right now, except that we are not yet at the threshold. You keep bringing up examples where the stakes are "existential", but you're underestimating how much software development does not have anything close to existential stakes.
You know where this is going. I asked Claude if audio plugins were well represented in its training data, it said yes, off I went. I can’t review the code because I lack the expertise. It’s all C++ with a lot of math and the only math I’ve needed since college is addition and calculating percentages. However, I can have intelligent discussions about design and architecture and music UX. That’s been enough to get me a functional plugin that already does more in some respects than the original. I am (we are?) making it steadily more performant. It has only crashed twice and each time I just pasted the dump into Claude and it fixed the root cause.
Long story short: if you can verify the outcome, do you need to review the code? It helps that no one dies or gets underpaid if my audio plugin crashes. But still, you can’t tell me this isn’t remarkable. I think it’s clear there will be a massive proliferation of niche software.