- If you read the post, he didn't ask it to delete his home directory. He misread the command it generated and approved it when he shouldn't have.
That's exactly the kind of non-determinism I'm talking about. If he'd just left the agent to its own devices, the exact same thing would have happened.
Now you may argue this highlights that people make catastrophic mistakes too, but I'm not sure I agree.
Or at least, they don't often make that kind of mistake. I'm not saying they don't make any catastrophic mistakes (they obviously do).
We know people tend to click "accept" on these kinds of permission prompts after only a cursory read of what they're approving. And the more of these prompts you get, the more likely you are to just click "yes" to get through them.
If anything, this perfectly highlights some of the ironies referenced in the post itself.
- Similar here, but most of what I've removed was pointless AI-generated code that made it through lazy reviewers.
So much dead and useless code generated by these tools... and tens of thousands of lines of worthless tests.
Honestly, I don't mind it that much... my changed-lines count is through the roof relative to my peers, and now that stack ranking is back...
- The issue is LLMs are, by design, non-deterministic.
That means that, with the current technology, there can never be a deterministic agent.
Now obviously, humans aren't deterministic either, but the error bars are a lot closer together than they are with LLMs these days.
An easy example to point at is the coding agent that removed someone's home directory, which was circulating recently. I'm not saying a human has never done that, but it's far less likely because it's so far outside the realm of normal operations.
So as of today, we need humans in the loop, and this is understood by the people making these products. That's why they have all those permission prompts for you to accept/run commands and all of that (rough sketches of both the sampling and the prompt gate below).
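To make the non-determinism point concrete, here's a toy sketch (made-up logits, nothing like a real model's vocabulary size) of the temperature sampling step most LLM serving stacks run by default. Any temperature above zero leaves several tokens in play, so the same prompt can come back different on every run:

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    // Hypothetical logits for three candidate next tokens; real models
    // score tens of thousands, but the sampling step is the same.
    std::vector<double> logits = {2.1, 1.9, 0.3};
    const double temperature = 0.8;

    // Softmax weights with temperature; discrete_distribution normalizes.
    std::vector<double> weights;
    for (double l : logits) weights.push_back(std::exp(l / temperature));

    std::mt19937 gen{std::random_device{}()};
    std::discrete_distribution<int> pick(weights.begin(), weights.end());

    // Same "prompt", five draws, potentially five different tokens.
    for (int i = 0; i < 5; ++i) std::printf("%d ", pick(gen));
    std::printf("\n");
}
```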
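And here's a minimal sketch of the human-in-the-loop gate itself; the names and the proposed command are illustrative, not any vendor's actual implementation:

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Nothing the agent proposes runs until a person approves it.
bool human_approves(const std::string& command) {
    std::cout << "Agent wants to run: " << command << "\nAllow? [y/N] ";
    std::string answer;
    std::getline(std::cin, answer);
    return answer == "y" || answer == "Y";
}

int main() {
    // Stand-in for whatever the agent generated; this prompt is exactly
    // where a cursory read lets a destructive command slip through.
    const std::string proposed = "ls -la /tmp";
    if (human_approves(proposed)) {
        std::system(proposed.c_str());
    } else {
        std::cout << "Rejected.\n";
    }
}
```

Of course, as the home-directory story shows, the gate is only as strong as the attention of the person clicking through it.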
- You're stretching really hard here to try to rationalize your position.
First of all, I pick the hardware and operating systems I support. I can make those things requirements when I need to.
But when you boil your argument down, it's that because one thing may introduce non-determinism, any degree of non-determinism is acceptable.
At that point we don't even need LLMs. We can just have the computer do random things.
It's just a rehash of infinite monkeys with infinite typewriters, which is ridiculous.
- > nondeterministic programming language that would allow your mom to achieve SOTA results
I actually think it's great for giving non-programmers the ability to write programs that solve basic problems. That's really cool, and it's pretty darn good at it.
I would dispute that you get SOTA results, though.
That has never been my personal experience. And given that we don't see a large increase in innovative companies spinning up now that this technology is a few years old, I doubt it's the experience of most users.
> The whole "it's a bad language" refrain feels half-baked when most of us use relatively high level languages on non-realtime OSes that obfuscate so much that they might as well be well worded prompts compared to how deterministic the underlying primitives they were built on are... at least until you zoom in too far.
Obfuscation and abstraction are not the same thing. The other core differences are precision and determinism, both of which are lacking with LLMs.
- This is the only approach that seems even remotely reasonable.
“Prompt engineering” just seems dumb as hell. It’s literally just an imprecise, nondeterministic programming language.
A couple of years ago, we all would have said that was a bad language and moved on.
- If we stopped using the products of every company whose CEO lied about their products, we’d all be sitting in caves staring at the dirt.
- While no one has explicitly said that, it is the implied justification for rewriting so much stuff in Rust.
- Just because it doesn’t leave the cycle doesn’t mean it’s not an issue. Where it comes back down matters, and as climate change makes wet places wetter and dry places drier, the water ends up less evenly distributed.
That said, the water issue is overblown. Most of the water in these calculations comes from power generation (which uses a ton) and is non-potable.
The potable water consumed is not zero, but it’s something like 15% of the total.
The big issue is power, and the fact that most of it comes from fossil fuels.
- Any GC pause is unacceptable if your goal is predictable throughput and latency.
Modern GCs can be pauseless, but either way you’re spending CPU on garbage collection and not servicing requests/customers.
As for C++, std::unique_ptr has no ref counting at all.
shared_ptr does, but that’s why you avoid it at all costs if you need to move things around. You only pay the cost when copying the shared_ptr itself, and you almost never need a shared_ptr anyway; even when you do, you can avoid copying it in the hot path (sketch below).
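A quick illustration of the difference; Widget and the loop are made up, but the ownership semantics are standard C++:

```cpp
#include <memory>
#include <utility>

struct Widget { int payload = 0; };

// unique_ptr: exclusive ownership, zero reference counting; handing it
// off is a move, not an atomic operation.
void take_unique(std::unique_ptr<Widget> w) { w->payload = 1; }

// shared_ptr: copying one bumps an atomic refcount, so the hot path
// takes a const reference and never copies.
void hot_path(const std::shared_ptr<Widget>& w) { ++w->payload; }

int main() {
    auto u = std::make_unique<Widget>();
    take_unique(std::move(u));   // ownership transferred, no refcount touched

    auto s = std::make_shared<Widget>();
    for (int i = 0; i < 1000000; ++i)
        hot_path(s);             // no copies, so no atomic increments
}
```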
- > TDD doesn’t ensure the code is maintainable, extendable, follows best practices, etc, and while AI might write some code
None of that matters if it’s not a person writing the code.
- If it’s decrypted, you don’t need to steal it from RAM.
You just do normal file operations and copy the files
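Concretely, something like this (illustrative paths) is all it takes once the volume is mounted in the clear:

```cpp
#include <filesystem>

int main() {
    namespace fs = std::filesystem;
    // A process with read access just walks the tree and copies it out;
    // no memory forensics required.
    fs::copy("/home/victim/Documents", "/tmp/exfil",
             fs::copy_options::recursive);
}
```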
- The blog post and several anecdotes in the comments prove otherwise
- There’s nothing trivial about running or scaling an LDAP server.
LDAP is also not Active Directory; it’s one very small part of it.
- Comparing Postfix/Dovecot to Exchange is grossly misunderstanding what’s happening.
If you’re using Exchange/Outlook, you’re using Active Directory.
The only real “alternative” is the reimplementation in Samba 4, and calling that an alternative is a bit of a stretch. It barely scales to one user, let alone the millions AD can handle.
- It’s also in an entirely different part of the Moon, reached from an entirely different orbit.
It’s like saying visiting the Mariana Trench and Everest are the same because they’re both on Earth.
- I don’t work for AWS, but for a different cloud provider, so this is not a description of this incident, just an example of the kind of thing that can happen.
One particular “DNS” issue that caused an outage was actually a bug in software that monitors health checks.
It would actively monitor all servers for a particular service (keeping itself up to date with what was deployed) and update DNS based on those checks.
So when a server failed its health checks, it would get removed from DNS within a few milliseconds.
Then the bug gets deployed to the health check service, and all of a sudden users can’t resolve DNS names because everything is marked unhealthy and removed from DNS.
So not really a “DNS” issue, but it looks like one to users (rough sketch below).
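A toy reconciliation loop, purely illustrative and not anyone's actual system, showing how a broken health checker reads as a DNS outage:

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>

// Whether a server answers its health check. The hypothetical bug:
// the checker itself breaks and reports every server as unhealthy.
bool probe(const std::string& /*server*/) { return false; }

int main() {
    std::set<std::string> deployed = {"10.0.0.1", "10.0.0.2", "10.0.0.3"};
    std::map<std::string, std::set<std::string>> dns = {
        {"service.example.com", deployed}};

    // Reconciliation pass: failing servers are pulled from DNS within
    // milliseconds, which empties the record when every probe fails.
    for (const auto& server : deployed)
        if (!probe(server))
            dns["service.example.com"].erase(server);

    // Users now see resolution failures and call it a "DNS" outage.
    std::cout << "records left: " << dns["service.example.com"].size() << "\n";
}
```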
- What isn’t viewed positively is when you refuse to accept a decision after it’s been made and your concerns have been heard. People get pissed if you keep relitigating the same points over and over again.