Preferences

You’re right about the trees but wrong (hear me out) about the forest.

Yes, programming isn’t always deterministic, and not just because the leftpad API endpoint is down: by design, you can’t deterministically tell which button the user is going to click. So far so good.

But you program for the things you expect to happen, and handle the rest as errors. If you look at the branching topology of well-written code, the majority of paths lead to an error. Most strings are not valid JSON, but they are handled perfectly well as errors. The paths you didn’t predict can cause bugs, and those bugs can be fixed.
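A minimal sketch of that branching, using Python’s standard `json` module (the function name and the empty-dict fallback are illustrative choices, not anything from the thread): nearly all possible inputs land on the error path, and that path is just as well-defined as the success path.

```python
import json

def parse_config(raw: str) -> dict:
    """Parse a JSON config string. Invalid input is an expected,
    fully specified branch of the program, not an anomaly."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # The vast majority of possible strings land here, and the
        # program's behavior is still completely predictable.
        return {}
    if not isinstance(value, dict):
        # Valid JSON, but not the shape we expect: also an error path.
        return {}
    return value

print(parse_config('{"retries": 3}'))   # {'retries': 3}
print(parse_config('not json at all'))  # {}
```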

Within this system, you have effective local determinism. In practice, this gives you the following guarantee: if the program executed correctly up to point X, the local state is known. You build on top of that known state and continue the chain of bounded determinism, which is so incredibly reliable on modern CPUs that you can run massive financial transactions and be sure they work. Or run a weapons system or a flight control system.
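That guarantee can be illustrated with a toy example (the transaction framing echoes the financial example above; the function itself is hypothetical): the same inputs produce the same state at “point X” on every run, which is exactly what lets you build further logic on top of it.

```python
def apply_transactions(balance: int, transactions: list[int]) -> int:
    # Deterministic: the same starting balance and the same
    # transaction list yield the same final state on every run.
    for t in transactions:
        balance += t
    return balance

# Two runs with identical inputs reach identical local state,
# so anything built on top of that state inherits the guarantee.
a = apply_transactions(100, [25, -10, 5])
b = apply_transactions(100, [25, -10, 5])
assert a == b == 120
```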

So when people point out that LLMs are non-deterministic (or technically unstable, to avoid bike-shedding), they mean that it’s a fundamentally different type of component in an engineering system. It’s not like retrying an HTTP request, because when things go wrong it doesn’t produce “errors”, it produces garbage that looks like gold.
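A hedged sketch of why retrying works for conventional failures (the helper and the `flaky` callable are made up for illustration): failure is signalled out-of-band, as an exception, so the caller can always tell success from garbage. An LLM producing plausible-but-wrong output raises nothing, and a loop like this would happily return it.

```python
import time

def with_retries(fetch, attempts: int = 3, delay: float = 0.1):
    """Retry a flaky operation. This only works because failure is
    announced explicitly (an exception), not smuggled inside a
    plausible-looking return value."""
    last_err = None
    for _ in range(attempts):
        try:
            return fetch()
        except IOError as err:  # detectable, out-of-band failure
            last_err = err
            time.sleep(delay)
    raise last_err

# A call that fails twice, then succeeds: the loop recovers
# precisely because each failure identifies itself as one.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("connection reset")
    return "payload"

print(with_retries(flaky, delay=0))  # payload
```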


loudmax
Programmers aren't deterministic either. If I ask ten programmers to come up with a solution to the same problem, I'm not likely to get ten identical copies. Different programmers, even competent experienced programmers, might have different priorities that aren't in the requirements. For example, trading off program maintainability or portability over performance.

The same could apply to LLMs, or even different runs from the same LLMs.

klabb3 OP
> Programmers aren't deterministic either.

No, but programs are. An LLM can be a programmer too, but it’s not a program in the way we want and expect programs to behave: deterministically. Even if a programmer could perform a TLS handshake manually very fast, and ignoring the immense waste of energy, the program is a much better engineering component, simply because it is deterministic and does the same thing every time. If there’s a bug, it can be fixed, and then the bug will not reappear.

> If I ask ten programmers to come up with a solution to the same problem, I'm not likely to get ten identical copies.

Right, but you only want one copy. If you need different clients speaking with each other you need to define a protocol and run conformance tests, which is a lot of work. It’s certainly doable, but you don’t want a different program every time you run it.

I really didn’t expect arguing for reproducibility in engineering to be controversial. The primary way we fix bugs is by literally asking for steps to reproduce. This is not possible when you have a chaos agent in the middle, no matter how good it is. The only reasonable conclusion is to treat AI systems as an entirely different kind of component and isolate them so that you keep the boring predictability of mechanistic programs. Basically, separating the engineering from the alchemy.

skydhash
Not really. We have many implementations of web servers and FTP clients, but they all follow the same protocol, so you can pair any two things that speak the same protocol and get a consistent system. If you give ten programmers a spec, you get ten implementations that follow the spec. With LLMs, you get random things.
