Preferences

Programmers aren't deterministic either. If I ask ten programmers to come up with a solution to the same problem, I'm not likely to get ten identical copies. Different programmers, even competent, experienced ones, might have different priorities that aren't in the requirements. For example, favoring maintainability or portability over performance.

The same could apply to different LLMs, or even to different runs of the same LLM.


klabb3
> Programmers aren't deterministic either.

No, but programs are. An LLM can be a programmer too, but it’s not a program in the sense we want and expect: something that behaves deterministically. Even if a programmer could perform a TLS handshake manually very fast, ignoring the immense waste of energy, the program is a much better engineering component, simply because it is deterministic and does the same thing every time. If there’s a bug, it can be fixed, and then the bug will not reappear.

> If I ask ten programmers to come up with a solution to the same problem, I'm not likely to get ten identical copies.

Right, but you only want one copy. If you need different clients speaking with each other you need to define a protocol and run conformance tests, which is a lot of work. It’s certainly doable, but you don’t want a different program every time you run it.

I really didn’t expect arguing for reproducibility in engineering to be controversial. The primary way we fix bugs is by literally asking for steps to reproduce. This is not possible when you have a chaos agent in the middle, no matter how good. The only reasonable conclusion is to treat AI systems as entirely different components and isolate them so that you can keep the boring predictability of mechanistic programs. Basically separating the engineering from the alchemy.
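To make the isolation idea concrete, here is a minimal sketch of one way it could look. Everything in it is an assumption for illustration: `parse_action`, `propose`, and the `{"action": "retry", "delay_s": ...}` message shape are hypothetical names, and the `generate` callable stands in for whatever non-deterministic model call you have. The point is only that the nondeterminism stays on one side of a strict, deterministic check.

```python
import json
from typing import Callable, Optional


def parse_action(raw: str) -> Optional[dict]:
    """Deterministic boundary: the same input always gets the same verdict.

    Accepts only a JSON object of the (hypothetical) form
    {"action": "retry", "delay_s": <non-negative int>}.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or data.get("action") != "retry":
        return None
    delay = data.get("delay_s")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(delay, bool) or not isinstance(delay, int) or delay < 0:
        return None
    # Return a normalized copy so extra keys never leak downstream.
    return {"action": "retry", "delay_s": delay}


def propose(prompt: str,
            generate: Callable[[str], str],
            attempts: int = 3) -> Optional[dict]:
    """Ask the non-deterministic side for output; keep only what validates.

    `generate` is a stand-in for a model call (an assumption, not a real API).
    """
    for _ in range(attempts):
        result = parse_action(generate(prompt))
        if result is not None:
            return result
    return None
```

Anything past `parse_action` only ever sees validated, normalized data, so a bug report against that code can still come with steps to reproduce: the raw string that was accepted is the whole input.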

skydhash
Not really. We have many implementations of web servers and FTP clients, but they all follow the same protocol, so you can pair any two things that speak the same protocol and get a consistent system. If you give ten programmers a spec, you get ten implementations that follow the spec. With LLMs, you get random things.
