Preferences

  > Something that really frustrates me about interacting with LLMs is that they are optimized such that errors are as silent as possible.

It is just bad design. You want errors to be as loud as possible so they can be traced and resolved. LLMs, on the other hand, are optimized for human preference (or some proxy of it). Humans prefer accuracy, but it would be naive to ignore all the other things that satisfy this objective. Specifically, humans prefer answers they don't know are wrong over answers they know are wrong.

This doesn't make LLMs useless, but it should strongly inform how we use them. Frankly, you cannot trust the outputs, so you have to verify. I think this is where LLM users (and non-users) diverge sharply: those who blindly trust and those who don't (the extreme case being non-users). If you have to constantly verify AND you recognize that verification is extra hard (because errors are optimized to be invisible to you), it can create extra work, not less.

It really is two camps and I think it says a lot:

  - "Blindly" trust
  - "Trust" but verify

There's a wide range of opinions within these two camps, but I think it comes down to some threshold of default trust or default suspicion.
