Preferences

I don't really find this a helpful line to traverse. By this line of inquiry, most things in software are psychological.

Whether something is a bug or feature.

Whether the right thing was built.

Whether the thing is behaving correctly in general.

Whether, at a given moment, it's better for the thing to occasionally work across a whole range of cases or to work perfectly for a small subset.

Whether fast results are more important than absolutely correct results for a given context.

Yes, all of the above are also related to each other.

The most we have for LLMs is tallying up each user's experience with an LLM over a period of time across a wide range of "compelling" use cases (the pairings of their prompts and results are empirical though, right?).

This should be no surprise, as humans often can't agree on an end-all-be-all intelligence test for humans either.


No. I'm saying that if you take the exact same LLM on the exact same hardware and serve it to the exact same humans, a sizeable number of them will still complain about "model nerfs".

Why? Because humans suck.

