"I am so sorry and heartbroken about having suggested that to play a sound you should use the, as you now inform me, non existing command and parameter `oboe --weird-format mysound.snd`, I'll check my information more thoroughly next time and make sure it will not happen again"...
Are you ok?
When a web site says "Sorry, page not found" do you start punching your monitor?
When the delivery guy leaves a note saying "Sorry we missed you" do you go to the depot to beat up the employees?
I think you are on the right track to understanding what they meant.
The use of 'sorry' is not generally a problem because it is normally framed within expected behaviour, and it can be taken as an adequately true representation, or at least not a blatantly false one. But you could imagine scenarios in which the term is misused for inappropriate formality or manipulation, and yes, disrespect is "eliciting violence". You normally find a way through the situation that avoids violence - that is another story.
In "sorry, page not found" 'sorry' is the descriptor for a state (i.e. "not the better case"); in "sorry we missed you" it is just courtesy - and it does not generally cover fault or negligence. But look: there are regions that adopt "your call is important to us", and regions that tend to avoid it - because the suspicion that it is inappropriate (false) can be strong.
The outputs of the LLMs I have used frequently pass that threshold, and possibly so does their structural engineering. If you had in front of you a worker, in flesh and bone, whose output was plausible fiction ("I imagined a command `oboe` because it sounded good in the story") rather than an answer to your question, but delivered under the veneer of answering questions (which implies outputting relevant, truth-based assessments of the world), that would be a right "sore" for "sorry". The anthropomorphic features of LLMs compromise the quality of their outputs in terms of form, especially in solution-finding attempts that become loops of "This is the solution" // "Are you sure?" // "Definitely" // "It is not" // "Oh, I'm so sorry! It will not happen again. This is the solution" (loop...).
Edit: it seems you may have also asked for clarification of the contextual expression «clean societies». Those are societies that are cybernetically healthy, in which feedback mechanisms work properly to fine-tune general mechanisms - with particular regard to fixing individual, then collective, behaviour.
Ctrl+F for "Central nervous system":
https://en.wikipedia.org/wiki/List_of_human_cell_types
Choose any five wikilinks. Skim their distinct functions and pathologies:
https://en.wikipedia.org/wiki/List_of_regions_in_the_human_b...
https://en.wikipedia.org/wiki/Large-scale_brain_network
Evolution's many things, but maybe most of all lazy. Human intelligence has dozens of distinct neuron types and at least hundreds of differentiated regions/neural subnetworks because we need all those parts in order to be both sentient and sapient. If you lesion parts of the human brain, you lose the associated functions, and eventually end up with what we'd call mental/neurological illnesses. Delusions, obsessions, solipsism, amorality, shakes, self-contradiction, aggression, manipulation, etc.
LLMs don't have any of those parts at all. They only have pattern-matching. They can only lie, because they don't have the sensory, object permanence, and memory faculties to conceive of an immutable external "truth"/reality. They can only be hypocritical, because they don't have the internal identity and introspective abilities to be able to have consistent values. They cannot apologize in substance, because they have neither the theory of mind and self-awareness to understand what they did wrong, the social motivation to care, nor the neuroplasticity to change and be better. They can only ever be manipulative, because they don't have emotions to express honestly. And I think it speaks to a not-atypical Silicon Valley arrogance to pretend that they can replicate "intelligence", without apparently ever considering a high-school-level philosophy or psychology course to understand what actually lets human intelligence tick.
At most they're mechanical psychopaths [1]. They might have some uses, but never outweighing the dangers for anything serious. Some of the individuals who think this technology is anything remotely close to "intelligent" have probably genuinely fallen for it. The rest, I suppose, see nothing wrong because they've created a tool in their own image…
[1]: I use this term loosely. "Psychopathy" is not a diagnosis in the DSM-5, but psychopathic traits are associated with multiple disorders that share similar characteristics.
https://github.com/mukhal/intrinsic-source-citation
This is not something that can be added with a LoRA fine-tune after the pretraining step.
What we need is a human-curated benchmark for different types of source-aware training, to allow competition, and an extra column in the most popular leaderboards that counts toward the Average column, to incentivize AI companies to train in a source-aware way. Of course, this would instantly invalidate the black-box veil that LLM companies love to hide behind so as not to credit original authors and content creators; they prefer regulators to believe such a thing cannot be done.
In the meantime, such regulators are not thinking creatively and are clearly just looking for ways to tax AI companies, in turn hiding behind copyright complications as an excuse to tax the flow of money wherever they smell it.
Source-aware training also has the potential to decentralize search!
But I find the anthropomorphization and "AGI" narrative really creepy and grifty. Such a waste that that's the direction it's going.
And I wouldn't say lazy at _all_. I would say efficient. Even evolutionary features that look "bad" on the surface can still make sense if you look at the wider system they're a part of. If our tailbone caused us problems, then we'd evolve it away, but instead we have a vestigial part that remains because there are no forces driving its removal.
But the issue is with calling what are laboratory partials finished products. "Oh look, they invented a puppet" // "Oh, nice!" // "It's alive..."
In terms of people thinking LLMs are smarter than they really are, well...that's just people. Who hate each other for skin colour and sexuality, who believe that throwing salt over your shoulder wards away bad luck; we're still biological at the end of the day, we're not machines. Yet.
That is definitely not true.
In the context of the comment chain I replied to, and the behaviour in question, any statement by an LLM pretending to be capable of self-awareness/metacognition is also necessarily a lie. "I should be more careful", "I sincerely apologize", "I realize", "Thank you for bringing this to my attention", etc.
The problem is the anthropomorphization. Since it pretends to be like a person, if you ascribe intention to it then I think it is most accurately described as always lying. If you don't ascribe intention to it, then it's just a messy PRNG that aligns with reality an impressive amount of the time, and words like "lying" have no meaning. But again, it's presented and marketed as if it's a trustworthy sapient intelligence.
Some parts seemingly stopped at "output something plausible", but it does not seem theoretically impossible to direct the output towards "adhere to the truth", if a world model is there.
We would still need to implement the "reason on your world model and refine it" part, for the purpose of AGI - meanwhile, fixing the "impersonation" fumble ("probabilistic calculus says your interlocutor should offer stochastic condolences") would be a decent move. After a while with present chatbots it seems clear that "this is writing a fiction, not answering questions".
Feels like they were trained with a gun to their heads. If I don't tell it that it doesn't have to answer, it'll generate nonsense in a confident voice.
It turns out that this process makes it useful at producing mostly sensible predictions (generate output) for text that is not present in the training set (generalization).
That works because there are a lot of patterns and redundancy in the stuff that we feed to the models and in the stuff that we ask the models, so there is a good chance that interpolating between words, and between the higher-level semantic relationships among sentences, will make sense quite often.
However, that doesn't work all the time. And when it doesn't, current models have no way to tell that they "don't know".
The whole point was to let them generalize beyond the training set and interpolate in order to make decent guesses.
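To make the "no way to say I don't know" point concrete, here is a toy sketch. It is a word-level bigram counter, not how an LLM works internally, and the corpus and the names `train_bigrams` and `predict_next` are made up for illustration. The thing to notice is that the predictor has no abstain branch: a context it has never seen gets an answer in exactly the same form as one it has seen many times.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows which in a whitespace-tokenized text."""
    follows = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def predict_next(follows, word):
    """Always return some word; there is deliberately no "I don't know" path."""
    seen = follows.get(word)
    if seen:
        # Context was in the training text: pick its most frequent follower.
        return seen.most_common(1)[0][0]
    # Unseen context: fall back to the most frequent follower overall.
    totals = Counter()
    for counts in follows.values():
        totals.update(counts)
    return totals.most_common(1)[0][0]


follows = train_bigrams("the cat sat the cat ran the cat hid")
seen_guess = predict_next(follows, "the")     # context seen three times
unseen_guess = predict_next(follows, "oboe")  # context never seen
```

Both calls return a plain word with no indication of how much evidence was behind it, which is the failure mode described above in miniature.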
There is a lot of research in making models actually reason.
That being said, I'm aware that the model doesn't reason in the classical sense. Yet, as I mentioned, it does give me less confabulation when I tell it it's ok not to answer.
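A minimal sketch of what "telling it it's ok not to answer" can look like when the prompt is assembled in code. The wording of the instruction and the `build_prompt` helper are hypothetical, not taken from any model's documentation, and whether a given model honours the instruction is an empirical question.

```python
def build_prompt(question, allow_abstain=True):
    """Assemble a plain-text prompt, optionally giving the model an explicit way out."""
    lines = ["Answer the question below."]
    if allow_abstain:
        # Hypothetical wording; the point is only that abstaining is offered explicitly.
        lines.append("If you are not sure, reply exactly: I don't know.")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


prompt = build_prompt("Which command plays mysound.snd?")
```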
I will note that when I've tried the same kind of prompts with Phi 3 instruct, it's way worse than Gemma. Though I'm not sure if that's just because of weak instruction tuning or because of the underlying training as well, as it frequently ignores parts of my instructions.
For example you can confabulate "facts" or you can make logical or coherence mistakes.
Current LLMs are encouraged to be creative and effectively "make up facts".
That's what created the first wow factor. The models are able to write Star Trek fan fiction in the style of Shakespeare. They are able to take a poorly written email and make it "sound" better (for some definition of better, e.g. more formal, less formal etc).
But then human psychology kicked in: as soon as you have something that can talk like a human and that some marketing folks label as "AI", you start expecting it to be useful for other tasks as well, some of which require factual knowledge.
Now, it's in theory possible to have a system that you can converse with which can _also_ search and verify knowledge. My point is that this is not the place where LLMs start from. You have to add stuff on top of them (and people are actively researching that).
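As one small illustration of "adding stuff on top", here is a sketch of a verification step for the made-up `oboe` command from the top of the thread. It assumes the model's suggestion arrives as a plain command-line string; `check_suggested_command` is a hypothetical helper, and the only real API used is `shutil.which` from the Python standard library. It checks that the program exists on PATH and nothing more; flags and arguments are left unverified.

```python
import shutil


def check_suggested_command(command_line):
    """Return (ok, message) for a shell command line suggested by a model."""
    parts = command_line.split()
    if not parts:
        return False, "empty suggestion"
    program = parts[0]
    # shutil.which returns the resolved path, or None if nothing matches on PATH.
    path = shutil.which(program)
    if path is None:
        return False, f"'{program}' was not found on PATH; do not present it as a solution"
    return True, f"'{program}' resolves to {path} (flags are still unverified)"


ok, message = check_suggested_command("oboe --weird-format mysound.snd")
```

A check like this sits outside the model: the model still produces the same text, and the surrounding system decides whether to show it.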
I hate this kind of thing so much.