No surprise that these products are all dreamt up by powerful tech CEOs who are used to all of their human interactions being with servile people-pleasers. I bet each and every one of them is subtly or overtly shaped by feedback from executives about how it should respond in conversation.
You may be on to something there: the guys and gals who build this stuff may very well be imbuing these products with the kind of attitude that they like to see in their subordinates. They're cosplaying the 'eager to please' element to the point of massive irritation and have left out the one feature that could redeem such behavior, which is competence.
Or that the individual developers see in themselves. Every team I've worked with in my career had one or two of these guys: when the Director or VP came into town, they'd instantly launch into brown-nose mode. One guy was overt about it and would say things like "So-and-so is visiting the office tomorrow--time to do some petting!" Both the executive and the subordinate have normalized the "royal treatment" on the giving and receiving end.
LLMs have an entire wrapper around them tuned to be as engaging as possible. Most people's experience of LLMs is shaped by a design heavily influenced by social media and the engagement economy.
Putin didn't come to think Russia could take Ukraine in 3 days, with literal celebration by the populace, because he only works with honest folks, for example.
Rich people get disconnected from reality because people who insist on speaking truth and reality around them tend to stop getting invited to the influence peddling sessions.
1: https://en.wikipedia.org/wiki/The_Emperor%27s_New_Clothes
Which makes it more interesting. Apparently reddit was a particularly hefty source for most LLMs; your average reddit conversation is absolutely nothing like this.
Separate observation: that kind of semi-slimy obsequious behaviour annoys me. Significantly so. It raises my hackles; I get the feeling I'm being sold something on the sly. Even if I know the content in between all the sycophancy is objectively decent, my instant emotional response is negative and I have to use my rational self to dismiss that part of the ego.
But I notice plenty of people around me that respond positively to it. Some will even flat out ignore any advice if it is not couched in multiple layers of obsequious deference.
That raises a question for me: is it innate? Are all people placed on a presumably bell-curve-shaped chart of 'emotional response to such things', with the bell curve quite smeared out?
Because if so, that would explain why some folks have turned into absolute zealots for the AI thing, on both sides of it. If you respond negatively to it, any serious attempt to play with it should leave you feeling like it sucks to high heaven. And if you respond positively to it - the reverse.
Idle musings.
But I'm also showing off my ignorance with how these machines turn text into tokens in practice.
Thanks!
As flawed as they are currently, I remain astounded that people think they will never improve and that people don't want a plastic pal who's fun to be with(tm).
I find them frustrating personally, but then I ask them deep technical questions on obscure subjects and I get science fiction in return.
And once this garbage is in your context, it's polluting everything that comes after. If they don't know, I need them to shut up. But they don't know when they don't know. They don't know shit.
"To add two numbers I must first simulate the universe." types that created a bespoke DSL for every problem. Software engineering is a field full of educated idiots.
Programmers really need to stop patting themselves on the back. Same old biology with the same old faults. Programmers are subjected to the same old physics as everyone else.
The real human position would be to be an ass-kisser who hangs on every word you say, asks flattering questions to keep you talking, and takes copious notes to figure out how they can please you. LLMs aren't taking notes correctly yet, and they don't use their notes to figure out what they should be asking next. They're just constantly talking.
Only in the same way that all humans behave the same.
You can prompt an LLM to talk to you however you want; it doesn't have to be nice to you.
Again, this isn't really true with some recent models. Some have the opposite problem.
Sounds like you don't know how RLHF works. Everything you describe is post-training. Base models can't even chat, they have to be trained to even do basic conversational turn taking.
Having just read a load of Quora answers like this, which did not cover the thing I was looking for: that is how humans on the internet behave, and how people have to write books, blog posts, articles, and documentation. Without the "dance" to choose a path through a topic on the fly, the author has to take on the burden of providing all relevant context, choosing a path, explaining why, and guessing at any objections and questions and including those as well.
It's why "this could have been an email" is a bad shout. The summary could have been an email, but the bit which decided on that being the summary would be pages of guessing all the things which what might have been in the call and which ones to include or exclude.
The internet really used to be efficient, and I could always find exactly what I wanted with an imprecise Google search ~15 years ago.
This is not how humans converse in human social settings. The medium is the message, as they say.
Even here, I'm responding to you on a thread that I haven't been in on previously.
There's also a lot more material out there in the format of Stack Exchange questions and answers, Quora posts, blog posts and such than there is for consistent back and forth interplay between two people.
IRC chat logs might have been better...ish.
The cadence of discussion is unique to the medium in which the discussion happens. What's more, the prompt may require further investigation and elaboration prior to a more complete response, while other times it may be something that requires storytelling and making it up as it goes.
I had the same issue with Kagi, where I'd follow the citation and it would say the opposite of the summary.
A human can make sense of search results with a little time and effort, but current AI models don't seem to be able to.
"I ask a short and vague question and you response with a scrollbar-full of information based on some invalid assumptions" is not, by any reasonable definition, a "chat".
This also makes me curious: to what degree does this phenomenon manifest when interacting with LLMs in languages other than English? Which languages have less tendency toward sycophantic confidence? More? Or does it exist at a layer abstracted from the particular language?
Doesn't matter if you tell it "that's not correct and neither is ____ so don't try that instead," it likes those two answers and it's going to keep using them.
The only reliable way to recover is to edit your previous question to include the clarification, and let it regenerate the answer.
And less than two weeks in they removed it and replaced it with some sort of "plain and clear" personality which is human-like. And my frustrations ramped up again.
That brief experiment taught me two things: 1. I need to ensure that any robots/LLMs/mech-turks in my life act at least as cold and rational as Data from Star Trek. 2. I should be running my own LLM locally to not be at the whims of $MEGACORP.
I approve of this, but in your place I'd wait for hardware to become cheaper when the bubble blows over. I have an i9-10900, bought an M.2 SSD and 64GB of RAM for it in July, and get useful results with Qwen3-30B-A3B (some 4-bit quant from unsloth running on llama.cpp).
It's much slower than an online service (~5-10 t/s), and lower quality, but it still offers me value for my use cases (many small prototypes and tests).
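For the curious, here's a minimal sketch of what that setup looks like through the llama-cpp-python bindings instead of the raw llama.cpp CLI; the GGUF filename and the tuning knobs are just placeholders for whichever unsloth quant you actually download and whatever hardware you have:

    from llama_cpp import Llama

    # Placeholder path: point this at the 4-bit unsloth GGUF you downloaded.
    llm = Llama(
        model_path="./Qwen3-30B-A3B-Q4_K_M.gguf",
        n_ctx=8192,       # context window; raise it if you have RAM to spare
        n_threads=10,     # roughly match your physical core count
        n_gpu_layers=0,   # CPU-only on a box like an i9-10900 with no big GPU
    )

    resp = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Sketch a test plan for a small prototype."}],
        max_tokens=256,
    )
    print(resp["choices"][0]["message"]["content"])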
In the meantime, check out LLM service prices on https://artificialanalysis.ai/ - open-source ones are cheap! Lower on the homepage there's a Cost Efficiency section with a Cost vs Intelligence chart.
https://dev.to/composiodev/qwen-3-vs-deep-seek-r1-evaluation...
If it runs, it looks like I can get a bit more quality. Thanks for the suggestion.
I recently introduced a non-technical person to Claude Code, and this non-human behavior was a big sticking point. They tried to talk to Claude as they would to a human, presenting it one piece of information at a time. With humans this is generally beneficial, and they will either nod for you to continue or ask clarifying questions. With Claude this does not work well; you have to infodump as much as possible in each message.
So even from the perspective of "how do we make this automaton into the best tool", a more human-like conversation flow might be beneficial. And that doesn't seem beyond the technological capabilities at all; it's just not what we encourage in today's RLHF.
A lazy example:
"This goal of this project is to do x. Let's prepare a .md file where we spec out the task. Ask me a bunch of questions, one at a time, to help define the task"
Or you could just ask it to be more conversational, instead of just asking questions. It will do that.
I'm prompting Gemini, and I write:
I have the following code, can you help me analyze it? <press return>
<expect to paste the code into my chat window>
but Gemini is already generating output, usually saying "I'm waiting for you to enter the code"
Maybe we want a smaller model tuned for back and forth to help clarify the "planning doc" email. Makes sense that having it all in a single chat-like interface would create confusion and misbehavior.
I'd be surprised if you can't already make current models behave like that with an appropriate system prompt.
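Something along these lines, for example - a rough sketch using the OpenAI Python SDK, where the exact system-prompt wording and the model name are just placeholders:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # A system prompt nudging the model toward back-and-forth clarification
    # instead of an immediate infodump.
    system = (
        "You are a collaborator, not an answer machine. "
        "Ask at most one clarifying question per turn, wait for the reply, "
        "and only give a full answer once you have enough context. "
        "Keep each turn under three sentences."
    )

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat-tuned model should do
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": "I want to refactor my billing code."},
        ],
    )
    print(resp.choices[0].message.content)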
A strictly machinelike tool doesn't begin answers by saying "Great question!"
The same goes for "rules" - you train an LLM with trillions of tokens and try to regulate its behavior with thousands. If you think of a person in high school, grading and feedback is a much higher percentage of the training.
What bothers me the most is the seemingly unshakable tendency of many people to anthropomorphise this class of software tool as though it is in any way capable of being human.
What is it going to take? Actual, significant loss of life in a medical (or worse, military) context?
My calculator is a great tool but it is not a mathematician. Not by a long shot.
I think it's important to be skeptical and push back against a lot of the ridiculous mass-adoption of LLMs, but not if you can't actually make a well-reasoned point. I don't think you realize the damage you do when the people gunning for mass proliferation of LLMs in places they don't belong can only find examples of incoherent critique.
An untrained and unspecialised human can be trained quickly and reliably for the cost of meals and lodging and will very likely actually try to do the right thing because of personal accountability.
Delegating responsibility to badly-designed or outright unfit-for-purpose systems because of incoherent confidence is plainly a bad plan.
As for the other nuances of your post, I will assume the best intention.
On the project I did work on, reviewers were not allowed to e.g. answer that they didn't know - they had to provide an answer to every prompt provided. And so when auditing responses, a lot of difficult questions had "confidently wrong" answers because the reviewer tried and failed, or all kinds of evasive workarounds because they knew they couldn't answer.
Presumably these providers will eventually understand (hopefully they already have - this was a year ago) that they also need to train the models to recognize when the correct answer is "I don't know", or "I'm not sure. I think maybe X, but ..."
And I literally saw the effect of this first hand, in seeing how the project I worked on was actively part of training this behaviour into a major model.
As for your assertion they don't learn the more complex dynamics, that was trite and not true already several years ago.
In a way it's benchmaxxing because people like subservient beings that help them and praise them. People want a friend, but they don't want any of that annoying friction that comes with having to deal with another person.
The only things I care about are whether the answer helps me out and how much I paid for it, whether it took the model a million tokens or one to get to it.
Maximizing the utility of your product for users is usually the winning strategy.
ChatGPT deep research does this, but it’s weird and forced because it asks one series of questions and then it’s off to the races, spending half an hour or more building a report. It’s frustrating if you don’t know what to expect, and my wife got really mad the first time she wasted a deep research request asking it “can you answer multiple series of questions?” or some other functionality-clarifying question.
I’ve found Cursor’s plan mode extremely useful, similar to having a conversation with a junior or offshore team member who is eager to get to work but not TOO eager. These tools are extremely useful; we just need to get the guard rails and user experience correct.
Another thing the vendors are selecting for is safety / PR risk. If an LLM answers a hobby chemistry question in a matter-of-fact way, that's a disastrous PR headline in the making. If they open with several paragraphs of disclaimers or just refuse to answer, that's a win.
b) If it is, in fact, just one setting away, then I would say it's operating fairly similarly?
For example, there have been many times when they take it too literally instead of looking at the totality of the context and what was written. I'm not an LLM, so I don't have a perfect grasp of every vocab term in every domain, and it feels especially pandering when they repeat back the wrong word but put it in quotes or bold instead of simply asking if I meant something else.
They did at the beginning. It used to be that if you wanted a full answer with an intro, bullet points, lists of pros/cons, etc., you had to explicitly ask for it in the prompt. The answers were also a lot more influenced by the tone of the prompt instead of being forced into answering with a specific format like it does right now.
Maybe it feels a bit sad that you have to follow what the LLM wants, but that's just how any tool works really.
In fact, sometimes I screenshot them and use Mac's new built-in OCR to copy them, because Manus gives me three options but they disappear if I click one, and sometimes I really like 2 or even all 3.
Setting aside the philosophical questions around "think" and "reason"... it can't.
In my mind, as I write this, I think through various possibilities and ideas that never reach the keyboard, yet stay within my awareness.
For an LLM, that awareness and thinking through can only be done via its context window. It has to produce text that maintains what it thought about in order for that past to be something that it has moving forward.
There are aspects of a prompt that can (in some interfaces) hide this internal thought process. For example, ChatGPT has the "internal thinking" view which can be shown - https://chatgpt.com/share/69278cef-8fc0-8011-8498-18ec077ede... - if you expand the first "thought for 32 seconds" bit it starts out with:
I'm thinking the physics of gravity assists should be stable enough for me to skip browsing since it's not time-sensitive. However, the instructions say I must browse when in doubt. I’m not sure if I’m in doubt here, but since I can still provide an answer without needing updates, I’ll skip it.
(aside: that still makes me chuckle - in a question about gravity assists around Jupiter, it notes that it's not time-sensitive... and the passage "I’m not sure if I’m in doubt here" is amusing)
However, this is in the ChatGPT interface. If I'm using an interface that doesn't allow internal self-prompts / thoughts to be collapsed, then such an interface would often be displaying code as part of its working through the problem.
You'll also note a bit of the system prompt leaking in there - "the instructions say I must browse when in doubt". For an interface where code is the expected product, then there could be system prompts that also get in there that try to always produce code.
LLMs neither think nor reason at all.
Also, the conversational behavior we see is just the model mimicking example conversations we had it learn from: when we say “System: you are a helpful assistant. User: let’s talk. Assistant:” it will complete the text in a way that mimics a conversation.
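If it helps make that concrete, here is a toy sketch using the plain GPT-2 base model via transformers (purely for illustration): the "chat" is literally just one string the model keeps extending, and an untuned base model will extend it badly, which is exactly why the training described below matters.

    from transformers import pipeline

    # A small base model that has had no chat post-training at all.
    generator = pipeline("text-generation", model="gpt2")

    # The "conversation" is just a string the model continues token by token.
    prompt = (
        "System: you are a helpful assistant.\n"
        "User: let's talk.\n"
        "Assistant:"
    )

    print(generator(prompt, max_new_tokens=40)[0]["generated_text"])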
Yeah, we improved over that using reinforcement learning to steer the text generation into paths that lead to problem solving and more “agentic” traces (“I need to open this file the user talked about to read it and then I should run bash grep over it to find the function the user cited”), but that’s just a clever way we found to let the model itself discover which text generation paths we like the most (or are more useful to us).
So to comment on your discomfort, we (humans) trained the model to spill out answers (there are thousands of human beings right now writing nicely thought-out and formatted answers to common questions so that we can train the models on that).
If we try to train the models to mimic long dances into shared meaning we will probably decrease their utility. And we wouldn’t be able to do that anyway, because we would need customized text traces for each individual instead of question-answer pairs.
Downvoters: I simplified things a lot here, in the name of understanding, so bear with me.
You could say the same thing about humans.
Humans existed for 10s to 100s of thousands of years without text. or even words for that matter.
Bruh
Just yesterday I observed myself acting on an external stimulus without any internal words (this happens continuously, but it is hard to notice because we usually don't pay attention to how we do things): I sat in the waiting area of a cinema. A woman walked by and dropped her scarf without noticing. Automatically, without thinking, I raised my arm and pointer finger towards her, and when I had her attention pointed behind her. I did not have time to think even a single word while that happened.
Most of what we do does not involve any words or even just "symbols", not even internally. Instead, it is a neural signal from sensors into the brain, doing some loops, then directly to muscle activation. Without going through the add-on complexity of language, or even "symbols".
Our word generator is not the core of our being, it is an add-on. When we generate words it's also very far from being a direct representation of internal state. Instead, we have to meander and iterate to come up with appropriate words for an internal state we are not even quite aware of. That's why artists came up with all kinds of experiments to better represent our internal state, because people always knew the words we produce don't represent it very well.
That is also how people always get into arguments about definitions. Because the words are secondary, and the further you get from the center of established meaning for some word, the more the differences show between various people. (The best option is to stop insisting that words are the center of the universe, even just the human universe, and/or to choose words that have the subject of discussion more firmly in the center of their established use.)
We are text generators in some areas, I don't doubt that. Just a few months ago I listened to some guy speaking to a small rally. I am certain that not a single sentence he said was of his own making, he was just using things he had read and parroted them (as a former East German, I know enough Marx/Engels/Lenin to recognize it). I don't want to single that person out, we all have those moments, when we speak about things we don't have any experiences with. We read text, and when prompted we regurgitate a version of it. In those moments we are probably closest to LLM output. When prompted, we cannot fall back on generating fresh text from our own actual experience, instead we keep using text we heard or read, with only very superficial understanding, and as soon as an actual expert shows up we become defensive and try to change the context frame.
> The human world model
Bruh this concept is insane
It's easy to skip and skim content you don't care about. It's hard to prod and prod to get it to say something you do care about if the machine is trained to be very concise.
Complaining the AI can't read your mind is exceptionally high praise for the AI, frankly.
You obviously never wasted countless hours trying to talk to other people on online dating apps.
It's not "masquerading as a human". The majority of humans are functional illiterates who only understand the world through the elementary principles of their local culture.
It's the minority of the human species, those who engage in what amounts to little more than arguing semantics, that need the reality check. Unless one is involved in work that directly impacts public safety (defined as harm to biology), the demand to apply one concept or another is arbitrary preference.
Healthcare, infrastructure, and essential biological support services are all most humans care about. Everything else the majority see as academic wank.
LLMs don't do this. Instead, every question is immediately responded to with extreme confidence with a paragraph or more of text. I know you can minimize this by configuring the settings on your account, but to me it just highlights how it's not operating in a way remotely similar to the human-human one I mentioned above. I constantly find myself saying, "No, I meant [concept] in this way, not that way," and then getting annoyed at the robot because it's masquerading as a human.