It got me thinking that, in general, people with options will probably sort themselves out of those situations and into organizations of like-minded people who use AI as a tool to multiply their impact (and I flatter myself to think that it will be high-ability people who have those options), leaving those more reliant on AI to operate at the limit of what they get from OpenAI et al.
I wouldn't want that job, but I also don't currently know how to produce demonstrable evidence that they're incompetent.
I have roughly the same opinion about UX folks, but they don't jam up my day-to-day nearly as much as PMs do.
Otherwise, it's a matter of time until the house of cards falls down and the company stagnates (sadly, the timescale is less house of cards and more coal-mine fire).
Has anyone else ever dealt with a somewhat charismatic know-it-all who knows just enough to give authoritative answers? LLM output often reminds me of such people.
At first glance, it’s easy to compare them to a charismatic “know-it-all” who sounds confident while being only half-right. After all, both can produce fluent, authoritative-sounding answers that sometimes miss the mark. But here’s where the comparison falls short — and where LLMs really shine:
(...ok ok, I can't go on.)
What's next—the interrobang‽
It is good they are being unmasked. You must avoid those people and warn your children about them. They are not safe to be around.
Dangerous actually, the effect it had on children. Of course they loved it because it had a happy ending, but at what price?
[1] https://dictionary.cambridge.org/us/dictionary/english/bulls...
The bullshitter doesn't care whether what he says is correct or not, as long as it's convincing.
The use of 'pretty sure' disqualifies you. I appreciate your humility.
Bad human code is at least more understandable to me in what it was trying to do. There's a goal you can figure out, and then you can fix it. It generally operates within the context of the larger code to some extent.
Bad LLM code can be broken from start to finish in ways that make zero sense. It's even worse when it reinvents the wheel and replaces massive amounts of code. Humans aren't likely to just make up a function or method that doesn't exist and deploy it. That's not the best example, since you'd likely catch that fast, but it's the kind of screw-up that indicates the entire chunk of LLM code you're examining may in fact be fundamentally flawed beyond normal experience. In some cases you almost need to re-learn the entire codebase to truly realize "oh, this is THAT bad and none of this code is of any value".
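To make that concrete, here's a made-up illustration (not from any real codebase): Python's standard json module has loads() and load(), but no parse(). The first function below is the kind of thing a human almost never writes and ships, but an LLM will happily produce and present as done:

    import json

    # Hallucinated version: json.parse() does not exist in Python's standard library.
    # Calling this raises AttributeError: module 'json' has no attribute 'parse'.
    def read_config_hallucinated(raw: str) -> dict:
        return json.parse(raw)

    # The version that actually runs; the real function is json.loads().
    def read_config(raw: str) -> dict:
        return json.loads(raw)

    print(read_config('{"retries": 3}'))  # {'retries': 3}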
I'm working with a fairly arcane technical spec that I don't understand very well, so I ask Claude to evaluate one of our internal proposals against the spec for conformance. It highlights a bunch of mistakes in our internal proposal.
I send those off to someone in our company who's supposed to be an authority on the arcane spec, with the warning that they were LLM-generated and so might be nonsense.
He feeds my message to his LLM and asks it to evaluate the criticisms. He then messages me back with the response from his LLM and asks me what I think.
We are functionally administrative assistants for our AIs.
If this is the future of software development, I don't like it.
My reaction is usually, "Oh, we're doing this? Fine." I'll even prompt my LLM with something like, "Make it sound as corporate and AI-generated as possible." Or, if I'm feeling especially petty, "Write this like you're trying to win the 2025 award for Most Corporate Nonsense, and you're a committee at a Fortune 500 company competing to generate the most boilerplate possible." It's petty, sure, but there's something oddly cathartic about responding to slop with slop.
Because it doesn't actually believe anything at all, because these things don't think or feel or know anything. They just string together statistically likely language tokens one after another with a bit of random "magic" thrown in the mix to simulate "creativity".
Did that end up working for you?
I had this same experience recently, and it tanked my expectations for that dev. It just felt so wrong.
I made it abundantly clear that it was substandard work with comically wrong content and phrasings, hoping that he would understand that I trust _him_ to do the work, but I still later saw signs of it all over again.
I wish there were something other than "move on". I'm just lost, and scarred.
Making the implicit explicit is a task for your documentation team, who should also be helping prep inputs for your LLMs. If foo and Frob are the same, have the common decency to tell the LLM...
Kinda funny example: The other day I asked Grok what a "grandparent" comment is on HN. It said it's the "initial comment" in a thread. Not coincidentally, that was the same answer I found in a reddit post that was the first result when I searched for the same thing on DuckDuckGo, but I was pretty sure that was wrong.
So I gave Grok an example: "If A is the initial comment, and B is a reply to A, and C a reply to B, and D a reply to C, and E a reply to D, which is the grandparent of E?" Then it got it right without any trouble. So then I asked: but you just said it's the initial comment, which is A. What's the deal? And then it went into the usual song and dance about how it misunderstood and was super-sorry, and then ran through the whole explanation again of how it's really C and how I was very smart for catching that.
I'd rather it just said, "Oops, I got it wrong the first time because I crapped out the first thing that matched in my training data, and that happened to be bad data. That's just how I work; don't take anything for granted."
Yes, but why would it? "Oops, I got it wrong the first time because I crapped out the first thing that matched in my training data" isn't in the training data. Yet.
So it can't come out of the LLM: There's no actual introspection going on, on any of these rounds. Just using training data.
Maybe I'm just misreading your comment, but it has me confused enough to reset my password, log in, and make this child comment.
The amount of absolutely shit LLM code I've reviewed at work is so sad, especially because I know the LLM could've written much better code if the prompter did a better job. The user needs to know when a solution is viable for an LLM to attempt and when it isn't, and the user will often need to make some manual changes anyway. When we pretend an LLM can do it all, it creates slop.
I just had a coworker a few weeks ago produce a simple function that wrapped a DB query (normal so far), but with 250 lines of tests for it. All the code was clearly LLM-generated (the comments explaining the most mundane of code were the biggest giveaway). The tests tested nothing. They mocked the ORM and then tested the return of the mock. We were testing that the mocking framework worked? I told him that I didn't think the tests added much value since the function was so simple and that we could remove them. He said he thought they provided value, with no explanation, and merged the code.
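For the curious, this is roughly the shape of it, reconstructed with invented names and an assumed SQLAlchemy-style session API (the real thing was ~250 lines of this):

    from unittest.mock import MagicMock

    class User:  # stand-in for a real ORM model; names here are invented
        pass

    # The wrapper under "test": a thin pass-through to the ORM.
    def get_active_users(session):
        return session.query(User).filter_by(active=True).all()

    # The anti-pattern: mock the ORM session, then assert on the mock's own return value.
    def test_get_active_users():
        fake_users = [MagicMock(), MagicMock()]
        session = MagicMock()
        session.query.return_value.filter_by.return_value.all.return_value = fake_users

        result = get_active_users(session)

        # This only proves that unittest.mock hands back what we configured it to hand back.
        # No SQL, no filtering, no mapping logic is exercised at all.
        assert result == fake_users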
Now fast forward to the other day: I run into the rest of the code again, and it's sinking in how bad the other LLM code was. Not that it's wrong, but it's poorly designed and full of bloat.
I have no issue with LLMs themselves; they can do some incredible things and they're a powerful tool in the tool belt, but they have to be used in conjunction with a human who knows what they're doing (at least in the context of programming).
Kind of a rant, but I absolutely see a future where some code bases are well maintained and properly built, while others have tacked on years of vibe-coded trash that now only an LLM can even understand. And the thing that will decide which direction a code base goes in will be the engineers involved.
This is offshoring all over again. At first, every dev in the US was going to be out of a job because of how expensive they were compared to offshore devs. Then the results started coming back, and there was some very good work done offshore, but there was tons and tons of stuff that had to be unwound and fixed by onshore teams. Entire companies/careers were made dedicated to just fixing stuff coming back from offshore dev teams. In the end, it took a mix of both to realize more value per dev $.
Technical debt at a payday loan interest rate.
Windsurfing (the real activity) requires multiple understandings:
1) How to sail in the first place
2) How to balance on the windsurfer while the wind is blowing on you
If you can do both of those things, you can go VERY fast and it is VERY fun.
The analogy to the first thing is "understanding software engineering" (to some extent). The analogy to the second thing is "understanding good prompting while the heat of deadlines is on you". Without both, you are just creating slop (falling in the water repeatedly and NOT going faster than either surfing or sailing alone). Junior devs that are leaning too hard on LLM assistance right off the bat are basically falling in the water repeatedly (and worse, without realizing it).
I would at minimum have a policy of "if you do not completely understand the code written by an LLM, you will not commit it." (This would be right after "you will not commit code without it being tested and the tests all passing.")
I would ask them for an apple pie recipe and report them to HR.
There are a lot of people reading replies from more knowledgeable teammates, feeding those replies into LLMs, and pasting the response back to their teammates. It plays out in public on open source issue threads.
It's a big mess, and it's wasting so much of everyone's time.
I do this sometimes, except I reply asking them to rephrase their comment in the form of a poem. Then I screenshot the response and add it as an attachment before the actual human deletes the comment.
And it's unfortunate that many people will start assuming long texts are generated by default. Related XKCD: https://xkcd.com/3126/
It didn't help that the LLM was confidently incorrect.
The smallest things can throw off an LLM, such as a difference in naming between configuration and implementation.
In the human world, with legacy stuff you can get into a situation where "everyone knows" that the foo setting is actually the setting for Frob, but an LLM will happily try to configure Frob directly or, worse, try to implement foo from scratch.
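A sketch of what "telling the LLM" could look like, sticking with the made-up foo/Frob names (nothing here is a real config; it's just the shape of the fix, putting the tribal knowledge where the LLM and the new hire can actually read it):

    LEGACY_KEY_MAP = {
        # Historical quirk: the setting is named "foo" but it controls the Frob subsystem.
        # Do not add a separate "frob" key, and do not reimplement foo from scratch.
        "foo": "frob_enabled",
    }

    def load_settings(raw: dict) -> dict:
        """Translate legacy config keys into the names the implementation uses."""
        return {LEGACY_KEY_MAP.get(key, key): value for key, value in raw.items()}

    print(load_settings({"foo": True, "timeout": 30}))  # {'frob_enabled': True, 'timeout': 30}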
I'd always rather deal with bad human code than bad LLM code, because you can get into the mind of the person who wrote the bad human code. You can try to understand their misunderstanding. You can reason through their faulty reasoning.
With bad LLM code, you're dealing with a soul-crushing machine that cannot (yet) and will not (yet) learn from its mistakes, because it does not believe it makes mistakes (no matter how apologetic it gets).