there’s one depressing anecdote that I keep on seeing: the junior engineer, empowered by some class of LLM tool, who deposits giant, untested PRs on their coworkers—or open source maintainers—and expects the “code review” process to handle the rest.

Is anyone else seeing this in their orgs? I'm not...


I am currently going through this with someone in our organization.

Unfortunately, this person is vibe coding completely, and even the PR process is painful:
* The coding agent reverts previously applied feedback
* Coding agent not following standards throughout the code base
* Coding agent re-inventing solutions that already exist
* PR feedback is being responded to with agent output
* 50k line PRs that required a 10-20 line change
* Lack of testing (though there are some automated tests, but their validations are slim/lacking)
* Bad error handling/flow handling

> 50k line PRs that required a 10-20 line change

This is hilarious. Not when you're the reviewer, of course, but as a bystander, this is expert-level enterprise-grade trolling.

Fire them?
I believe it is getting close to this. Things like this just take time though, and when this person talks to management/leadership they talk about how much they are producing and how everyone is blocking their work. So it becomes a challenging political maneuvering depending on the ability of certain leadership to see through the BS.

(By my organization, I meant my company - this person doesn't report to me or in my tree).

This is not really an option for your standard IC.
Just reject the PR?
I'm seeing a little bit of this. However, I will add that the primary culprits are engineers who were submitting low quality PRs before they had access to LLMs; they can just submit them faster now.
LLMs are tools that make mediocre devs 100x more "productive" and good devs 2x more productive
From my vantage I would argue LLMs make good devs around 0.65x more productive
I think they make good devs 2x more productive for the first month, which then slowly declines as that good dev spends less time actually writing and understanding and debugging code until it falls well below the 1x mark. It’s basically a high interest loan people take against their own skills. For some people that loan might be worth it. Maybe they’re trying to change their role in an organization and need the boost to start taking up new responsibilities they want to own. I think it’s temporary though. The slow shift into “skim mode”, where the authors just don’t quite put that same amount of effort into understanding what’s being churned out. I dunno, that’s just what I’ve seen.
Because there's a mental overhead when you're not writing the code that is arguably worse than when you are writing the code. No one is talking about this enough IMO but that's why everyone is so exhausted when using LLMs and end up just pulling the slot machine until it works without actually reading it.

Reading code sucks, it always has. The flow state we all crave is when the code is in our working memory as an understood construct and we're just translating our mental model to a programming language. You don't get that with LLMs. It devolves into programming minutiae equivalent to "a little to the left" but with the added complexity that "left" is hundreds of lines of code.

I really feel this myself.

If I write home-grown organic code then I have no choice but to fully understand the problem. Using an LLM it's very easy to be lazy, at least in the short term

Where does that get me after 3 months? I end up working on a codebase I barely understand. My own skills have degraded. It just gets worse the longer you go

This is also coming from my experience in the best case scenario: I enjoy coding and am working on something I care about the quality of. Lots of people don't have even that

I think on average a dev can be x percent more productive, but there is a best case and worst case scenario. Sometimes it's a shortcut to crank out a solution quickly, other times the LLM can spin you in circles and you lose the whole day in a loop where the LLM is fixing its own mistakes, and it would've been easier to just spend some time working it out yourself.
Good devs are still learning how to use LLMs, and so are willing to accept the 0.65x once in a while. Any complex tool will have a learning curve. Most tools improve over time. As such good devs either have found how to use LLMs to make them more productive (probably not 10x, but even 1.1x is something), or they try them again every few months to see if things are better.
I just spent a day trying to get Claude to write reasonable unit tests and then after sleeping on it, reverted everything and did it myself. I’m not gonna be using it for a while because it 0.5x’d me once again
Yep, that's why very accomplished, widely regarded developers like Mitchell Hashimoto and Antirez use them. They need to make programming more challenging to keep it fun.
LLMs are great at spewing content and code is a form of "content". I think what we're seeing is software development turning into youtube. Content creators cranking out content, some is great, most is meh, a lot is really bad. I do find it all a bit funny and ironic. My wife was a journalist and bemoaned news blogs and social media for terrible terrible writing claiming it was journalism. She would tell me about how much work quality journalism is and all the mistakes these bloggers and social media make and how detrimental it was to society at large blah blah blah

Now the power to create tons and tons of code (ie content) is in the hands of everyone and here we are complaining about it just like my wife used to complain about journalism. I think the myth of the highly regarded Software Developer perched in front of the warming glow of a screen solving and automating critical problems is coming to an end. Deservedly really, there's nothing more special about typing words into an editor than, say, framing a house. The novelty is over.

[citation needed]. No study I've seen shows an even 50% productivity improvement for programming, let alone a 100% or 9900% improvement.
What's the ratio of people who do things the right way vs not? I mean, is it a matter of giving them feedback to remind them what a "quality PR" is? Does that help?
It's roughly 1/10 that are causing issues. Not a huge deal but dealing with them inevitably takes up a couple hours a week. We also have a codebase that is shared with some other teams and our primary offenders are on one of those separate teams.

I think this is largely an issue that can be solved culturally within a team, we just unfortunately only have so much input on how other teams work. It doesn't help either when their manager doesn't seem to care about the feedback... Corporate politics are fun.

Yeah, I mean to get back to the original statement in the blog, this seems like less of a tech issue and more of a culture issue. The LLM enables the junior to do this once. It's the team culture that allows them to continue doing it.
LLMs have dramatically empowered sociopath software developers.

If you are sufficiently motivated to appear more "productive" than your coworkers, you can force them to review thousands of lines of incorrect AI slop code while you sit back and mess around with your chatbots.

Your coworkers no longer have enough time to work on their in-progress PRs, so you can dominate the development team in terms of LOC shipped.

Understand that sociopaths are skilled at navigating social and bureaucratic environments. A sociopath who ships the most LOC will get the promotion every single time.

Only if leadership lets them. Right now (anecdotally) a lot of “leaders” don’t understand the difference between AI generated and human generated work, and just look at loc as productivity so all incentives are on AI coding, but that will change.
It will never change. Managers will consider every stupid metric players push to sell their solutions. Be it code coverage, extensive CI/CD pipelines with useless steps, "productivity gains" with gen tools. The gen tools euphoria is stupid and will cease to exist, but before this it was BDD, TDD, DDD, test before, test after, test your mocks, transpile to a different language and then ignore the output, code maturity, best practices, OOP, pants in head oriented programming... There is always something stupid on the horizon; this is certainly not the last stupid craze.
Voila:

https://github.com/WireGuard/wireguard-android/pull/82 https://github.com/WireGuard/wireguard-android/pull/80

In that first one, the double pasted AI retort in the last comment is pretty wild. In both of these, look at the actual "files changed" tab for the wtf.

That's a good example of what we're seeing as leads, thanks.
Scary stuff.

I’d love to hear your thoughts on LLMs, Jason. How do you use them in your projects? Do they play a role in your workflow at all?

Yeah, this guy's comment here is spot on: https://github.com/WireGuard/wireguard-android/pull/80#issue...

I recently reviewed a PR that I suspect is AI generated. It added a function that doesn't appear to be called from anywhere.

It's shit because AI is absolutely not on the level of a good developer yet. So it changes the expectation. If a PR is not AI generated then there is a reasonable expectation that a vaguely competent human has actually thought about it. If it's AI generated then the expectation is that they didn't really think about it at all and are just hoping the AI got it right (which it very often doesn't). It's rude because you're essentially pawning off work that the author should have done to the reviewer.

Obviously not everyone dumps raw AI generated code straight into a PR, so I don't have any problem with using AI in general. But if I can tell that your code is AI generated (as you easily can in the cases you linked), then you've definitely done it wrong.
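For the "function that doesn't appear to be called from anywhere" case, a rough reviewer-side check is possible. The following is only a sketch under stated assumptions: it handles a single Python source file, the helper name `uncalled_functions` is made up for illustration, and it is a heuristic that will miss functions called from other files or via reflection.

```python
import ast


def uncalled_functions(source: str) -> list[str]:
    """Return names of functions defined in `source` but never referenced in it.

    Hypothetical helper: single-file heuristic only, not a real dead-code tool.
    """
    tree = ast.parse(source)

    # Every function (sync or async) defined anywhere in the file.
    defined = [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]

    # Every bare name or attribute name that is referenced anywhere.
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)

    return sorted(name for name in defined if name not in referenced)
```

Anything it reports is a prompt for a review question ("where is this called?"), not proof that the code is dead.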

A friend of mine is working for a small-ish startup (11 people) and he gets to work and sees the CTO push 10k loc changes straight to main at 3 am.

Probs fine when you are still in the exploration phase of a startup, scary once you get to some kind of stability

I feel like this becomes kind of unacceptable as soon as you take on your first developer employee. 10K LOC changes from the CTO is fine when it's only the CTO working on the project.

Hell, for my hobby projects, I try to keep individual commits under 50-100 lines of code.
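A size budget like the 50-100 lines mentioned above can be checked mechanically. This is a minimal sketch, assuming the input is the text printed by `git diff --numstat` (tab-separated added count, deleted count, and path, with `-` in both count columns for binary files). The function names and the default limit of 100 are illustrative choices, not anyone's actual tooling.

```python
def changed_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # blank or malformed line
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary file: git reports "-" instead of counts
        total += int(added) + int(deleted)
    return total


def within_budget(numstat: str, limit: int = 100) -> bool:
    """True if the change touches at most `limit` lines (assumed budget)."""
    return changed_lines(numstat) <= limit
```

A check like this only flags size; whether a large change is justified is still a conversation between author and reviewer.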

Templates and templating languages are still a thing. Source generators are a thing. Languages that support macros exist. Metaprogramming is always an option. Systems that write systems…

If these AIs are so smart, why the giant LOCs?

Sure, it’s cheaper today than yesterday to write out boilerplate, but programming is about eliminating boilerplate and using more powerful abstractions. It’s easy to save time doing lots of repetitive nonsense, stopping the nonsense should be the point.

Lol I worked at a startup where the CTO did this. The problem was that it was pure spaghetti code. It was so bad it kept me up at night, thinking about how to fix things. I left within 30 days
I worked with a “CTO” who did that before LLMs - one of the worst jobs I have had in the last 10 years. I spent at least 50% of my time putting out fires or refactoring his garbage code
The cto is ultimately responsible for the outcome and will be there at 4am to fix stuff.
Yes .. and no. Someone who does this will definitely make the staff clean up after them.
I'd go mental if I was a SWE having to mop that up later
That's...idiotic.
A friend of mine has a junior engineer who does this and then responds to questions like "Why did you do X?" with "I didn't, Claude did, I don't know why".
That would be an immediate reason for termination in my book.
Yes, if they can't debug + fix the reason the production system is down or not working correctly then they're not doing their job, imo.

Developers aren't hired to write code that's never run (at least in my opinion). We're also responsible for running the code/keeping it running.

Some other comments suggest immediately firing.. but a junior engineer needs to be mentored. It should be explained to them clearly that they need to understand the changes they have made. They should also be pointed towards the coding standards and SDLC documentation. If they refuse to change their ways, then firing makes sense.
I think words that would follow from me would get me sent to HR...

And if it was repeated... Well I would probably get fired...

no hate but i would try to fire someone for saying that
This but with hate.
See also "Why did you do X?" → Flurry of new commits → Conversation marked as resolved

And not just from juniors

Yep. Remember, people not posting on this website are just grinding away at jobs where their individual output does not matter, and their entire motivation is to work JUST hard enough not to get fired. They don't get stock grants, extremely favorable stock options or anything else, they get salary and MAYBE a small bonus based off business factors they have little control over.

My eyes were wide open when 2 jobs ago, they said they would be blocking all personal web browsing from work computers. Multiple Software Devs were unhappy because they were using their work laptop for booking flights, dealing with their kids' school stuff and other personal things. They did not have a personal computer at all.

There are people posting on this website who are in that category, or in those companies. For example, most people working outside America as SWEs who like the profession. The options to work for a place that gives stock options, or equity in general, are few, and in many countries equity is heavily penalised tax-wise.
They don't have phones?
They do but obviously laptop is easier than doing it on their phone. That’s what most of them ended up doing.
I don't see most PRs because they happen in other teams, but I am part of a Slack channel where there are too many "oops" messages for my liking.

I.e. 1-2 times a month, there's an SQL script posted that will be run against prod to "hopefully fix data for all customers who were put into a bad state from a previous code release".

The person who posts this type of message most often is also the one running internal demos of the latest AI flows and trying to get everyone else onboard.

Similar, at my last job. And the pushback was greater because super duper clever AI helped write it, who obviously knows more than any other senior engineer could know, so they were expecting an immediate PR approval and got all uppity when you tried to suggest changes.
Hah! I've been trying to push back on this sort of thought. The bot writes code for you, not you for the bot.
I thought we were not, but we had just been lucky. A sequence of events lately has shown that the struggle is real. This was not a junior developer though, but an experienced one. Experience does not equal skill, evidently.
i left my last job because this was endemic
More likely you were fired for being an asshole.
Hey, personal attacks are against site rules. You've been here long enough to know that.
Definitely seeing a bit of this, but it isn't constrained to junior devs. It's also pretty solvable by explaining to the person why it's not great, and just updating team norms.
I'm seeing it on some open source projects I maintain. Recently had 10 or so PRs come in. All very valid features - but from looking at them, not actually tested.
Quite a few FOSS maintainers have been speaking up about it.
Not at all. Submitting untested PRs is wildly outside of my experience. Having tests written to cover your code is a pre-requisite for having your PR reviewed on our team. "Does it work" aka passing manual testing, is literally the bare minimum before submitting a PR
If it's all vibe coded, how do you know — without review — that the new tests, for a new feature, test anything useful at all?
When I was in a test-driven development environment, one of our rules was that you had to see the test fail. You had to prove that it would actually test what you were trying to test.
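The "see the test fail" rule can be illustrated with a small sketch. Everything here is hypothetical: `slugify` stands in for whatever is under test, `broken_slugify` is a deliberately wrong version, and `fails_against` is a made-up helper that confirms the test actually rejects the wrong version.

```python
def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase, words joined by hyphens."""
    return "-".join(title.lower().split())


def broken_slugify(title: str) -> str:
    """Deliberately wrong stand-in, used only to prove the test can fail."""
    return title


def check_slugify(impl) -> None:
    """The test itself, parameterized over the implementation it exercises."""
    assert impl("Hello World") == "hello-world"
    assert impl("  Many   Spaces ") == "many-spaces"


def fails_against(test, broken_impl) -> bool:
    """Return True if `test` raises AssertionError for the broken implementation."""
    try:
        test(broken_impl)
    except AssertionError:
        return True
    return False
```

If `fails_against(check_slugify, broken_slugify)` came back `False`, the test would be passing no matter what the code does, which is exactly the slim validation complained about elsewhere in this thread.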
It isn't only junior engineers, but otherwise yes. It is a small number of people from all levels.

People do what they think they will be rewarded for. When you think your job is to write a lot of code then LLMs are great. When you need quality code you start to ask if LLMs are better or not?

I started seeing it from a particularly poor developer sometime last year. I was the only reviewer for him so I saw all of his PRs. He refused to stop despite my polite and then not so polite admonishments, and was soon fired for it.
I'm not either

But LLMs don't really perform well enough on our codebase to allow you to generate things that even appear to work. And I'm the most junior member of my team at 37 years of age, hired in 2019.

I really tried to follow the mandate from on high to use Copilot, but the Agent mode can't even write code that compiles with the tools available to it.

Luckily I hooked it up to gptel so I can at least ask it quick questions about big functions I don't want to read in emacs.

> And I'm the most junior member of my team at 37 years of age

This sounds fucking awesome.

Would be nice to have someone enthusiastic junior to me.

Most of the team is comfortable in their wheelhouse and when new stuff comes down the pipe it's hard to get them mobilized. I had leadership on a big green-field project and felt like we could have really used a junior.

Yes, in the only successful OSS project that I “maintain.”

Fully vibe coded, which at least they admitted. And when I pointed out the thing is off by an order of magnitude, and as such doesn't implement said feature — at all — we get pressed on our AI policy, so as to not waste their time.

I don't have an AI policy, like I don't have an IDE policy, but things get ridiculous fast with vibe coding.

It's been a struggle with a few teammates that we are trying to solve through explicit policy, feedback, and ultimately management action.
Yeah, a slice of this is technology related, but it's really a policy issue. It's probably easier to manage with a tighter team. Maybe I'm taking team size for granted.
I feel like a story about some open-source project getting (and rejecting) mammoth-sized PRs hits HN every week!
Not so much the huge PRs, but definitely the LLM generated code that the “developer” doesn’t understand.
It's not a new phenomenon. Time was, people would copy-paste from blog posts with the same effect.
Always the same old tiring "this has always been possible before in some remotely similar fashion hence we should not criticise anything ever again" argument.

You could intuitively think it's just a difference of degree, but it's more akin to a difference of kind. Same for a nuke vs a spear, both are weapons, no one argues they're similar enough that we can treat them the same way

Yes, I'm so over this argument. It can literally be made for anything, and it is!

At the end of the day we're not performing war by poking other people with long sticks and we're not getting the word out by sending out a carrier pigeon.

Methods and medium matter.

I would bet in most organizations you can find a copy-pasted version of the top SO answer for email regex in their language of choice, and if you chase down the original commit author they couldn't explain how it works.
I think it's impossible to actually write an email regex because addresses can have arbitrarily deeply nested escaping. I may have that wrong. I'd hope that regex would be .+@.+ and that's it (watch me get Cunninghammed because there is some valid address wherein those plusses should be stars).
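For reference, the permissive `.+@.+` check described above looks roughly like this in Python. It is a sketch of that one pattern only, not a validator for the full address grammar, and the name `looks_like_email` is invented for the example.

```python
import re

# Deliberately loose: at least one character, an "@", at least one more character.
# This is the ".+@.+" pattern from the comment above, nothing stricter.
LOOSE_EMAIL = re.compile(r".+@.+")


def looks_like_email(value: str) -> bool:
    """Return True if `value` has something on both sides of an '@'."""
    return LOOSE_EMAIL.fullmatch(value) is not None
```

With pluses, an empty side such as `@example.com` or `user@` is rejected; swapping them for stars would let a bare `@` through. Anything stricter is usually better settled by sending a confirmation email than by a longer regex.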
TIL Cunningham's Law[0]. I knew about that phenomenon but not the proper name. Thanks!

[0] https://en.wikipedia.org/wiki/Ward_Cunningham#Law

Yeah, but being able to produce nuclear-sized 10k+ LOC PRs to open-source projects in minutes with relatively-zero effort definitely is. At least you had to use your brain to know which blog posts/SO answers to copypasta from.
I don't see the problem with fentanyl given that people have been using caffeine forever.
I used to do that in simpler days. I'd add a link to where I copied it from so we could reference it if there were problems. This was for relatively small projects with just a few people.
> I'd add a link to where I copied it from

LLMs can't do this.

Your code is unambiguously better than any LLM code if you can comment a link to the stackoverflow post you copied it from.

Agreed on the first part for sure since an LLM is the computer/software version of a blender.

So, I'm agreed on the second part too then.

> Your code is unambiguously better than any LLM code if you can comment a link to the stackoverflow post you copied it from.

This is not a truism. "My" code might come from an LLM and that's fine if I can be reasonably confident it works. I might try to gain that confidence by testing the code and reading it to understand what it's doing. It is also true of blog post code, regardless of how I refer to the code; if I link to the blog post, it's because it does a better job of explaining than I ever could in code comments. Whether LLMs make one more productive is hard to measure but it seems to be missing the point to write this.

The point is, including the code is a choice and one should be mindful of it, no matter the code's origin. At that point, this comes off like you just have something to prove; there doesn't seem to be a reason not to use the LLM code if you know it works and you know why it works.

Believing you know how it works and why it works is not the same as that actually being the case. If the code has no author (in that it's been plagiarised by a statistical process that introduces errors), there's nowhere to go if you realise "oops, I didn't understand that as well as I had thought!".
I’ve been seeing obviously LLM generated PRs, but not huge ones.
first time we’d see this there would be a warning, second one is pink slip
