- What's your point, though? Let's assume your hypothesis holds and that 5 years from now everyone has access to an LLM that's as good as a typical staff engineer. Is it now acceptable for a junior engineer to submit LLM-generated PRs without having tested them?
> It was thought impossible for a computer to reach the point of being able to beat a grandmaster at chess.
This is oft-cited, but even cursory research shows it was never close to a universally held view.
- > A robust stdlib or framework is in line with what I'm suggesting, not a counterexample.
Maybe I didn't argue this well, but my point is that it's a spectrum. What about libraries in the Java ecosystem like Google's Guava and Apache Commons? These are not stdlibs, but they almost might as well be. Every non-trivial Java codebase I've worked in has pulled in Guava and at least some of the Apache Commons libraries. Unless you have some other mitigating factor or requirement, I think it'd be silly not to pull these in as dependencies to a project the first time you encounter something they solve. They're still large codebases you're not using 99% of, though.
I don't feel like my position on this is black-and-white. It is not always correct to solve a problem by adding a new dependency, and in the situation you describe, adding a sprawling UI framework would be a mistake. Maybe the situation is different in front-end land, but I don't see how AI really shifts that balance. My colleagues were not doing anything bad or wrong by copying that incorrect code - tasked with displaying a human-readable file size, I would probably either write out the boundaries by hand or copy-paste the first reasonable-looking result from Stack Overflow without much thought too.
> At no point have I advised copying code from libraries instead of importing them.
I didn't say copying, though. I said replicating. If you ask AI to implement something that appears in its training data, there is a high probability it will produce something that looks very similar and even a non-zero possibility it will replicate it exactly. Setting aside value judgements, this is functionally the same as a copy, even if what was done to produce it was not copying.
- > Introducing a library with two GitHub stars from an unknown developer
I'd still rather have the original than the AI's unattributed regurgitation. Of course, the fewer users something has, the more scrutiny it requires, and below a certain threshold I will be sure to pin an exact version and leave a comment telling whoever bumps deps in the future to take care with it.
> Introducing a library that was last updated a decade ago
Here I'm mostly with you, if only because I will likely want to apply whatever modernisations weren't possible in the language a decade ago. On the other hand, if it has been working without updates for a decade and people are STILL using it, it sounds pretty damn battle-hardened by this point.
> Introducing a library with a list of aging unresolved CVEs
How common is this in practice? I don't think I've ever gone library hunting and found myself with a choice between "use a thing with unresolved CVEs" and "rewrite it myself". Normally the way projects end up depending on libraries with lists of unresolved CVEs is by adopting a library that subsequently becomes unmaintained. Obviously this is a painful situation to be in, but I'm not sure it's worse than if you had replicated the code instead.
> Pulling in a million lines of code that you're reasonably confident you'll never have a use for 99% of
It very much depends - not all imported-and-unused code is equal. Like yeah, if you have Flask for your web framework, SQLAlchemy for your ORM and Jinja for your templates, you probably shouldn't pull in Django just for your authentication system. On the other hand, I would be shocked if I had ever used more than 5% of the standard library in the languages I work with regularly. I am definitely NOT about to start writing my Rust as no_std, though.
> Relying on an insufficiently stable API relative to the team's budget, which risks eventually becoming an obstacle to applying future security updates (if you're stuck on version 11.22.63 of a library with a current release of 20.2.5, you have a problem)
If a team does not have the resources to keep up to date with their maintenance work, that's a problem. A problem that is far too common, and a situation that is unlikely to be improved by that team replicating the parts of the library they need into their own codebase. In my experience, "this dependency has a CVE and the security team is forcing us to update" can be one of the few ways to get leadership to care about maintenance work at all for teams in this situation.
> Each line of code included is a liability, regardless of whether that code is first-party or third-party. Each dependency in and of itself is also a liability and ongoing cost center.
First-party code is an individual liability. Third-party code can be a shared one.
- At one stage in my career the startup I was working at was being acquired, and I was conscripted into the due-diligence effort. An external auditor had run a scanning tool over all of our repos and the team I was on was tasked with going through thousands of snippets across ~100 services and doing something about them.
In many cases I was able to replace tens of lines of code with a single call to a function from a dependency the project already had. In very few cases did I have to add a new dependency.
But directly relevant to this discussion is the story of the most copied code snippet on Stack Overflow of all time [1]. Turns out, it was buggy. And we had more than one copy of it. If it hadn't been for the due-diligence effort, I'm 100% certain those copies would still be there.
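For what it's worth, the correct logic isn't even long - all of the danger lives in the edge cases. A rough sketch (SI units only, method name mine):

```java
// Minimal sketch of a human-readable size formatter (SI units only).
static String humanReadableSize(long bytes) {
    if (bytes < 0) throw new IllegalArgumentException("negative size: " + bytes);
    if (bytes < 1000) return bytes + " B";

    String units = "kMGTPE";
    int unit = -1;
    double value = bytes;
    // Each division moves up one unit: kB after the first, MB after the second, ...
    while (value >= 1000 && unit < units.length() - 1) {
        value /= 1000.0;
        unit++;
    }
    // Values just under a boundary (e.g. 999_999 bytes) still print as
    // "1000.0 kB" rather than "1.0 MB" - exactly the kind of edge case that
    // keeps hand-rolled versions of this subtly wrong.
    return String.format("%.1f %cB", value, units.charAt(unit));
}
```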
- I've seen this argument made frequently. It's clearly a popular sentiment, but I can't help feeling it's one of those things that sounds nice in theory if you don't think about it too hard. (Also, cards on the table: I personally really like being able to pull in a tried-and-tested implementation of code solving a common problem, one that is in some cases used by literally millions of other projects. I dislike having to re-solve a problem I have already solved elsewhere.)
Can you cite an example of a moderately-widely-used open source project or library that is pulling in code as a dependency that you feel it should have replicated itself?
What are some examples of "everything libraries" that you view as problematic?
- No one's forcing you to use crates published on the crates.io registry - cargo is perfectly happy to pull dependencies from a different public or private registry, from elsewhere in the same repo, from somewhere else on the filesystem, pinned to a git hash, and I think a few other ways besides.
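For example, all of these are ordinary dependency declarations in Cargo.toml (the crate names and URLs here are invented):

```toml
[dependencies]
# Somewhere else on the filesystem, or another crate in the same repo/workspace
local_util = { path = "../local_util" }

# Pinned to a specific git commit
some_lib = { git = "https://github.com/example/some_lib", rev = "abc1234" }

# Pulled from an alternate (e.g. private) registry configured in .cargo/config.toml
internal_lib = { version = "1.2", registry = "my-company" }
```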
"We shouldn't use the thing that has memory safety built in because it also has a thriving ecosystem of open source dependencies available" is a very weird argument.
- > Well, synchronous blocking approaches (as opposed to asynchronous nonblocking) provide that stuff for free.
Not really. The talk describes problems that can show up in any environment where you have concurrency and cancellation. To adapt some of its examples: a thread that consumes a message from a channel but is killed before it can process it has still lost that message. A synchronous task that needs to temporarily violate invariants in a data structure that can't be updated atomically still leaves that structure in an invalid state if it gets killed partway through.
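Here's the first of those in plain blocking Java, no async anywhere (a contrived sketch; the names are mine):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class LostMessage {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        queue.put("important message");

        Thread worker = new Thread(() -> {
            try {
                String msg = queue.take();   // the message leaves the queue here
                Thread.sleep(1000);          // stand-in for slow, interruptible processing
                System.out.println("processed: " + msg);
            } catch (InterruptedException e) {
                // Cancelled after take() but before processing finished: the
                // message is no longer in the queue and nobody ever handled it.
                System.out.println("cancelled mid-processing; message lost");
            }
        });

        worker.start();
        Thread.sleep(100);     // let the worker dequeue and start "processing"
        worker.interrupt();    // cancellation arrives at an unlucky moment
        worker.join();

        System.out.println("messages still queued: " + queue.size()); // prints 0
    }
}
```

Nothing async is required for that to go wrong; all you need is a cancellation point between "took ownership of the message" and "finished the job".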
> Arguably the Go language's goroutines strike a good balance between cooperate and preemptive threads/multitasking.
Goroutines are pretty nice. It's especially nice that Go has avoided the function colouring problem. I'm not convinced that having to litter your code with selects to make your goroutines cancellable is good, though. And if you don't care about being able to cancel tasks, you can fairly easily write async Rust in a way that ensures they won't be cancelled by accident. Unless there's some better way to write cancellable goroutines that I'm not familiar with.
> The key insight is that manual management of tasks is, for the most part, not tenable by humans. It's better to take a step back and work at a higher level of abstraction.
Of course it's always important to look at systems as a whole. But to build larger systems out of smaller components you need to actually build the small components.
> I'd probably point to CockroachDB as one of the best task-cancellers, since it doesn't have a shutdown procedure. Its process can simply be terminated by the user with control-c, then it reconciles any outstanding transactions the next time it's booted, which just adds some latency. If an entire database can do that, then "this is the way".
I'm not familiar with CockroachDB specifically, but I do think a database should generally have a more involved happy-path shutdown procedure than that. In particular, I would like the database not to begin processing new transactions if it is not going to be able to finish them before it needs to shut down, even if not finishing them wouldn't violate ACID or any of my invariants.
- I'd argue the reverse is true. On your local system, which only needs to operate when a named user with a (hopefully) strong password is present, you can encrypt the secrets with the user's login password, and the OS can verify that it's handing the secret out to the correct binary before doing so. The binary can also take steps to verify that it is being called directly from a user interaction and not from the build script of some random package.
The extent to which any of this is actually implemented varies wildly between OSes, ecosystems and tools. On macOS, Docker Desktop does quite well here. There's also an app called Secretive which does even better for SSH keys, generating a non-exportable key in the CPU's secure enclave. It can even optionally prompt for the login password or a fingerprint before allowing the key to be used. It's almost as secure as using a separate hardware token for SSH, but significantly more convenient.
In contrast, most of the time the only thing protecting the keys in your CI vault from being exfiltrated is that the malware needs to know the specific name / API call / whatever to read them. In plenty of CI systems you don't even need that, because the build script that uses the secrets will have read them into environment variables before starting the build proper.
- Part of it is an ecosystem thing. It's a lot better now, but there were times when libraries would throw a variety of checked exceptions that usually couldn't be handled sensibly other than by being caught and logged by your top-level exception handler. This forced you to do one of the following:
(a) pollute many methods on many layers of your class hierarchy with `throws X`, potentially also polluting your external API
(b) catch, wrap and re-throw the checked exception at every call site that could throw it
(c) use Lombok's `@SneakyThrows` to convert a checked exception into an unchecked one - which they advise against using in production code, and which I have definitely never used in production code.
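For a sense of what (b) looks like at a single call site (the class and method names here are invented):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ConfigLoader {
    // Option (b): catch the checked exception where it occurs and re-throw it
    // wrapped in an unchecked one, so neither this method's signature nor
    // anything above it needs `throws IOException`.
    String loadConfig(Path path) {
        try {
            return Files.readString(path);
        } catch (IOException e) {
            throw new UncheckedIOException("failed to read config: " + path, e);
        }
    }
}
```

Option (c) is essentially the same method with the try/catch deleted and `@SneakyThrows` slapped on top, which is exactly why it's so tempting.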
There are specific situations where checked exceptions work well - where you want to tell the calling code "I can fail in this specific way, and there is something sensible you can do about it". But those are fairly rare in my experience.
- Every time one of these comes up, I have similar thoughts. A threat actor is in a position to pull off a large-scale supply chain compromise, and the best thing you can think to do with it is also the thing that guarantees you are discovered immediately? Mine crypto on the damn CPU, or publicly post the victim's credentials to their own GitHub account?
On one hand, I cannot accept that the actors we see pulling these off are the best and brightest. My gut tells me that these attacks must be happening in more subtle ways from time to time. Maybe they're more targeted, maybe they're not but just have more subtle exfil mechanisms.
On the other, well, we have exactly one data point of an attempt at a more subtle attack. And it was thwarted right before it started to see widespread distribution.
But also there was a significant amount of luck involved. And what if it hadn't been discovered? We'd still have zero data points, but some unknown actor would possess an SSH skeleton key.
So I don't know what to think.
- When Apple switched to their own silicon, I was maintaining the build systems at a scaleup.
After I saw the announcement, I immediately knew I needed to try out our workflows on the new architecture. There was just no way that we wouldn't have x86_64 as an implicit dependency all throughout our stack. I raised the issue with my manager and the corporate IT team. They acknowledged the concern but claimed they had enough of a stockpile of new Intel machines that there was no urgency and engineers wouldn't start to see the Apple Silicon machines for at least another 6-12 months.
Eventually I do get allocated a machine for testing. I start working through all the breakages but there's a lot going on at the time and it's not my biggest priority. After all, corporate IT said these wouldn't be allocated to engineers for several more months, right? Less than a week later, my team gets a ticket from a new-starter who has just joined and was allocated an M1 and of course nothing works. Turns out we grew a bit faster than anticipated and that stockpile didn't last as long as planned.
It took a few months before we were able to fix most of the issues. In that time we ended up having to scavenge under-specced machines from people in non-technical roles. The amount of completely avoidable productivity wasted on people swapping machines would easily have reached into the person-years. And of course my team and I took the blame for not preparing ahead of time.
Budgets and expenditure are visible and easy to measure. Productivity losses due to poor budgetary decisions, however, are invisible and extremely difficult to measure.
- Question for those in this thread who are okay with this: If I have endpoints that are computationally expensive server-side, what mechanism do you propose I could use to avoid being overwhelmed?
The web will be a much worse place if such services are all forced behind captchas or logins.
- > The neat part is that I did it with only three additional 8 TB disks and never transferred my data to external storage.
That's neat! I didn't know there was a way to do this while maintaining data redundancy.
> Step 1: Borrow one disk to create a RAIDZ2 pool
> To begin, I remove one disk from my original RAIDZ1 pool, leaving it in a degraded state.
Oh, there isn't. :facepalm:
- A thinker might say "LLMs are inevitable, here's why" and then make specific arguments that either convince me to change my mind, or that I can refute.
A tech executive making an inevitablist argument won't back it up with any justification, or if they do it will be so vague as to be unfalsifiable.
- I’m at roughly an A2-B1 level in the language I’m learning, and I picked up on a whole lot of pretty basic grammar errors in the first conversation.
The app also used a bunch of constructions I’m not familiar with even though I specified I’m a beginner.
If I hired a human tutor and had this experience, I would ask for my money back.
Today I learned that AI advocates being overly optimistic about its trajectory is actually not a new phenomenon - it's been happening for more than twice my lifetime.