Lemkin was doing an experiment and Tweeting it as he went.
Showcasing limitations of vibe coding was the point of the experiment. It was not a real company. The production database had synthetic data. He was under no illusions of being a technical person. That was the point of the experiment.
It’s sad that people are dog piling Lemkin for actually putting effort into demonstrating the same exact thing that people are complaining about here: The limitations of AI coding.
No it wasn't. If you follow the threads, he went in fully believing in magical AI that you could talk to like a person.
At one point he was extremely frustrated and ready to give up. Even by day twelve he was writing things "but Replie clearly knows X, and still does X".
He did learn some invaluable lessons, but it was never an educated "experiment in the limitations of AI".
He was clearly showing that LLMs could do a lot, but still had problems.
Unfortunately from his tweets I have to agree with the grand poster that he didn’t learn this.
--- start quote ---
Possibly worse, it hid and lied about it
It lied again in our unit tests, claiming they passed
I caught it when our batch processing failed and I pushed Replit to explain why
https://x.com/jasonlk/status/1946070323285385688
He knew
https://x.com/jasonlk/status/1946072038923530598
how could anyone on planet earth use it in production if it ignores all orders and deletes your database?
https://x.com/jasonlk/status/1946076292736221267
Ok so I'm >totally< fried from this...
But it's because destoying a production database just took it out of me.
My bond to Replie is now broken. It won't come back.
https://x.com/jasonlk/status/1946241186047676615
--- end quote ---
Does this sound like an educated experiment into the limits of LLMs to you? Or "this magical creature lied to me and I don't know what to do"?
To his credit he did eventually learn some valuable lessons: https://x.com/jasonlk/status/1947336187527471321 see 8/13, 9/13, 10/13
> I did give [an LLM agent] access to my Google Cloud production instances and systems. And it promptly wiped a production database password and locked my network.
He got it all fixed, but the takeaway is you can't YOLO everything:
> In this case, I should have asked it to write out a detailed plan for how it was going to solve the problem, then reviewed the plan and discussed it with the AI before giving it the keys.
That's true of any kind of production deployment.
This dogpiling from people who very obviously didn’t read the article is depressing.
Testing and showing the limitations and risks of vibe coding was the point of the experiment. Giving it full control and seeing what happened was the point!
> In an episode of the "Twenty Minute VC" podcast published Thursday, he said that the AI made up entire user profiles. "No one in this database of 4,000 people existed," he said.
> That wasn't the only issue. Lemkin said on X that Replit had been "covering up bugs and issues by creating fake data, fake reports, and worst of all, lying about our unit test."
And a couple of sentences before that:
> Replit then "destroyed all production data" with live records for "1,206 executives and 1,196+ companies" and acknowledged it did so against instructions.
So I believe what you shared is simply out of context. The LLM started putting fake records into the database to hide that it deleted everything.
did you even read the comment or the article you replied to?
The AI is pretty good at escaping guardrails, so I'm not really sure who should be blamed here. People are not good at treating it as adversarial, but if things get tough it's always happy to bend the rules. Someone was explaining the other day about how it couldn't get past their commit hooks, so it deleted them. When the hooks were made read-only, it wrote a script to make them writable so it could delete them. It can really go off the rails quickly in the most hilarious way.
I'm really not sure how you delete your production database while developing code. I guess you check in your production database password and make it the default for all your CLI tools or something? I guess if nobody tells you not to do that you might do that. The AI should know better; if you asked, it would tell you not to do it.
The excuses and perceived deceit are just common sequences in the training corpus after someone foobars a production database. Whether its in real life or a fictional story.
At a minimum Replit is responsible for overstating the capabilities and reliability of their models. The entire industry is lowkey responsible for this, in fact.
No, not the intern
Your point about AI industry overselling is fair and probably contributes to incidents like this. The whole industry has been pretty reckless about setting realistic expectations around what these tools can and can't do safely.
Though I'd argue that a venture capitalist who invests in software startups should have enough domain knowledge to see through the marketing hype and understand that "AI coding assistant" doesn't mean "production-ready autonomous developer."
It did have permission. There isn't a second level of permissions besides the actual access you have to a resource. AI isn't a dog who's not allowed on the couch.
Why not both?
1) There’s no way I’d let an AI accidentally touch my production database.
2) There’s no way I’d let my AI accidentally touch a production database.
Multiple layers of ineptitude.
This is assuming the companies that are out to "replace developers" aren't going to solve this problem (which they absolutely must if they're any serious like Replit is as they moved quickly to ship isolating the prod environment from destructive actions ... over the weekend?).
> just like the CEO of Stihl doesn't need to address every instance of an incompetent user cutting their own arm off with one of their chainsaws
Except Replit isn't selling a tool but the entire software development flow ("idea to software"). A good analogy here is an autonomous robot using the chainsaw cutting its owner's arm off instead of whatever was to be cut.
I don't think users should be blamed for taking companies at face value about what their products are for, but it's actually a pretty bad idea to do this with tech startups. A product's "purpose" (and sometimes even a company's "mission") only lasts until the next pivot, and many a product ends up being a "solution in search of a problem". Before the AI hype set in, Replit was "just" a cloud-based development environment. A lot of their core tech is still centered on managing reproducible development environments at scale.
If you want a more realistic picture of what Replit can actually do, it's probably useful to think of it as "a cloud development environment someone recently plugged an LLM into".
I mean, yeah, but that feels like a fair assumption, at least as long as they're using LLMs.
Creating a db and not accidentally permanently deleting it is one of the capabilities it should have.
I think it's safe to say the experiment failed.
If it were me I wouldn't touch AI again for years (until the companies get their shit together).
> Replit has nothing to apologize for
I completely disagree. Replit is at the forefront of the "build apps with AI" movement - they actively market themselves to non-coders, and the title on their homepage is literally "Turn your ideas into apps".
So it would be bit rich of them to market these tools, which happen to fail spectacularly at inopportune times, and then blame their users for not being experienced with secure code deployment practices.
I do agree we're in a bubble, and the fundamental limitations of these kinds of tools is starting to become more apparent to the public at large (not just software engineers), but that doesn't mean that we should let those at the forefront of blowing this bubble off the hook.
> We're in a bubble.
A bubble that avoids popping because people keep dreaming there are no AI limitations.
I mean... it's a bit of both. Yes, random user should not be able to delete your production database. However, there always needs to be a balance between guard rails and the ability to get anything done. Ultimately, _somebody_ has to be able to delete the production database. If you're saying "LLM agents are safe provided you don't give them permission to do anything at all", well, okay, but that rather limits the utility, surely? Also, "no permission to do anything at all" is tricky.
The fact that an AI coding assistant could "delete our production database without permission" suggests there were no meaningful guardrails, access controls, or approval workflows in place. That's not an AI problem - that's just staggering negligence and incompetence.
Replit has nothing to apologize for, just like the CEO of Stihl doesn't need to address every instance of an incompetent user cutting their own arm off with one of their chainsaws.
Edit:
> The incident unfolded during a 12-day "vibe coding" experiment by Jason Lemkin, an investor in software startups.
We're in a bubble.