- > Could a rogue agent theoretically run a destructive command? Sure. Have I seen it happen in weeks of usage? Never.
I've been in cybersecurity for over a decade, and this still blows my mind. It's classic cognitive dissonance, or just normalized deviance: people get used to doing unsafe things until the system breaks.
Best analogy I use: seatbelts. In the U.S., wearing a seatbelt is automatic. Zero thought. In other parts of the world, not so much. Ask why, and you’ll hear: “I’m a good driver. Never had an accident. Don’t need it.” That’s not logic. That’s luck confused with safety.
The difference? Conditioning. Risk comprehension. Cultural defaults.
Same thing happens in software. No amount of UI warnings will stop people from doing dumb things. Running as root, disabling SELinux, exposing prod databases to the open internet. Happens constantly.
Anthropic gave a user the ability to do something it knows is risky. Anthropic understands "LLM Trifecta" vulns; this person has no idea.
Strap in, we're in for a wild ride.
- At my last job we deployed to thousands of nodes across AWS, Azure, and Aliyun. There were no unique needs across those environments from a deploy perspective. There were some minor pain points from a config and monitoring perspective, and massive pain points from an infra management perspective. Our SREs knew AWS inside and out but struggled with Azure and Aliyun documentation and support. We had to build out API middleware so existing code could seamlessly operate in all 3 environments. That's not set it and forget it. Keeping everything consistent across 3 cloud APIs requires constant care and feeding.
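To make that concrete, here's a minimal sketch of the kind of middleware I mean. Everything here is hypothetical (class names, method signatures, the stubbed bodies); the real adapters wrapped each provider's SDK:

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Common interface so deploy, config, and monitoring code never
    touches a provider SDK directly."""

    @abstractmethod
    def launch_instance(self, image_id: str, size: str) -> str:
        """Start a node and return a provider-agnostic instance ID."""

    @abstractmethod
    def get_metrics(self, instance_id: str) -> dict:
        """Return a normalized metrics payload for monitoring."""


class AwsProvider(CloudProvider):
    def launch_instance(self, image_id: str, size: str) -> str:
        # The real version would call boto3; stubbed for the sketch.
        return f"aws-{image_id}-{size}"

    def get_metrics(self, instance_id: str) -> dict:
        return {"provider": "aws", "id": instance_id, "cpu": 0.0}


# AzureProvider and AliyunProvider implement the same interface. That's
# where the care and feeding goes: every upstream API change lands in one
# adapter instead of being scattered across the codebase.
def deploy(provider: CloudProvider, image_id: str) -> str:
    """The deploy path stays identical across all three clouds."""
    return provider.launch_instance(image_id, size="medium")
```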
- If you want alignment, set clear rules that everyone understands. At SpaceX, their spending policy was simple: If it helps us get to Mars faster, spend it. If not, don’t. That simple policy helped keep the whole company focused.
Simpler rules, better decisions, faster progress.
- Most people, and by extension most businesses, don't think from first principles. They copy what others do because it's easier. It reduces cognitive load. But that kind of thinking leads to cargo cults: people doing things that look right but make no sense when you break them down. Business schools teach unit economics and first-principles logic (like in The Goal), but most companies still optimize for quarterly performance. It's backwards. You end up with systems that reward short-term hacks instead of long-term efficiency.
In the real world, business moves fast. Too fast for most people to stop and think. If you don't build in ways to slow down and reason from fundamentals, you'll just react your way into mediocrity. The companies that win long term (Amazon, Toyota, SpaceX) go back to physics-level thinking. They understand the real constraints and design around them. That's the cheat code. First-principles thinking isn't optional; it's the only way to build something that actually works and lasts.
- I haven't seen anything useful in the agent space. I definitely haven't seen anything I would trust in a business process.
At the same time, I'm constantly amazed at how accessible LLMs have made automating things with Python. I'm seeing more non-SDEs describe what they want to do and then iterate on a solution with the LLM.
So I see more happening in this space, but it's a little more deterministic and a little less abstracted than where the current products are headed.
I also see GitLab or GitHub dabbling in this area. At some point you need to deploy the code. GitHub Actions, workspaces, and Pages are not that far off from a product that could cater to this need.
- Security is about real risk reduction, not chasing whatever’s trendy - but that's what most security teams do and then complain about the results.
Most business functions are metric-driven. Security should be no different. The right approach: convert qualitative insights into hard data, then systematically drive that metric down.
It's not easy. It's hard work, but I've done it at 3 companies. It's doable.
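For example, and the weights and fields here are invented for illustration, one way to turn a qualitative vuln backlog into a single number you can report weekly and drive toward zero:

```python
from datetime import date

# Hypothetical severity weights; the shape matters more than the numbers.
SEVERITY_WEIGHT = {"critical": 10, "high": 5, "medium": 2, "low": 1}


def risk_score(findings: list[dict], today: date) -> float:
    """Weight each open finding by severity, with a penalty for age, so
    the metric rewards actually closing old, serious issues."""
    score = 0.0
    for f in findings:
        age_days = (today - f["opened"]).days
        score += SEVERITY_WEIGHT[f["severity"]] * (1 + age_days / 30)
    return score


findings = [
    {"severity": "critical", "opened": date(2024, 1, 5)},
    {"severity": "low", "opened": date(2024, 3, 1)},
]
print(risk_score(findings, date(2024, 3, 15)))
```

The exact formula matters less than picking one, publishing it, and trending it down.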
- I look at this through the following perspectives:
As a security engineer, this is pretty obvious - blindly handing over login credentials to AI agents? What could possibly go wrong? Feels like a ticking time bomb until courts start hashing out liability.
As a technologist, it really makes you rethink UI design. Do we even need traditional interfaces anymore? If LLMs handle backward compatibility so seamlessly, we might just skip straight to conversational AI as the default interface.
As an investor, there’s going to be a gold rush. Early movers won’t need perfect accuracy—just "good enough" to capture market share. Classic first-mover advantage.
And as a manager, this screams inefficiency. Expect something like Six Sigma 2.0 to emerge: AI-driven quality control to fix AI-driven errors. Irony is strong with this one.
- I see two possibilities here. Either Musk is a foreign agent attempting to cripple the US government or he's a student of Blitzscaling and Jim Collins' The Map.
Incentives can help determine his motives. What could a foreign government offer Musk that his billions can't already purchase?
The spending policy at SpaceX could offer insight into motive. As I understand it, it's simply: If spending the money gets us to Mars sooner, then it's authorized.
So is this all about getting to Mars sooner? If so, and your bottleneck is government bureaucracy, wouldn't it make sense to spend hundreds of millions on a campaign so you can seize control of the government and overhaul it in the service of getting to Mars sooner?
- I doubt they're non-performers; they're likely selective performers. Issues like this can often be explained by the Public Goods Game from game theory: if there's no economic incentive, some participants will choose not to contribute, or to under-contribute.
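A toy version of the standard game shows why. The endowment, multiplier, and team size below are arbitrary; the point is that when the multiplier divided by group size is under 1, under-contributing is individually rational:

```python
def payoff(contributions: list[float], multiplier: float = 1.6) -> list[float]:
    """Standard public goods game: contributions go into a pot, the pot
    is multiplied, and the proceeds are split evenly among all players."""
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    # Payoff = the endowment you kept back + your equal share of the pot.
    return [(10 - c) + share for c in contributions]


# Four teammates "contribute" effort; one under-contributes.
print(payoff([10, 10, 10, 0]))   # free rider: 22, contributors: 12 each
print(payoff([10, 10, 10, 10]))  # full cooperation: 16 each
```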
- The key is modularity with good interface design. Then you have AI generate each component while you play more of a QA role, validating that each component is functional.
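A minimal sketch of what I mean, with hypothetical names throughout. The interface and the QA harness are the human-owned parts; the implementation in the middle is the kind of thing you'd let the AI generate:

```python
from typing import Protocol


class RateLimiter(Protocol):
    """The human-designed contract. This interface is the part you own;
    the implementation behind it is what you hand off to the AI."""

    def allow(self, key: str) -> bool: ...


class FixedWindowLimiter:
    """Stand-in for an AI-generated component (hypothetical example)."""

    def __init__(self, limit: int = 100):
        self.limit = limit
        self.counts: dict[str, int] = {}

    def allow(self, key: str) -> bool:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit


def qa_check(limiter: RateLimiter) -> None:
    """The human's QA role: validate the component against the contract,
    regardless of how the implementation was written."""
    assert limiter.allow("user-1")       # first request passes
    for _ in range(1000):
        limiter.allow("user-1")
    assert not limiter.allow("user-1")   # sustained burst gets throttled


qa_check(FixedWindowLimiter(limit=100))
```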
- I code, but my day job is closer to CISO. I've been working with React for years, and one of the biggest pain points is the near-immediate security debt you take on through the complex ecosystem that is node packages. That debt keeps growing over time until some critical vuln forces you to deal with it - but by that time you're in dependency hell, and it's no simple task that LLMs can help you fix.
So when I read posts like these, my thoughts are 1) it's great that React is more accessible than ever, but 2) there's a world of cybersecurity pain just around the corner.
- Cooling is just one data center cost factor, and it's ultimately a function of electricity prices.
In most DC build-outs you're looking for favorable network peering (i.e. pipe size, latency, or both) and low electricity costs.
If peering is the highest priority, then you build in hot climates and reduce your cooling costs as much as possible with outside air when temperatures are cool (i.e. at night). They've been doing this in Las Vegas for decades.
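Back-of-envelope, with made-up numbers: the economics come down to what fraction of hours outside air is cold enough to keep the chillers off.

```python
# Every hour the outside air is below the supply-air threshold is an hour
# the chillers can idle. Threshold and temperatures here are invented.
THRESHOLD_C = 24


def free_cooling_fraction(hourly_temps_c: list[float]) -> float:
    """Fraction of hours where outside air alone can cool the DC."""
    free = sum(1 for t in hourly_temps_c if t <= THRESHOLD_C)
    return free / len(hourly_temps_c)


# A desert climate: brutal 42C afternoons, but nights drop toward 14C,
# which is why a Las Vegas DC can still economize heavily after dark.
desert_day = [15, 14, 14, 16, 20, 26, 33, 39, 42, 41, 36, 28] * 2
print(f"{free_cooling_fraction(desert_day):.0%} of hours on outside air")
```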
- As a manager I'm always trying to figure out if my direct reports are vice or virtue motivated, with vice being money and title, and virtue being what you get to work on and who you get to work with. In this context I'd say Grifters & Grinders are vice motivated; Believers and Coasters are virtue motivated.
- I've built security programs at 3 companies. This is how I would solve these problems.
1. SSO everywhere. Okta if budget is no concern and Keycloak if it is.
2. Password manager for the entire company. Even if it's possible to go SSO everywhere, there are still secrets employees will need to manage. Give them a solution or they'll solve it on their own and not in a good way. I like 1Password.
3. All services use a secrets solution that can broker short-lived secrets, plus a policy that limits secret TTL to a day or less. I like HashiCorp Vault (sketch below).
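As a rough sketch of point 3, here's what fetching a short-lived database credential from Vault's database secrets engine looks like over its HTTP API. The role name is hypothetical, and the <=24h TTL is something you'd enforce server-side when configuring the role (default_ttl/max_ttl):

```python
import os

import requests

VAULT_ADDR = os.environ.get("VAULT_ADDR", "http://127.0.0.1:8200")
VAULT_TOKEN = os.environ["VAULT_TOKEN"]


def get_db_credentials(role: str = "app-readonly") -> dict:
    """Ask Vault's database secrets engine for a fresh, short-lived
    credential. The role (hypothetical name here) is configured with a
    TTL of 24h or less, so a leaked credential ages out fast."""
    resp = requests.get(
        f"{VAULT_ADDR}/v1/database/creds/{role}",
        headers={"X-Vault-Token": VAULT_TOKEN},
        timeout=5,
    )
    resp.raise_for_status()
    body = resp.json()
    # lease_duration is in seconds; with the policy above it's <= 86400.
    return {
        "username": body["data"]["username"],
        "password": body["data"]["password"],
        "ttl_seconds": body["lease_duration"],
    }
```

Apps fetch creds at startup or on lease renewal, and by policy anything that leaks is dead within a day.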
- Having worked in adtech for 6 years, I learned through many, many conversations with friends and family that 1) they don't care to understand how it all works and 2) they don't believe it influences them.
Try to walk people through the Senate report on the 2016 election interference or The United States vs The Internet Research Agency. Most will dismiss it as nonsense.
This problem is like tobacco and sugar. It will only get better through regulation. The masses have no desire to fix it.
- The book Trillion Dollar Coach (written by Eric Schmidt and other ex-Google leadership) has a firsthand account of how Google went from a philosophy of few managers to more. Surprisingly, the debate was settled by impromptu interviews with ICs, who said they wanted more managers to help make decisions.
Managers can bring value in multiple ways, one being faster decision making by shifting away from decisions through consensus - what Amazon calls "disagree and commit."
If you go back even further, the no-manager philosophy traces to Steve Jobs and his "It's better to be a pirate than to join the Navy" line, the idea being that pirates self-organize and make a stronger team. Works great in a skunkworks department where Jobs personally selected the pirates. Falls apart at scale.
- Check out Beyond The Goal and Beyond The Phoenix Project for a deeper dive into this area.
I work in cybersecurity and use many of the concepts often. The root causes of many poor outcomes are poor assumptions, prioritizing ideology over customer value, and misaligned shared mental models.
I use a simple doc format to address them. It's based on evaporating clouds.
1. What's the shared understanding of current state, with a focus on objectivity?
2. What are the problems with current state? Subjectivity is OK here.
3. What's desired state?
4. What experiments can we run ASAP to learn whether our understanding of desired state is correct, and how we can get closer?
Simple concept. Works great.
- So many questions.
Why not reveal what sites/code this was tested on, so others can try to reproduce it? What was the false positive rate? Why didn't they compare results with commodity automated scanners like Burp or ZAP?
- This. I've established security programs at 3 companies over 10+ years. I've rarely encountered an engineer who didn't care. I've encountered many with competing priorities.
What gets measured gets done. Establish the right measurement to equip engineers to prioritize security and they'll get it done.
- My mind immediately goes to Bezos & "Resist Proxies". From a guy who has built security programs at 3 companies.
https://www.aboutamazon.com/news/company-news/2016-letter-to...
- Sometimes these events provoke regulators to take a closer look at the company.
https://www.ftc.gov/news-events/news/press-releases/2023/11/...
- Predictable, based on a large body of knowledge: from Orson Scott Card's How Software Companies Die to David Packard's The HP Way. There's also Deming, Goldratt, and Jim Collins.
https://medium.com/riow/how-software-companies-die-by-orson-...
- This problem is not exclusive to Meta. It's the product of "big ball of mud" design. It relates to Dunbar's Number and the limitation of mental models.
- When I read this it fascinates me how Plato's Allegory of the Cave continues to evolve in our modern digital world. Propaganda, visual witnessing, etc, etc - the theory stays the same, just the tools change.
- Sounds like you want a government job.
- Summary: a guy wrote a book about LLMs, prompt engineering, and how to write code to interface with LLM APIs. Oh, and you can use it to role-play tabletop incident response exercises.
IMHO the LLMs themselves are a better way to learn this knowledge.
Here’s the hard truth: cybersecurity today is basically fashion. It’s not science, it’s herd behavior. The industry is still running on the “nobody gets fired for buying IBM” mentality. Careers aren’t built on being right, they’re built on chasing whatever tool is trending on LinkedIn this quarter.
If studies like this mattered, the security community would have paid attention to this one long ago:
https://er.educause.edu/articles/2005/1/fostering-email-secu...