People are going to keep doing that because these agentic tasks can take a while to run, and having to check in so often to approve commands becomes an annoyance.
I can’t see a way around that except some kind of sandboxing, or a concept of untrusted or tainted input rather than treating all tokens the same. Maybe there could be a way of detecting whether a tool’s response falls within a threshold of acceptability relative to what the MCP server declares (which is easier with structured output), and using that to force a manual confirmation, or an outright rejection if the response is deemed unusual or unsafe.
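Roughly what I mean, as a toy Python sketch. None of these names (ToolResult, TOOL_SPECS, gate) come from any real MCP SDK; a real version would check against the output schema the server actually declares.

```python
# Hypothetical sketch: gate a tool's output before the model ever sees it.
from dataclasses import dataclass
from enum import Enum, auto


class Verdict(Enum):
    ALLOW = auto()    # matches the declared shape, pass it through
    CONFIRM = auto()  # unusual but not clearly unsafe, ask the user
    REJECT = auto()   # outside the declared contract, drop it


@dataclass
class ToolResult:
    tool_name: str
    payload: dict


# Toy "definition" of what each tool is allowed to return: expected keys
# plus a rough size ceiling. Stands in for the server's declared schema.
TOOL_SPECS = {
    "read_file": {"required_keys": {"path", "contents"}, "max_bytes": 100_000},
}


def gate(result: ToolResult) -> Verdict:
    spec = TOOL_SPECS.get(result.tool_name)
    if spec is None:
        return Verdict.CONFIRM  # unknown tool: make the human decide

    if not spec["required_keys"].issubset(result.payload):
        return Verdict.REJECT   # structurally outside the contract

    size = sum(len(str(v)) for v in result.payload.values())
    if size > spec["max_bytes"]:
        return Verdict.CONFIRM  # unusually large output: treat as suspicious

    return Verdict.ALLOW


if __name__ == "__main__":
    ok = ToolResult("read_file", {"path": "a.txt", "contents": "hello"})
    odd = ToolResult("read_file", {"path": "a.txt", "contents": "x" * 200_000})
    print(gate(ok))   # Verdict.ALLOW
    print(gate(odd))  # Verdict.CONFIRM
```

The point is just that "looks like what the tool promised" and "looks unusual" are decisions you can make outside the model, instead of trusting every token equally.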
I think we are starting to see these remote agent setups where each agent session gets its own sandboxed environment to run things in. I bet that's where this is going.
That said, I ditched Codex for Claude Code... sorry, OpenAI. No MCP support and no way to interact during execution are huge drawbacks.
It’s interesting to see other tools struggling to keep up. ChatGPT will supposedly get proper MCP client support “any day now”, but I don’t see Codex supporting it any time soon.
Aider is very much struggling to adapt as well, since its whole workflow of editing and navigating files is easily replaced by MCP servers (probably for the better, as they offer much more effective ways of improving the signal-to-noise ratio), so it’ll be interesting to see how these tools adapt.
I’d love for Claude Code (or any tool for that matter) to fully embrace the agentic way of coding, e.g. have multiple agents specialize in different topics and a “main” agent direct them all. Those workflows seem to be working really well; a rough sketch of the idea is below.
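Purely illustrative, nothing here maps to a real tool's API: a “main” agent that routes tasks to specialist agents by topic, with the lambdas standing in for actual model calls.

```python
# Hypothetical sketch of a "main" agent directing specialist agents.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    name: str
    handles: set[str]          # topics this agent specializes in
    run: Callable[[str], str]  # stand-in for a real model invocation


def make_specialist(name: str, topics: set[str]) -> Agent:
    return Agent(name, topics, lambda task: f"[{name}] handled: {task}")


SPECIALISTS = [
    make_specialist("frontend", {"ui", "css", "react"}),
    make_specialist("backend", {"api", "database", "auth"}),
    make_specialist("infra", {"docker", "ci", "deploy"}),
]


def main_agent(task: str, topic: str) -> str:
    """Route a task to the first specialist claiming the topic, else handle it directly."""
    for agent in SPECIALISTS:
        if topic in agent.handles:
            return agent.run(task)
    return f"[main] handled directly: {task}"


if __name__ == "__main__":
    print(main_agent("add a login form", "ui"))
    print(main_agent("rotate the signing keys", "security"))
```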