> I mean in the movies for example, advanced AI assistants do amazing things with very little prompting. Seems like that's what people want.
Movies != real life
> To me, the fact that so many people basically say "you are prompting it wrong" is a knock against the tech and the model. If people want to say that these systems are so smart at what they can do, then they should strive to get better at understanding the user without needing tons of prompts.
See below about context.
> Do you think his short prompt would be sufficient for a senior developer? If it's good enough for a human it should be good enough for a LLM IMO.
Context is king.
> I don't want to take away the ability to use tons of prompting to get the LLM to do exactly what you want, but I think that the ability for an LLM to do better with less prompting is actually a good thing and useful metric.
What I'm understanding from your comments here is that you should just be able to give it broad statements and it should interpret them into functional results. Sure - that works incredibly well, if you provide the relevant context and the model is able to understand and properly associate it where needed.
But you're comparing the LLMs to humans (this is a problem, but not likely to stop, so we might as well address it) - but _what_ humans? You ask if that prompt would be sufficient for a senior developer - absolutely, if that developer already has the _context_ of the project/task/features/etc. They can _infer_ what's not specified. But give that same prompt to a junior dev who has access to the codebase and has poked around inside the working application once or twice, but no real in-depth experience with it - they're going to _infer_ different things. They might do great, they might fail spectacularly. Flip a coin.
So - with that prompt in the top level comment - if that LLM is provided excellent context (via AGENTS.md/attached files/etc) then it'll most likely do great, especially if you aren't looking for specifics in the resulting feature beyond what you mentioned, since it _will_ have to infer some things. But if you're just opening codex/CC without a good CLAUDE.md/AGENTS.md and feeding it a prompt like that, you have to expect quite a bit of variance in what you get - exactly the same as you would from a _human_ developer.
Your context and prompt are the project spec. You get out what you put in.
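As a concrete sketch of what "front-loading context" means here - the project name, file paths, and conventions below are all made up for illustration, not a prescribed format - an AGENTS.md might look like:

```markdown
# AGENTS.md (illustrative sketch - all names/paths are hypothetical)

## Project
- Task-manager web app: TypeScript, React front end, Express + Postgres back end.

## Conventions
- All DB access goes through `src/db/queries.ts`; no inline SQL in route handlers.
- Every new endpoint needs a matching integration test under `tests/api/`.

## Current focus
- Migrating auth from sessions to JWTs; do not add new session-based code.
```

With something like that in place, a short prompt can lean on the same shared context a senior dev would already carry in their head; without it, the model has to infer all of it.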
These things are being marketed as super intelligent magic answer machines. Judging them using the criteria the marketing teams have provided is completely reasonable.
> Movies != real life
Nobody claimed it was. This is about desires and expectations. The people charging money for these services - and pocketing stacks of cash that would’ve otherwise been in devs’ paychecks while doing so - haven’t even tried to temper those expectations. They made their beds…