I see Claude Code as pair programming with a junior/mid-level dev who knows all fields of computer engineering. I still need to nudge it here and there; it will still make noob mistakes that I need to correct, and I let it know how to do things properly when it gets them wrong. But coding sessions have been great and productive.
In the end, I use it when working with software that I barely know. Once I'm up and running, I rarely use it.
I did, but I've always approached LLMs for coding this way, and I've never been let down. You need to be as specific as possible and be a part of the whole process. I have no issues with it.
It... sort of worked well? I had to go a few rounds of back-and-forth because it tried to use Objective-C features that didn't exist back then (e.g. ARC), but all in all it was a success.
So yeah, niche things are harder, but on the other hand I didn't have to read 300 pages of stuff just to do this...
Also, fun names like `makeFunctionNameInCommentLongAndDescriptiveWithNaturalLanguage:(NSLanguage *)language`
In some cases, it just doesn't have the necessary information because the problem is too niche.
In other cases, it does have all the necessary information but fails to connect the dots, i.e. reasoning fails.
It is the latter issue that affects all LLMs to such a degree that I'm becoming really sceptical of the current generation of LLMs for tasks that require reasoning.
They are still incredibly useful of course, but those reasoning claims are just false. There are no reasoning models.
I don't mean to tread on anyone's toes, but I'm noticing this more and more in the debates around AI. Consider that there are developers out there who could have done this task in 30 minutes without AI.
The level of performance of AI solutions is heavily related to the experience level of the developer and to the problem space being tackled - as this thread points out.
Unfortunately, the marketing around AI ignores this and makes every developer not using AI for coding seem like a dinosaur, even though they might well be faster at solving their particular problems.
AI is moving problem-solving skills from writing code to writing the correct prompts and teaching the AI to do the right thing - which, again, is subjective, since the "right thing" for one developer isn't the "right thing" for another. "Right thing" here being the correct solution, the understandable solution, the fastest solution, etc., depending on the needs of the developer using the AI.
The other is to look at the non-working solution you get, read through it, and think "Oh, I didn't know about that framework/system/product/library, that's neat" and then do some combination of further research and more hand-holding to get to something that does work.
This is useful, more or less, no matter what your level.
It's also good for explaining core industry tooling you've maybe never used before. If you're new to Postgres/NoSQL/AWS/Docker/SwiftUI/whatever it can talk you through it and give you an instant bootcamp with entry-level examples and decent solutions.
And for providing fixes for bugs and issues in products that are widely known, just not to you (yet).
IME ChatGPT5 is pretty solid with most science/tech up to undergrad. It gets hallucinatory past that, and it's still flattering, which is annoying, but you can tell it to cut that out.
Generally you can use it as a dumb offshore developer, or as an infinitely patient private tutor.
That latter option is very useful. The first, not always.
Spelling out exactly what you want and checking/fixing what you receive is still faster than typing out the code. Moreover, nobody's job involves nothing but brainiac coding, day after day. You have to clean up and lay foundations, whatever level you are at.
For me, that's too general. Perhaps for this particular, specific problem it's true. But as this thread points out, for anything niche, AI fails to help productively. And then of course comes the marketing: just wait, AI will be able to cover those niche cases too.
> Spelling out exactly what you want and checking/fixing what you receive is still faster than typing out the code
Then I do wonder why there are developers at all. After all, that's what AI is supposedly so good at - if one believes the marketing - being precise and describing exactly what needs to be done. Surely it must be faster to have two AIs talking to each other and hammering out the code.
And even typing is subjective: ten fingers versus two, versus four, etc. There are developers who can type faster than they can think - in certain cases.
There is also the developer in flow versus the stop-and-go of prompting an AI to get it just right. I dunno; if it comes true, then thankfully there won't be any humans left to create bugs in code, but somehow I can't see it happening.
You're not necessarily wrong, but I think it's worth noting that very few developers only ever code deep in the one domain they're good at. There are just too many things to be deeply good at everything. For example, it's common that infra and CI tasks are things most developers haven't learned by heart, because you don't touch them very often.
Claude shines here. I've made a lot more useful GitHub Actions jobs recently: while I could automate something myself, if I know I'll have to look up API docs (especially for multiple APIs I'm not super familiar with), I tend to figure the automation will lose the time trade-off against just doing the task by hand (see https://xkcd.com/1205/). Claude being able to hash those out rapidly, in a way that makes it easy to verify it's doing the right thing, has changed that arithmetic for me substantially.
1. Find out how to access metadata about the node running my code (assumption: some kind of environment variable) [1-10 minutes depending on familiarity with AWS]
2. Google "RDS certificates" and find the bundle URL after skimming the page [1] for important info [1-5 minutes]
3. Write code to download the certificate bundle, falling back to "global-bundle.pem" if step 1 failed for some reason [5-20 minutes depending on all the bells and whistles you need]; a rough sketch of these steps follows below
Did I miss anything or completely misunderstand the task?
[1] https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Using...
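For what it's worth, here is a minimal sketch of those three steps in TypeScript (Node 18+, so `fetch` is global). The function name is made up for illustration, the region is taken from the standard AWS_REGION/AWS_DEFAULT_REGION environment variables rather than the instance metadata service, and the truststore URL pattern is assumed from the AWS docs page in [1]:

```typescript
// Hypothetical sketch, not the generated code. Assumes the bundle URL
// pattern documented on the AWS page linked in [1].
import { writeFile } from "node:fs/promises";

const TRUSTSTORE = "https://truststore.pki.rds.amazonaws.com";

export async function downloadRdsCaBundle(destPath: string): Promise<void> {
  // Step 1: node metadata. AWS_REGION is set in most managed environments;
  // querying IMDS would be the more robust (and more verbose) alternative.
  const region = process.env.AWS_REGION ?? process.env.AWS_DEFAULT_REGION;

  // Step 2: construct the URL dynamically; fall back to the global bundle
  // when the region is unknown, per step 3 above.
  const url = region
    ? `${TRUSTSTORE}/${region}/${region}-bundle.pem`
    : `${TRUSTSTORE}/global/global-bundle.pem`;

  // Step 3: download, validate *before* writing to disk, then save.
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Bundle download failed (${res.status}): ${url}`);
  const pem = await res.text();
  if (!pem.includes("-----BEGIN CERTIFICATE-----")) {
    throw new Error("Downloaded bundle does not look like a PEM file");
  }
  await writeFile(destPath, pem);
}
```

Twenty-odd lines, give or take the bells and whistles.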
edit: I asked Claude Sonnet 4 to write robust code for a Node.js application that downloads the RDS CA bundle for the AWS region the code is currently running in and saves it at the supplied filesystem path.
0. It generated about 250 lines of code
1. Fallback was us-east (not global)
2. The download URLs for each region were hardcoded as KV pairs instead of being constructed dynamically
3. Half of the regions were missing
4. It wrote a function that verifies whether the certificate bundle looks valid (i.e. includes a PEM header)... but only calls it on the next application startup, instead of doing so before saving a potentially invalid certificate bundle to disk and proceeding with the application startup.
5. When I complained that half of my instances are downloading global bundles instead of regional ones (because they're not present in the hardcoded list), it:
- incorrectly concluded that not all regions have CA bundles available, and hardcoded a duplicate list, in two places, of regions known to offer CA bundles (which is all of them). These lists were even shorter than the previous ones.
- wrote a completely unnecessary function that checks whether a regional CA bundle exists with a HEAD request before actually downloading it with a GET request, adding another 50 lines of code (a plain GET with a fallback, sketched below, would have done)
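To illustrate, a hedged sketch of that "plain GET with a fallback": no HEAD probe, no hardcoded region list, just try the regional URL and fall back to the global bundle if the download fails. (Function name is made up; the URL pattern is again assumed from the AWS docs page in [1].)

```typescript
// Hypothetical replacement for the HEAD-probe-plus-hardcoded-list approach:
// attempt the regional GET directly and fall back to the global bundle.
async function fetchCaBundle(region: string): Promise<string> {
  const base = "https://truststore.pki.rds.amazonaws.com";

  const regional = await fetch(`${base}/${region}/${region}-bundle.pem`);
  if (regional.ok) return regional.text();

  const global = await fetch(`${base}/global/global-bundle.pem`);
  if (!global.ok) {
    throw new Error(`Both downloads failed (${regional.status}, ${global.status})`);
  }
  return global.text();
}
```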
Now I'm having to scrutinize 300 lines of code to make sure it's not doing something even more unexpected.
If a business needs the equivalent of a Toyota Corolla, why be upset about the factory workers making the millionth Toyota Corolla?
In my experience, that's not entirely true. Sure, a lot of apps are CRUD apps, but they are not all the same. The spice lies in the business logic, not in programming the CRUD operations. And then of course there's scaling, performance, security, organization, etc.
(edit: /s to indicate sarcasm)
It’ll successfully produce _something_ like that, because there are millions of examples of those technologies online. If you do anything remotely niche, you need to hold its hand far more.
The more complicated your requirements are, the closer you are to having “spicy autocomplete”. If you’re just making a CRUD React app, you can talk in high-level natural language.