This is not the way

these agents are not up to the task of writing production level code at any meaningful scale

looking forward to high paying gigs to go in and clean up after people take them too far and the hype cycle fades

---

I recommend the opposite: work on custom agents so you have a better understanding of how these things work and fail. Get deep in the code to understand how context and values flow and get presented within the system.


> these agents are not up to the task of writing production level code at any meaningful scale

This is obviously not true, starting with the AI companies themselves.

It's like the old saying: "half of all advertising doesn't work; we just don't know which half that is." Some organizations are having great results, while some are not. On multiple dev podcasts I've listened to, AI skeptics have had a lightbulb moment where they get that AI is where everything is headed.

Not a skeptic. I use AI for coding daily and am working on a custom agent setup because, in my experience over more than a year, they are not up to hard tasks.

This is well known I thought, as even the people who build the AIs we use talk about this and acknowledge their limitations.

I'm pretty sure at this point more than half of Anthropic's new production code is LLM-written. That seems incompatible with "these agents are not up to the task of writing production level code at any meaningful scale".

How are you pretty sure? What are you basing that on?

If true, could this explain why Anthropic's APIs are less reliable than Gemini's? (I've never gotten a service overloaded response from Google like I did from Anthropic.)

Quoting a month-old post: https://www.lesswrong.com/posts/prSnGGAgfWtZexYLp/is-90-of-c...

  My current understanding (based on this text and other sources) is:
  - There exist some teams at Anthropic where around 90% of lines of code that get merged are written by AI, but this is a minority of teams.
  - The average over all of Anthropic for lines of merged code written by AI is much less than 90%, more like 50%.

> I've never gotten a service overloaded response from Google like I did from Anthropic

They're Google, they out-scale everyone. They run more than 1.3 quadrillion tokens per month through LLMs!

You cannot clean up the code, it is too verbose. That said, you can produce production ready code with AI, you just need to put up very strong boundaries and not let it get too creative.

Also, the quality of production ready code is often highly exaggerated.

I have AI generated, production quality code running, but it was isolated, not at scale or broad in view / spanning many files or systems

What I mean more is that as soon as the task becomes even moderately sized, these things fail hard

> these agents are not up to the task of writing production level code at any meaningful scale

I think the new one is. I could be the fool and be proven wrong though.

It's marginally better, nowhere close to game changing, which I agree will require moving beyond transformers to something we don't know yet
