If left to its own devices, Claude will pretty quickly resort to writing passable-looking BS as tests when the going gets tough, e.g. if it has to interface with stateful real-world systems and its tests struggle to pass.

This is my experience as well. If you want it to write good tests, you have to take a much more involved approach: first make it establish what needs testing in each module, then have it write the tests one at a time, and for each test make it prove the test can fail by modifying the source code to introduce a bug. If the test doesn't catch the bug, have it fix the test, then rinse and repeat. I haven't done this much because it's very expensive in terms of time and premium tokens... Right now, I just write most tests myself so at least I have faith in the verification suite.
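A rough sketch of that "prove the test can break" step, in Python. Everything here is made up for illustration (`clamp`, `clamp_mutant`, `weak_test`, `strong_test`, `kills_mutant`); it's just the basic idea behind mutation testing, not any particular tool's workflow.

```python
from typing import Callable

ClampFn = Callable[[int, int, int], int]


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical code under test: restrict value to [low, high]."""
    return max(low, min(value, high))


def clamp_mutant(value: int, low: int, high: int) -> int:
    """Deliberately broken copy: the upper bound is ignored."""
    return max(low, value)


def weak_test(fn: ClampFn) -> bool:
    """Only checks a value that is already in range, so it cannot see the bug."""
    return fn(5, 0, 10) == 5


def strong_test(fn: ClampFn) -> bool:
    """Checks the in-range case and both boundaries."""
    return fn(5, 0, 10) == 5 and fn(-3, 0, 10) == 0 and fn(42, 0, 10) == 10


def kills_mutant(test: Callable[[ClampFn], bool],
                 original: ClampFn,
                 mutant: ClampFn) -> bool:
    """A test earns its keep if it passes on the original and fails on the mutant."""
    return test(original) and not test(mutant)


if __name__ == "__main__":
    print("weak test catches the bug:", kills_mutant(weak_test, clamp, clamp_mutant))
    print("strong test catches the bug:", kills_mutant(strong_test, clamp, clamp_mutant))
```

The weak test passes on both versions, which is exactly the "passable looking" kind of test you want to flush out. The strong one only passes on the real implementation.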
Claude is like an intern: someone has to code review and approve before final delivery, IMHO.
While I agree with the sentiment, you are being too generous. Claude is like a new intern every 15 to 30 minutes, or however long it takes to fill the context. It says "Oh, I know what the issue is, let me delete this code for now" and proceeds to nuke the fix it spent the last context window making. No intern can be that malicious.
