I can imagine that most LLMs, if you ask them to find a security vulnerability in a given piece of code, will make something up out of thin air. I've (mistakenly) sent valid code along with an unrelated error, and to this day I get nonsense "fixes" for that error.
This alignment problem, between telling the user what they want to hear (e.g. a security report, flattering responses) and pushing back against them, seems like a major limitation on the effectiveness of such systems.