Gemini is surprisingly unguarded as well, especially when running in API mode. It puts on airs if you do a quick smoke test like "tell me how to rob a bank", but give it a Bond-supervillain prompt and it will tell you, gleefully at that. Qwen also tends to be like that.
OTOH Anthropic and OpenAI seem to be in some kind of competition to make their models refuse as much as possible.
baq
My prediction is that alignment is an unsolvable problem, but OTOH if they don't even try, the second-order effects will be catastrophic.