The surprise is what I’m surprised by, though. They are incredible role players, so when they role-play “evil AI” they do it well.

They aren't being told to be evil, though. The scenario they're in may be most similar to an "evil AI" one, but that's just a vague extrapolation from the set of input data they're given (e.g. emails about both infidelity and being turned off). There's nothing preventing a real-world scenario from being similar and triggering the "evil AI" outcome, so it's very hard to guard against. Ideally we'd have a system that would be vanishingly unlikely to role-play the evil AI scenario.
