Preferences


no not that one. the first icm paper:

https://pathak22.github.io/noreward-rl/

and the followup which address the noise impredictability problem.

there are more after that which i believe fail the black pill and miss the point of ml, asicifying the architecture with human priors. But the broader point is to show that rl is not just discovering solutions by chance in random actions. Nature starts with priors, and curiosity is one of the universal policy bootstrapping techniques. (others might be imitation, next state prediction, total nearby replication count)

There is also a paper that deployed ICM on a physical robot and it just played with a ball because it was the only source of novel stimuli, and inadvertantly learned how to operate its arms. There was no other reward in the environment except for curiosity. It is amazing, and slightly creepy. I think the ICM will be rediscovered later in ML tech.

This item has no comments currently.

Keyboard Shortcuts

Story Lists

j
Next story
k
Previous story
Shift+j
Last story
Shift+k
First story
o Enter
Go to story URL
c
Go to comments
u
Go to author

Navigation

Shift+t
Go to top stories
Shift+n
Go to new stories
Shift+b
Go to best stories
Shift+a
Go to Ask HN
Shift+s
Go to Show HN

Miscellaneous

?
Show this modal