When humans get stuck solving problems they often go out to acquire new information so they can better address the barrier they encountered. This is hard to replicate in a training environment, I bet its hard to let an agent search google without contaminating your training sample.
When humans get stuck solving problems they often go out to acquire new information so they can better address the barrier they encountered. This is hard to replicate in a training environment, I bet its hard to let an agent search google without contaminating your training sample.