
Well, they seem to benchmark better only when the model is given "parallel test-time compute", which AFAIU is just reasoning enabled? Whereas the GPT5 numbers don't say whether any reasoning mode was enabled.
