lysecret
Haven’t had much luck with o3. One thing that comes to mind with these test-time compute models is that they tend to “overthink” and “overcomplicate” things. This is just a feeling for now, but has anyone studied it? E.g., do these types of models perform worse on simpler questions?