"minimal" is a bit weird.
> Matches the “no thinking” setting for most queries. The model may think very minimally for complex coding tasks. Minimizes latency for chat or high throughput applications.
I'd prefer a hard "no thinking" rule to what this is.
> They went too far, now the Flash model is competing with their Pro version
Wasn't this the case with the 2.5 Flash models too? I remember being very confused at that time.
This is similar to how Anthropic has treated Sonnet/Opus as well, at least pre-Opus 4.5.
To me it seems like the big model has been "look what we can do", and the smaller model is "actually use this one though".
I'm not sure how I'm going to live with this!
Also, I don't see it mentioned in the blog post, but Flash supports more granular reasoning settings: minimal, low, medium, and high (like OpenAI's models), while Pro only supports low and high.
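If that's right, the per-model differences could be validated client-side before sending a request. A minimal sketch — the model IDs, the `thinkingLevel` field name, and the request-body shape are assumptions based on this thread and the general Gemini REST style, not checked against the official docs:

```python
# Hypothetical mapping of model -> supported thinking levels,
# per the comment above (an assumption, not from Google's docs).
MODEL_LEVELS = {
    "gemini-3-flash": {"minimal", "low", "medium", "high"},
    "gemini-3-pro": {"low", "high"},
}

def build_request(model: str, prompt: str, thinking_level: str) -> dict:
    """Build a generateContent-style request body, rejecting
    thinking levels the given model doesn't support."""
    allowed = MODEL_LEVELS.get(model)
    if allowed is not None and thinking_level not in allowed:
        raise ValueError(
            f"{model} does not support thinking_level={thinking_level!r}"
        )
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        # Field name "thinkingLevel" is assumed here.
        "generationConfig": {"thinkingConfig": {"thinkingLevel": thinking_level}},
    }
```

So `build_request("gemini-3-flash", "hi", "minimal")` succeeds, while asking Pro for "minimal" or "medium" raises before any API call is made.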