> I’m waiting for the dust to settle and the independent benchmarks (that are more credible than my ridiculous pelicans) to roll out, but I think it’s likely that OpenAI now offer the best available open weights models.
You told me off for jumping to conclusions and in the same comment quoted me saying "I think OpenAI may have taken" - that's not a conclusion, it's tentative speculation.
You know you have a significant audience, so don't act like you don't know what you're doing when you chose to say "TLDR: I think OpenAI may have taken the medal for best available open weight model back from the Chinese AI labs" and then defend what I was calling out based on word choices like "conclusions" (I'm sure you have read conclusions in academic journals?), "I think", and "speculation".
I try to be careful about my choice of words, even in forum comments.
IMO, the "I think..." bit could be ambiguous and read as, "In my opinion, OpenAI may have...".
I agree with you that it's not a hard/bold endorsement, but perhaps leading with the disclaimer that you're reserving final judgement could assuage these concerns.
That's just straight up not the case. Not sure how you can jump to that conclusion, not least when you also stated in your post that you haven't tested tool calling.
Many people in the community are finding it substantially lobotomized, to the point that there are "safe" memes everywhere now. Maybe you need to develop better tests and pay more attention to benchmaxxing.
There are good things that came out of these releases from OpenAI, but we'd appreciate more objective analyses...