Comment by h4ny - Hacker Neue

h4ny Aug 6, 2025 parent

> TLDR: I think OpenAI may have taken the medal for best available open weight model back from the Chinese AI labs.

That's just straight up not the case. I'm not sure how you can jump to that conclusion, not least because you stated in your post that you haven't tested tool calling.

Many people in the community are finding it substantially lobotomized, to the point that there are "safe" memes everywhere now. Maybe you need to develop better tests and pay more attention to benchmaxxing.

There are good things that came out of this release from OpenAI, but we'd appreciate more objective analyses...

simonw Aug 6, 2025

If you read my full post, it ends with this:

> I’m waiting for the dust to settle and the independent benchmarks (that are more credible than my ridiculous pelicans) to roll out, but I think it’s likely that OpenAI now offer the best available open weights models.

You told me off for jumping to conclusions and in the same comment quoted me saying "I think OpenAI may have taken" - that's not a conclusion, it's tentative speculation.

h4ny OP Aug 6, 2025

I did read that, and it doesn't change what I said about your comment on HN. I was calling out the fact that you are making a very bold statement without having done careful analysis.

You know you have a significant audience, so don't act like you don't know what you're doing when you choose to say "TLDR: I think OpenAI may have taken the medal for best available open weight model back from the Chinese AI labs" and then defend what I was calling out based on word choices like "conclusions" (I'm sure you have read conclusions in academic journals?), "I think", and "speculation".

simonw Aug 6, 2025

I'm going to double down on "I think OpenAI may have taken the medal..." not being a "bold statement".

I try to be careful about my choice of words, even in forum comments.

bavell Aug 6, 2025

> I think OpenAI may have taken the medal for best available open weight model back from the Chinese AI labs.

IMO, the "I think..." bit could be ambiguous and read as, "In my opinion, OpenAI may have...".

I agree with you it's not a hard/bold endorsement but perhaps leading with the disclaimer that you're reserving final judgement could assuage these concerns.
