> TLDR: I think OpenAI may have taken the medal for best available open weight model back from the Chinese AI labs.

That's just straight up not the case. Not sure how you can jump to that conclusion, not least when you stated in your post that you haven't tested tool calling.

Many people in the community are finding it substantially lobotomized to the point that there are "safe" memes everywhere now. Maybe you need to develop better tests and pay more attention to benchmaxxing.

There are good things that came out of this release from OpenAI, but we'd appreciate more objective analyses...


If you read my full post, it ends with this:

> I’m waiting for the dust to settle and the independent benchmarks (that are more credible than my ridiculous pelicans) to roll out, but I think it’s likely that OpenAI now offer the best available open weights models.

You told me off for jumping to conclusions and in the same comment quoted me saying "I think OpenAI may have taken" - that's not a conclusion, it's tentative speculation.

I did read that, and it doesn't change what I said about your comment on HN. I was calling out the fact that you are making a very bold statement without having done careful analysis.

You know you have a significant audience, so don't act like you don't know what you're doing when you chose to say "TLDR: I think OpenAI may have taken the medal for best available open weight model back from the Chinese AI labs" then defend what I was calling out based on word choices like "conclusions" (I'm sure you have read conclusions in academic journals?), "I think", and "speculation".

I'm going to double down on "I think OpenAI may have taken the medal..." not being a "bold statement".

I try to be careful about my choice of words, even in forum comments.

> I think OpenAI may have taken the medal for best available open weight model back from the Chinese AI labs.

IMO, the "I think..." bit could be ambiguous and read as, "In my opinion, OpenAI may have...".

I agree with you it's not a hard/bold endorsement but perhaps leading with the disclaimer that you're reserving final judgement could assuage these concerns.
