Preferences

Oh nothing official. There are people who estimate the sizes based on tok/s, cost, benchmarks etc. The one that most go on is https://lifearchitect.substack.com/p/the-memo-special-editio.... This guy estimated Claude 3 opus to be 2T param model (given the pricing + speed). Opus 4 is 1.2T param according to him (but then I dont understand why the price remained the same.). Sonnet is estimated by various people to be around 100B-200B params.

[1]: https://docs.google.com/spreadsheets/d/1kc262HZSMAWI6FVsh0zJ...


If you're using the api cost of the model to estimate it's size, then you can't use this size estimate to estimate the inference cost.
tok/s cannot in any way be used to estimate parameters. It's a tradeoff made at inference time. You can adjust your batch size to serve 1 user at a huge tok/s or many users at a slow tok/s.

This item has no comments currently.

Keyboard Shortcuts

Story Lists

j
Next story
k
Previous story
Shift+j
Last story
Shift+k
First story
o Enter
Go to story URL
c
Go to comments
u
Go to author

Navigation

Shift+t
Go to top stories
Shift+n
Go to new stories
Shift+b
Go to best stories
Shift+a
Go to Ask HN
Shift+s
Go to Show HN

Miscellaneous

?
Show this modal