"omni" announced (multimodal fusion, initial promise of gpt-4o, but cost effectively distilled down with additional multimodal aspects)
gpt-4o-mini -> gpt-4o (multimodal, realtime)
gpt-4o + "reasoning" exposed via tools in ChatGPT (you can see it in export formats) -> "o" series
o1 -> o1 premium / o1-mini (equivalent of gpt-4 "god model" becoming basis for lots of other stuff)
o1-pro-mode, o1-premium, o1-mini; somewhere in there is the "o1-2024-12-17" model, which shipped without streaming but with function calling, structured outputs, and vision
now, distilled o1-pro-mode is probably o3-mini and o3-mini-high (the naming is becoming just as bad as Android's)
it's the same loop: take a model, scale it up, run evals, detect inefficiencies, retrain, scale, distill, see what's not working. When you find a good little zone on the efficiency frontier, release it with a cool name.
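That train-big-then-distill-small loop can be sketched at toy scale. This is a generic logit-distillation sketch in pure Python, purely illustrative of the technique; nothing here reflects any published OpenAI training code, and all names (`distill_step`, the learning rate, the temperature) are made up for the example:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T softens the teacher's distribution.
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_step(teacher_logits, student_logits, lr=0.5, temperature=2.0):
    # One gradient step nudging the student's distribution toward the teacher's.
    # For cross-entropy(teacher_probs, softmax(student_logits)), the gradient
    # w.r.t. the student logits is (student_probs - teacher_probs).
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return [sl - lr * (sp - tp) for sl, sp, tp in zip(student_logits, s, t)]

# Toy run: a "big" teacher's soft preferences get compressed into the student.
teacher = [4.0, 1.0, 0.5]
student = [0.0, 0.0, 0.0]  # starts with no preference at all
for _ in range(200):
    student = distill_step(teacher, student)

# After training, the student's top choice matches the teacher's.
print(max(range(3), key=lambda i: student[i]))  # -> 0
```

Real distillation trains on soft targets across huge corpora, but the mechanics are the same: the small model imitates the big model's output distribution rather than the raw data, which is why the "mini" variants keep most of the quality at a fraction of the cost.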
anticensor
No, o3-mini is a distillation of (not-yet-released) o3, not a distillation of o1.
arthurcolleOP
o1-"pro mode" could just be o3
anticensor
It's not that either; benchmarks list the two as separate models.