> In the meantime, the LLMs by major providers get smarter every day.
Are they though? Or are they just getting better at gaming benchmarks?
Subjectively, there has been modest progress in the past year, but I'm curious to hear other anecdotes from people who aren't firmly invested in the hype.
If you have used Sonnet 3.5, 3.7, and 4 over the last few months, you know how much the model has improved. With the latest Sonnet, I'm handling tasks 3-5x more complex than what was possible with the earlier versions.

They are getting much, much better.
Specialized, fine-tuned models sit somewhere in between LLMs and traditional procedural code. The fine-tuning process takes time and carries risk if it goes wrong. In the meantime, the LLMs from major providers get smarter every day.
Sure, latency and cost are real concerns. But unless you have a very specific task performed at huge scale, you might be better off using an off-the-shelf LLM.