Preferences

CS is full of trivial examples of this. You can use an optimized parallel SIMD merge sort to sort a huge list of ten trillion records, or you can sort it just as fast with a bubble sort if you throw more hardware at it.

The real bitter lesson in AI is that we don't really know what we're doing. We're hacking on models looking for architectures that train well but we don't fully understand why they work. Because we don't fully understand it, we can't design anything optimal or know how good a solution can possibly get.


xg15
> You can use an optimized parallel SIMD merge sort to sort a huge list of ten trillion records, or you can sort it just as fast with a bubble sort if you throw more hardware at it.

Well, technically, that's not true: The entire idea behind complexity theory is that there are some tasks that you can't throw more hardware at - at least not for interesting problem sizes or remotely feasible amounts of hardware.

I wonder if we'll reach a similar situation in AI where "throw more context/layers/training data at the problem" won't help anymore and people will be forced to care more about understanding again.

svachalek
I think it can be argued that ChatGPT 4.5 was that situation.
jimbokun
And whether that understanding will be done by humans or the AIs themselves.
dan-robertson
Do you have a good reference for SIMD merge sort? The only examples I found pairwise-merge large numbers of streams, but it seems pretty hard to optimise the late steps where you only have a few streams. I guess you can do some binary-search-in-binary-search to change a merge of 2 similarly sized arrays into two independent merges of similarly sized arrays with sequential outputs, and so on.

More precisely, I think producing a good, fast merge of ca. 5 lists was a problem I didn’t have good answers for, but maybe I was too fixated on a streaming solution and didn’t apply enough tricks.
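The binary-search split described above can be sketched roughly like this (a minimal, non-SIMD illustration in Python; the function name and use of the stdlib `heapq.merge` for the sub-merges are my own choices, not anything from the thread): pick the median of one array, binary-search it in the other, and the two left halves and two right halves can then be merged independently, e.g. in parallel.

```python
import bisect
from heapq import merge  # stdlib streaming merge of sorted inputs

def split_merge(a, b):
    """Split a merge of two sorted lists into two independent sub-merges.

    Pick the midpoint of `a`, binary-search its value in `b`; every
    element of a[:i] and b[:j] is <= every element of a[i:] and b[j:],
    so the two halves can be merged separately (even concurrently) and
    simply concatenated.
    """
    i = len(a) // 2
    j = bisect.bisect_left(b, a[i])
    left = list(merge(a[:i], b[:j]))
    right = list(merge(a[i:], b[j:]))
    return left + right

# Example with two similarly sized sorted lists:
print(split_merge([1, 4, 6, 8], [2, 3, 5, 7, 9]))
# [1, 2, 3, 4, 5, 6, 7, 9, ...] -> the fully merged sorted list
```

Applying the same split recursively to each half is the "and so on": a merge of two large arrays becomes many small merges with sequential outputs, which is the shape a parallel or SIMD implementation wants.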
