It's exactly like lexers for compilers. This parsing strategy, coupled with the decision to map the results into an embedding space of arbitrary dimensionality, is why these models don't work and cannot be said to understand language. They cannot reliably handle fundamental aspects of meaning. They aren't equipped for it.
They're pretty good at coming up with well-formed sentences of English, though. They ought to be, given the excessive amounts of data they've seen.
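To make the lexer comparison concrete, here is a minimal sketch of what such a pipeline looks like: a greedy longest-match subword tokenizer (in the spirit of BPE/WordPiece inference, much like a lexer's maximal-munch rule) followed by a lookup into an embedding table. The vocabulary and the embedding dimensionality here are entirely made up for illustration.

```python
import random

# Hypothetical subword vocabulary; real models learn tens of thousands of pieces.
VOCAB = ["un", "believ", "able", "token", "ization"]

def tokenize(word: str) -> list[str]:
    """Greedily split a word into the longest matching vocabulary pieces,
    left to right -- the same maximal-munch strategy a lexer uses."""
    pieces = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest candidate first
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown character: fall back to one char
            i += 1
    return pieces

# Each piece is then mapped to a vector in an embedding space whose
# dimensionality (4 here) is an arbitrary model hyperparameter.
random.seed(0)
EMBED_DIM = 4
embedding = {tok: [random.gauss(0, 1) for _ in range(EMBED_DIM)] for tok in VOCAB}

tokens = tokenize("unbelievable")
print(tokens)  # ['un', 'believ', 'able']
vectors = [embedding[t] for t in tokens]
```

Note that nothing in this pipeline consults meaning: "unbelievable" is carved into pieces by string matching alone, and the vectors those pieces map to are just coordinates the model adjusts during training.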