Essentially, pattern matching can outperform humans at many tasks, just as computers and calculators already do.
So it is not that LLMs can't be better at tasks; rather, they have specific limits that are hard to discern. Pattern matching over the entire world of data is an opaque tool: we cannot easily perceive where the walls are, or where it falls completely off the rails.
Since it is not true intelligence, but at times a good mimic of it, we will continue to struggle with unexpected failures, because the model has no genuine understanding of the task it is given.
"the fact that its hallucinations are quite sophisticated to me means that they are errors humans also could reach"
That is the fallacy of affirming the consequent: humans produce sophisticated errors, and these hallucinations are sophisticated errors, but it does not follow that they are errors humans would also reach.
I am not saying that LLMs are better than your analysis suggests, but rather that average humans are worse. (Well-trained humans working alone will continue to outperform LLMs alone for some time. But compare an LLM to an average 18-year-old.)