Charitably, I don’t understand what those like you mean by the “whole facade,” or why you reach for old machine-learning metrics like “accuracy rate” to assess what’s going on. “Facade” implies that the unprecedented, still-exponential organic uptake of GPT (again, see the actual data I linked earlier from Mary Meeker) is just a hype-generated fad, rather than people finding it genuinely useful for their own ends. Indeed, the main problem with the “facade” argument is that it, not any hyperbolic pro-AI “hype,” is what actually dominates media coverage (Marcus et al.).
This “80-20” framing, moreover, implies we’re just trying to asymptotically optimize a classification model or some information retrieval system… If you’ve worked with LLMs daily on hard problems (non-trivial programming and scholarly research, for example), the progress over even just the last year is phenomenal; and even with current models, I find most problems arise from failures of context management and from poor integration of LLMs with IR systems.