The system prompts I've seen are absolutely massive.
The n² time complexity smells like it could be reduced with some algorithm engineering. Maybe a preprocessing pass could filter out tokens that don't contribute significantly to the meaning of the input, so the model never attends to them at all (not sure what the right term of art is here). Basically some sort of context compression mechanism, as in the sketch below.
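Purely for illustration, here's a minimal sketch of that idea: do one cheap O(n·d) scoring pass, drop the lowest-scoring tokens, and only then pay the O(n²) attention cost on the shorter sequence. Everything here is invented for the example: `prune_context`, the mean-embedding salience heuristic, and `keep_ratio` are not any established method, and a real mechanism would need a far smarter importance score.

```python
import numpy as np

def prune_context(embeddings: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Cheap O(n*d) preprocessing pass: keep only the tokens that score
    highest against a mean-pooled summary of the sequence, shrinking n
    before the O(n^2) attention step. Returns kept indices in order."""
    summary = embeddings.mean(axis=0)        # crude "what is this text about" vector
    scores = embeddings @ summary            # per-token salience, O(n*d)
    k = max(1, int(round(len(scores) * keep_ratio)))
    keep = np.argsort(scores)[-k:]           # indices of the top-k tokens
    return np.sort(keep)                     # restore original token order

# Toy usage: 1000 tokens with 64-dim embeddings, keep the top half.
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 64))
kept = prune_context(emb, keep_ratio=0.5)
print(f"attention now runs over {len(kept)}^2 pairs instead of {emb.shape[0]}^2")
```

Halving n cuts the quadratic attention work by roughly 4x, which is the whole appeal; the hard part is deciding which tokens are actually safe to drop.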
Maybe there is none, and this is just one example of a fundamental LLM limitation.
I think it's much more interesting to focus on use cases that don't require that: ones where gen AI is an intermediate step, a creator of input (whether for humans or for other programs).
Edit: Come to think of it, training on a Q&A format is probably better: "Is there a seahorse emoji? No, there isn't." Something like the toy pairs below.
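For illustration only, such pairs might look like this; the schema and field names are made up here and match no particular fine-tuning pipeline:

```python
# Hypothetical Q&A pairs for fine-tuning; the structure is invented
# for this example, not taken from any real training dataset.
qa_pairs = [
    {"question": "Is there a seahorse emoji?",
     "answer": "No, there isn't."},      # teach a direct negative answer
    {"question": "Is there a dolphin emoji?",
     "answer": "Yes: \N{DOLPHIN}"},      # U+1F42C, an emoji that does exist
]

for pair in qa_pairs:
    print(f"Q: {pair['question']}\nA: {pair['answer']}\n")
```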
If I see an example of an LLM saying dumb stuff posted here, I know it's going to be fixed quickly. If I encounter an example myself and refuse to share it, it may be fixed by a model upgrade in a few years, or it may never be fixed at all.
Before coming up with a solution, I think you’d need to understand the problem much more deeply.
For the time being, this issue can be mitigated by not asking about the seahorse emoji.
"We are closing this support ticket as the issue is an inherent limitation of the underlying technology and not a bug in our specific implementation."