Any model can produce JSON that exactly matches a schema if you mask out the logits of tokens that would break conformance.
I imagine that validation as you go could slow things down though.
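The idea above can be sketched in a few lines. This is a toy illustration, not a real inference stack: the grammar, vocabulary, and `fake_logits` stand-in are all hypothetical, and a real implementation would apply the mask inside the model's sampling loop over its actual tokenizer vocabulary.

```python
import random

# Grammar for the schema {"a": <digit>} as a state machine:
# state -> {allowed token: next state}.
GRAMMAR = {
    'start': {'{': 'key'},
    'key':   {'"a"': 'colon'},
    'colon': {':': 'value'},
    'value': {str(d): 'end' for d in range(10)},
    'end':   {'}': 'done'},
}
VOCAB = ['{', '}', '"a"', ':', ','] + [str(d) for d in range(10)]

def fake_logits():
    # Stand-in for a model forward pass: a score per vocab token.
    return {tok: random.random() for tok in VOCAB}

def constrained_decode():
    state, out = 'start', []
    while state != 'done':
        logits = fake_logits()
        allowed = GRAMMAR[state]
        # Discard non-conforming tokens, then greedily pick the
        # highest-scoring survivor.
        tok = max(allowed, key=lambda t: logits[t])
        out.append(tok)
        state = allowed[tok]
    return ''.join(out)

print(constrained_decode())  # e.g. {"a":7}
```

The per-step cost is the mask lookup, which is why the slowdown concern is mostly about computing which tokens are allowed, not about the masking itself.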
svachalek
The technical term is constrained decoding. OpenAI has had this for almost a year now. They say doing it efficiently requires generating some artifacts up front, which slows down the first response, but those artifacts can be cached.
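The cached-artifact idea can be sketched as precompiling the schema's grammar into per-state boolean token masks once, then reusing them across requests. Everything here is illustrative: the grammar, vocabulary, and cache key are made up, and real systems compile against the tokenizer's full vocabulary, which is the expensive step.

```python
from functools import lru_cache

VOCAB = ['{', '}', '"a"', ':', ','] + [str(d) for d in range(10)]

@lru_cache(maxsize=None)
def compile_masks(schema_key: str):
    # Expensive in real systems (walking the whole tokenizer vocab
    # for every grammar state); done once per schema, then cached.
    grammar = {
        'start': {'{'}, 'key': {'"a"'}, 'colon': {':'},
        'value': {str(d) for d in range(10)}, 'end': {'}'},
    }
    return {state: [tok in allowed for tok in VOCAB]
            for state, allowed in grammar.items()}

masks = compile_masks('schema-v1')   # first call: compiles
masks2 = compile_masks('schema-v1')  # later calls: cache hit
```

This is why only the first response for a given schema pays the compilation cost.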
ethbr1
I expect this is a problem pattern we'll see a lot with LLMs.
Do I look at whether the data format is easily output by my target LLM?
Or do I just validate, then clamp/discard non-conforming output?