thom parent
Yeah, I've started to think AI smoke tests for cognitive complexity should be a fundamental part of API/schema design now. Even if you think the LLMs are dumb, Stupidity as a Service is genuinely useful.
Is this you have implemented in practice? Sounds like a great idea, but I have no idea how you would make it work it a structured way (or am I missing the point…?)
Can be easy depending on your setup - you can basically just write high level functional tests matching use cases of your API, but as prompts to a system with some sort of tool access, ideally MCP. You want to see those tests pass, but you want them to pass with the simplest possible prompt (a sort of regularization penalty, if you like). You can mutate the prompts using an LLM if you like to try different/shorter phrasings. The Pareto front of passing tests and prompt size/complexity is (arguably) how good a job you're doing structuring/documenting your API.