Not sure why people treat MCP like it's much more than smashing tool descriptions together and concatenating them to the prompt, but here we are.
It is nice to have a standard definition of tools that models can be trained/fine tuned for, though.
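To make that concrete: a tool in MCP is essentially a name, a description, and a JSON Schema for its inputs. Here's a minimal sketch of what "smashing tool descriptions together and concatenating to the prompt" can look like; the tool shape mirrors MCP's fields, but the tool itself and the prompt rendering are purely illustrative (real clients each render differently, or pass tools to a native tool-calling API):

```python
import json

# A hypothetical tool definition using MCP's name/description/inputSchema shape.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "inputSchema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
]

def render_tools_for_prompt(tools: list[dict]) -> str:
    """Flatten tool definitions into plain text for the system prompt."""
    blocks = []
    for t in tools:
        blocks.append(
            f"Tool: {t['name']}\n"
            f"Description: {t['description']}\n"
            f"Arguments (JSON Schema): {json.dumps(t['inputSchema'])}"
        )
    return "You may call these tools:\n\n" + "\n\n".join(blocks)

print(render_tools_for_prompt(tools))
```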
As an example, anyone who's coded email templates will tell you: it's hard. While the major browsers adopted the W3C specs, email clients (i.e. email renderers) never did; arguably a W3C spec for email HTML never existed in the first place. So something that renders correctly in Gmail looks broken in Yahoo Mail, in Safari on iOS, and so on.
How does it differ from providing a non-MCP REST API?
However, as someone who has tried to use OpenAPI for that in the past (both via OpenAI's "Custom GPT"s and by auto-converting OpenAPI specifications to lists of tools), in my experience almost every existing OpenAPI spec out there is insufficient as a basis for tool calling in one way or another:
- Largely insufficient documentation on the endpoints themselves
- REST is too open to interpretation, and without operationIds (which almost nobody in the wild defines), there is usually context missing on what "action" is being triggered by POST/PUT/DELETE endpoints (e.g. many APIs do a delete of a resource via a POST or PUT, and some APIs use DELETE to archive resources)
- baseUrls are often wrong/broken and assumed to be replaced by the API client
- underdocumented AuthZ/AuthN mechanisms (usually only present in the general description comment on the API, and missing on the individual endpoints)
In practice you often have to remedy that by patching the officially distributed OpenAPI specs until they're good enough as a basis for tool calling, which makes the whole thing not very plug-and-play; the sketch below shows where the conversion falls apart.
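To make those failure modes concrete, here's a rough sketch of what auto-converting an OpenAPI spec into a tool list looks like. The fallback-naming logic is illustrative rather than any particular library's behavior, but it shows why a missing operationId and empty descriptions leave the model with almost nothing to go on:

```python
# Sketch: auto-converting OpenAPI operations into tool definitions.
def openapi_to_tools(spec: dict) -> list[dict]:
    tools = []
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            # Almost nobody defines operationId, so a name must be
            # synthesized from the method and path.
            fallback = f"{method}_{path.strip('/')}".replace("/", "_")
            fallback = fallback.replace("{", "").replace("}", "")
            tools.append({
                "name": op.get("operationId") or fallback,
                # Often empty in the wild, so the model can't tell whether
                # this POST creates, cancels, deletes, or archives anything.
                "description": op.get("description") or op.get("summary") or "",
                "inputSchema": {"type": "object", "properties": {}},  # params elided
            })
    return tools

spec = {"paths": {"/orders/{id}/cancel": {"post": {}}}}  # toy spec with the typical gaps
print(openapi_to_tools(spec))
# -> [{'name': 'post_orders_id_cancel', 'description': '', ...}]
```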
I think the biggest upside that MCP brings, all "content"/"functionality" being equal, is that using it instead of just plain REST acts as a badge that says "we had AI usage in mind when building this".
On top of that, MCP also standardizes mechanisms such as elicitation, which with traditional REST APIs are completely up to the client to implement.
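For anyone unfamiliar: elicitation lets the server pause mid-operation and ask the user for structured input via the client. A rough sketch of what that request can look like on the wire, based on my reading of the spec's elicitation/create method (treat exact field names as approximate):

```python
# Sketch of an MCP elicitation request (server -> client). With a plain
# REST API, this "ask the user a structured question mid-flow" round trip
# is something every client would have to invent for itself.
elicitation_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "elicitation/create",
    "params": {
        "message": "Which GitHub org should the repository be created in?",
        "requestedSchema": {
            "type": "object",
            "properties": {"org": {"type": "string"}},
            "required": ["org"],
        },
    },
}
```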
My gripe is that they had the opportunity to spec out tool use in the models themselves and they did not. The client->LLM side is left up to the implementor, and models differ, each using its own tags like <|python_call|> etc.
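In practice that means every client hand-rolls per-model parsing along these lines; both formats in the sketch are invented stand-ins for the real per-model variation:

```python
import json
import re

def parse_tool_call(model: str, output: str) -> dict | None:
    """Extract a tool call from raw model output, per model.
    Both formats below are made up for illustration."""
    if model == "model-a":
        # special-token style, e.g. <|python_call|>get_weather(city="Oslo")
        m = re.search(r"<\|python_call\|>(\w+)\((.*)\)", output)
        if m:
            return {"name": m.group(1), "raw_args": m.group(2)}
    if model == "model-b":
        # JSON style, e.g. {"tool": "...", "arguments": {...}}
        m = re.search(r"\{.*\}", output, re.DOTALL)
        if m:
            data = json.loads(m.group(0))
            return {"name": data["tool"], "raw_args": data["arguments"]}
    return None

print(parse_tool_call("model-a", '<|python_call|>get_weather(city="Oslo")'))
print(parse_tool_call("model-b", '{"tool": "get_weather", "arguments": {"city": "Oslo"}}'))
```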
I'm with you on the need for an industry-standard Agent -> LLM spec. The APIs are all over the place and it's frustrating. If there were a spec for that, agent development would become focused purely on the business logic, with the LLM and the tools/resources as standardized components you plug together like Lego. I've basically done that for our internal agent development: I have a Universal LLM API that everything uses, and it's helped a lot.
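A sketch of the kind of adapter layer that amounts to; the interface and the vendor adapters here are hypothetical stand-ins, with each real adapter wrapping one vendor's SDK behind a single shape:

```python
from typing import Protocol

class LLM(Protocol):
    """The 'universal' interface every provider adapter implements."""
    def complete(self, messages: list[dict], tools: list[dict]) -> dict: ...

class VendorAAdapter:
    def complete(self, messages: list[dict], tools: list[dict]) -> dict:
        # Translate to vendor A's request shape, call its SDK, then
        # normalize the response back to {"text": ..., "tool_call": ...}.
        return {"text": "stubbed vendor A reply", "tool_call": None}

class VendorBAdapter:
    def complete(self, messages: list[dict], tools: list[dict]) -> dict:
        return {"text": "stubbed vendor B reply", "tool_call": None}

def run_agent(llm: LLM) -> None:
    # Business logic only ever sees the universal interface; swapping
    # vendors is a one-line change where the adapter is constructed.
    reply = llm.complete([{"role": "user", "content": "hi"}], tools=[])
    print(reply["text"])

run_agent(VendorAAdapter())
```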
It has the physical plug, but what can it actually do?
It would be nice to see a standard aiming for better UX than USB-C. (IMHO they should have used colored micro-dots on the device and cable connectors to physically declare capabilities.)
If I am looking at a device/cable, with my eyes, in the physical world, and ask the question "What does this support?", there's no way to tell.
I have to consult documentation and specifications, which may not exist anymore.
So in the case of standards like MCP, I think it's important to come up with answers to these discovery questions, lest we all just accept that nothing can be done and that the clusterfuck 10+ years from now was inevitable.
A good analogy might be imagining how the web would have evolved if we'd had TCP but no HTTP.