- I think the future is likely one that mixes the kitchen-sink style MCP resources with custom skills.
Services can expose an MCP-like layer that provides semantic definitions of everything you can do with said service (API + docs).
Skills can then be built that combine some subset of the 3rd party interfaces, some bespoke code, etc. and then surface these more context-focused skills to the LLM/agent.
Couldn’t we just use APIs?
Yes, but not every API is documented in the same way. An “MCP-like” registry might be the right abstraction for 3rd parties to expose their services in a semantic-first way.
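To make the registry/skill split concrete, here's a minimal sketch under invented names (ServiceRegistry, ServiceCapability, Skill are all hypothetical, not part of MCP or any real SDK): the service publishes every capability with semantic docs, and a skill composes a small subset of them with bespoke glue code.

```python
# Hypothetical sketch: a service exposes an MCP-like registry of its
# endpoints (name + semantic description), and a "skill" surfaces a
# context-focused subset of them to the agent. All names are invented.

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ServiceCapability:
    name: str
    description: str                      # semantic docs the agent can read
    handler: Callable[..., object]


@dataclass
class ServiceRegistry:
    """The kitchen-sink layer: everything the service can do."""
    capabilities: dict = field(default_factory=dict)

    def register(self, cap: ServiceCapability) -> None:
        self.capabilities[cap.name] = cap


@dataclass
class Skill:
    """A context-focused skill built from a subset of capabilities."""
    name: str
    description: str
    steps: list                           # capability names, run in order
    registry: ServiceRegistry

    def run(self, value: object) -> object:
        for step in self.steps:
            value = self.registry.capabilities[step].handler(value)
        return value


# Example: two generic capabilities composed into one skill.
registry = ServiceRegistry()
registry.register(ServiceCapability("fetch", "Fetch a record by id",
                                    lambda i: {"id": i}))
registry.register(ServiceCapability("summarize", "Summarize a record",
                                    lambda r: f"record {r['id']}"))

summarizer = Skill("lookup_summary", "Fetch then summarize",
                   ["fetch", "summarize"], registry)
print(summarizer.run(42))  # -> "record 42"
```

The point of the shape: the registry stays exhaustive (good for discovery), while only the skill's short description and steps need to enter the LLM's context.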
- Is that technically not a new pretrained model?
(Also not sure how that would work, but maybe I’ve missed a paper or two!)
- Not to be the “ai” guy, but LLMs have helped me explore areas of human knowledge that I had postponed otherwise
I am of the age where the internet was pivotal to my education, but the teachers still said "don't trust Wikipedia"
Said another way: I grew up on Google
I think many of us take free access to information for granted
With LLMs, we’ve essentially compressed humanity’s knowledge into a magic mirror
Depending on what you present to the mirror, you get some recombined reflection of the training set out
Is it perfect? No. Does it hallucinate? Yes. Is it useful? Extremely.
As a kid that often struggled with questions he didn’t have the words for, Google was my salvation
It allowed me to search with words I did know, to learn about words I didn’t know
These new words both held answers and opened new questions
LLMs are like Google, but you can ask your exact question (and another)
Are they perfect? No.
The benefit of having expertise in some area is that I can see the limits of the technology.
LLMs are not great for novelty, and sometimes struggle with the state of the art (necessarily so).
Their biggest issue is that when you walk in blindly, LLMs will happily lead the unknowing junior astray.
But so will a blogpost about a new language, a new TS package with a bunch of stars on GitHub, or a new runtime that “simplifies devops”
The biggest tech from the last five years is undoubtedly the magic mirror
Whether it can evolve to Strong AI or not is yet to be seen (and I think unlikely!)
- It’s possible they’re using some new architecture to get more up-to-date data, but I think that’d be even more of a headline.
My hunch is that this is the same 5.1 post-training on a new pretrained base.
Likely rushed out the door faster than they initially expected/planned.
- > “a new knowledge cutoff of August 2025”
This (and the price increase) points to a new pretrained model under-the-hood.
GPT-5.1, in contrast, was allegedly using the same pretraining as GPT-4o.
- 3 points
- I’ve had a similar thought about language, and the evolution or lack thereof with LLMs.
With the printing press+internet, one might argue that we’ve helped cement current languages, making it harder for language to evolve naturally.
(A counterpoint may be slang/memes/etc., which have likely increased the velocity of new words for any given language.)
In either case, one might see LLMs as further cementing language, as it’s the thing the machines understand (until their next training run).
Assuming we struggle to make LLMs that learn in realtime, one might suspect that these amazing new tools might further cement the status quo, meaning less new words than before.
With all that said, I think I’ve come to the conclusion that LLMs will likely speed up the evolution of language.
The hypothesis being that future generations will develop communication that the robots can't read, at least at first.
A never-ending game of cat and mouse; while the cat is on v6, the mouse is on v7. Ad infinitum.
- Claude, the albino alligator, has passed away :(
Though not related to the naming of Anthropic's Claude, he was a staple of their holiday party[0].
[0]https://www.wsj.com/lifestyle/workplace/claude-albino-alliga...
- Oh man this is awesome. Recently integrated xterm.js on a new project and was frustrated with the limitations. Great work!
- > we won’t work on product marketing for AI stuff, from a moral standpoint, but the vast majority of enquiries have been for exactly that
Although there’s a ton of hype in “AI” right now (and most products are over-promising and under-delivering), this seems like a strange hill to die on.
imo LLMs are (currently) good at 3 things:
1. Education
2. Structuring unstructured data
3. Turning natural language into code
From this viewpoint, it seems there is a lot of opportunity to both help new clients as well as create more compelling courses for your students.
No need to buy the hype, but no reason to die from it either.
- Thanks, updated to make more clear
- > Pricing is now $5/$25 per million [input/output] tokens
So it’s 1/3 the price of Opus 4.1…
> [..] matches Sonnet 4.5’s best score on SWE-bench Verified, but uses 76% fewer output tokens
…and potentially uses a lot fewer tokens?
Excited to stress test this in Claude Code, looks like a great model on paper!
- > Star Us on GitHub and Get Exclusive Day 1 Badge for Your Networks
This made me close the tab.
Stars have been gamed for a while on GitHub, but given the single demo, my best guess is that this is trying to build hype before having any real utility.
- This is exciting news, as I have some elegantly scribed family diaries from the 1800s that I can barely read (:
With that said, the writing here is a bit hyperbolic, as the advances seem like standard improvements, rather than a huge leap or final solution.
- Curious if they've built their own library for this or if they're using the same one as OpenAI[0].
A quick look at the llguidance repo doesn't show any signs of Anthropic contributors, but I do see some from OpenAI and ByteDance Seed.
- The biggest risk isn’t strong AI rebelling, it’s humans using weak AI to attack other humans.
Ahahaha, not entirely wrong!