Comment by jvanderbot

jvanderbot Jun 24, 2025 parent

Ok great! This is precisely how I chunk numbers for comparison. And not to diminish a solid result or the usefulness of it or the baseline tech: its clear that it we keep having to create situation - specific inputs or processes, we're not at AGI with this baseline tech

chmod775 Jun 25, 2025

> [..] we're not at AGI with this baseline tech

DAG architectures fundamentally cannot be AGI and you cannot even use them as a building block for a hypothetical AGI if they're immutable at runtime.

Any time I hear the goal being "AGI" in the context of these LLMs, I feel like listening to a bunch of 18th-century aristocrats trying to get to the moon by growing trees.

Try to create useful approximations using what you have or look for new approaches, but don't waste time on the impossible. There's no iterative improvements here that will get you to AGI.

kristjansson Jun 25, 2025

> "So... what does the thinking?"

> "You're not understanding, are you? The brain does the thinking. The meat."

> "Thinking meat! You're asking me to believe in thinking meat!"

https://www.mit.edu/people/dpolicar/writing/prose/text/think...

munksbeer Jun 26, 2025

It doesn't feel particularly interesting to keep dismissing "these LLMs" as incapable of reaching AGI.

It feels more interesting to note that this time, it is different. I've been watching the field since the 90s when I first dabbled in crude neural nets. I am informed there was hype before, but in my time I've never seen progress like we've made in the last five years. If you showed it to people from the 90s, it would be mind blowing. And it keeps improving incrementally, and I do not think that is going to stop. The state of AI today is the worst it will ever be (trivially obvious but still capable of shocking me).

What I'm trying to say is that the shocking success of LLMs has become a powerful engine of progress, creating a positive feedback loop that is dramatically increasing investment, attracting top talent, and sharpening the focus of research into the next frontiers of artificial intelligence.

dTal Jun 26, 2025

>If you showed it to people from the 90s, it would be mind blowing

90's? It's mind blowing to me now.

My daily driver laptop is (internally) a Thinkpad T480, a very middle of the road business class laptop from 2018.

It now talks to me. Usually knowledgeably, in a variety of common languages, using software I can download and run for free. It understands human relationships and motivations. It can offer reasonably advice and write simple programs from a description. It notices my tone and tries to adapt its manner.

All of this was inconceivable when I bought the laptop - I would have called it very unrealistic sci-fi. I am trying not to forget that.

AllegedAlec Jun 25, 2025

Thank you. It's maddening how people keep making this fundamental mistake.

mgraczyk Jun 25, 2025

This is meant to be some kind of Chinese room argument? Surely a 1e18 context window model running at 1e6 tokens per second could be AGI.

chmod775 Jun 25, 2025

Personally I'm hoping for advancements that will eventually allow us to build vehicles capable of reaching the moon, but do keep me posted on those tree growing endeavors.

mgraczyk Jun 25, 2025

Tree growing?

And I don't follow, we've had vehicles capable of reaching the moon for over 55 years

anonymoushn Jun 25, 2025

It's about the immutability of the network at runtime. But I really don't think this is a big deal. General-purpose computers are immutable after they are manufactured, but can exhibit a variety of useful behaviors when supplied with different data. Human intelligence also doesn't rely on designing and manufacturing revised layouts for the nervous system (within a single human's lifetime, for use by that single human) to adapt to different settings. Is the level of mutability used by humans substantially more expressive than the limits of in-context learning? what about the limits of more unusual in-context learning techniques that are register-like, or that perform steps of gradient descent during inference? I don't know of a good argument that all of these techniques used in ML are fundamentally not expressive enough.

3 More Comments →

VonGallifrey Jun 25, 2025

Excuse me for the bad joke, but it seems like your context window was too small.

The Tree growing comment was a reference to another comment earlier in the comment chain.

mgraczyk Jun 25, 2025

It's not a tree though

lukan Jun 25, 2025

"Surely a 1e18 context window model running at 1e6 tokens per second could be AGI."

And why?

mgraczyk Jun 25, 2025

Because that's quite a bit more information processing than any human brain

lukan Jun 25, 2025

I don't think it is quantity that matters. Otherwise supercomputers are smart by definition.

2 More Comments →

rar00 Jun 25, 2025

This argument works better for state space models. A transformer would still steps context one token at a time, not maintain an internal 1e18 state.

mgraczyk Jun 25, 2025

That doesn't matter, are you familiar with any theoretical results in which the computation is somehow limited in ways that practically matter when the context length is very long? I am not

This item has no comments currently.