
mxkopy
Joined · 477 karma
Mothers know more about AI than anyone else. People are right that the human brain operates as a sort of transformer, but they vastly underestimate how much surprise exists in our interactive world. AI sucks because its metaphors carry no real meaning: “moving fast like the wind” has a purely verbal association for it, rather than a visual and kinesthetic one.

Anyone familiar with how we ourselves learn knows that this won’t lead very far in terms of intelligence. Mothers narrate things to us as children, giving descriptions of how they work; children’s books and TV shows act as supplementary material. Then we go out and play with these things, discovering the meaning of the descriptions for ourselves. Over thousands of iterations we recognize that a specific word pops up in similar contexts, and we learn to tie what those contexts share to our own internal abstractions. Our very first meanings relate our actions to their effects on the environment; LLMs know little of this.

This can be bootstrapped. Through sheer largesse current models know some things, and can create plausible simulations of uncomplicated physical events. They’ve seen enough gamedev projects to know how to write a simple bouncing ball simulation. That gives us both the thing to be narrated and the narration itself. The child, trained on the simulation rather than the code, learns what the narration means in a real sense. Once it’s mastered that, it teaches the narrator, since it now knows the concept more meaningfully than the narrator does, and the narrator applies this imperceptibly improved knowledge in other domains. And so forth.
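
To make the first step concrete, here’s a minimal sketch of the kind of bouncing-ball simulation I mean (plain Python, no engine; the gravity, restitution, and timestep values are arbitrary placeholders, not something any model actually produced):

    # Toy bouncing-ball simulation: the "thing to be narrated".
    # All constants are arbitrary placeholders.
    GRAVITY = -9.8        # m/s^2
    RESTITUTION = 0.8     # fraction of speed kept after each bounce
    DT = 0.02             # timestep in seconds

    def simulate(height=10.0, velocity=0.0, steps=500):
        """Yield (time, height) pairs for a ball dropped onto a floor at y=0."""
        t = 0.0
        for _ in range(steps):
            velocity += GRAVITY * DT
            height += velocity * DT
            if height < 0.0:              # hit the floor: reflect and damp
                height = 0.0
                velocity = -velocity * RESTITUTION
            t += DT
            yield t, height

    if __name__ == "__main__":
        for t, h in simulate():
            print(f"t={t:5.2f}s  height={h:6.3f}m")

The narration would then be sentences like “the ball slows as it rises and speeds up as it falls,” and the child is trained against the trajectory this produces, not against the code.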


  1. They must’ve forgotten who created the first tech hype bubbles in the first place, because I’m about to replace some of these companies if I don’t get hired soon
  2. This is his ORCID profile, which lists his grants and published works:

    https://orcid.org/0000-0001-9755-6563

  3. Hacking GTA V’s graphics pipeline to get access to the depth buffer, so I can feed it into a self-driving machine learning model. There are already tools that do this (ReShade & other DX11 hooks), but I want to learn how to do this in general for other types of data & processes.

    On a personal note, I’ve been trying to lean into my fears more. Disassembling binaries was always something I knew would be helpful to know but kind of avoided, so I think this helps with that a little.

  4. As someone who’s excited to see this happen eventually: it’s not happening anytime soon. Combinatorial optimization techniques are far better suited for this, and methods created 50 years ago run laps around LLMs
  5. Sure, thanks for bringing it up. Short-range flights should have a higher threshold for permitted use, in service of the environment.

    Please, ask more questions.

  6. Unfortunately, short of making a startup or monetizing projects, it’s hard to get paid purely from the fundamentals
  7. As an unhired junior, I think this stems from a lack of unions and the ability of the workforce to make demands of capital (e.g. to prevent offshoring or discriminatory hiring processes)
  8. I think the point is, can a farm animal do this?

    > When Grandpa died, I chose to use his memory to do good things. Now I volunteer with multiple organizations related to aging farmers.

    If this is what gives life meaning in the universe, you can’t deny that we’re snuffing it out at an industrial scale.

  9. I’ve been thinking for a long time about using AI to do binary decompilation for this exact purpose. Needless to say, we’re a fundamental leap or two away from actually being able to do that
  10. Horses also run faster than pictures of cars
  11. I’ve also independently concluded Moonlight was the best way to go, after trying my hand at a very similar task. I didn’t want to dig through Moonlight’s source, but I’m sure that if you’re dedicated enough it would pay dividends later on; it basically does everything you’d need for realtime control when simulating human input.
  12. Python dxcam + the Windows hook API listening for HID messages (a rough sketch of this setup is below, after the last comment in this list)
  13. I’m not sure such an overview exists, but back when Caffe2 was still a thing and JAX was a big contender, dynamic vs. static computational graphs seemed to be a major focus point for people ranking the frameworks.
  14. PyTorch is one of those tools that’s so simple and easy to take apart that you feel like you might’ve been able to make it yourself. I can’t imagine how much engineering effort was behind all those moments where I thought to myself, “of course it should work like that, how can it be any other way?”
  15. It seems like some of the dismissals are just summaries of basic decidability theory, which don’t attack the underlying argument of the paper:

    > …the idea that reality can tell us if a statement about a theory is true, given that the theory is an accurate description of reality. So if there’s an accurate Turing complete theory of reality, and we see some process that’s supposed to encode a decision on an undecidable statement being resolved (I guess in a non-probabilistic way as well), then we can conclude that reality is deciding undecidable statements in some nontrivial way.

    One of the stronger skeptics confidently claims that discrete phenomena don’t exist in quantum mechanics. I think there’s a bit of a cult of skepticism around this topic, which is usually fine, except when people haven’t read the paper or don’t have basic prerequisite knowledge before announcing their conclusions.

  16. > Integers are fundamental in quantum mechanics, particularly as quantum numbers that define the discrete properties of particles, such as energy levels, angular momentum, and spin.

    > Quantum mechanics dictates that certain properties, like energy and angular momentum, are quantized, meaning they can only exist in discrete packets or "quanta".

    This was from a cursory google search.

  17. There’s a way to talk about this stuff already. LLMs can “think” counterfactually on continuous data, just like VAEs [0], and are able to interpolate smoothly between ‘concepts’ or projections of the input data. This is meaningless when the true input space isn’t actually smooth. It’s System 1, shallow-nerve, psychomotor-reflex type of thinking.

    What LLMs can’t do is “think” counterfactually on discrete data. This is stuff like counting or adding integers. We can do this very naturally because we can think discretely very naturally, but LLMs are bad at this sort of thing because the underlying assumption behind gradient descent is that everything has a gradient (i.e. is continuous). Discrete rules need to be “burned in” [1], since continuous-valued weights are always open to minor perturbations that can break those rules. (There’s a toy PyTorch illustration of the gradient point at the bottom of this list.)

    You can replace “thinking” here with “information processing”. Does an LLM “think” any more or less than, say, a computer solving TSP on a very large input? Seeing as we can reduce the former to the latter, I wouldn’t say they’re really at all different. It seems like semantics to me.

    In either case, counterfactual reasoning is good evidence of causal reasoning, which is typically one part of what we’d like AGI to be able to do (causal reasoning is deductive, the other part is inductive; this could be split into inference/training respectively, but the holy grail is having these combined as zero-shot training). Regression is a basic form of counterfactual reasoning, and DL models are basically this. We don’t yet have a meaningful analogue for discrete, logic-puzzle-type problems, and this is the area where I’d say that LLMs don’t “think”.

    This is somewhat touched on in GEB and I suspect “Fluid Concepts and Creative Analogies” as well.

    [0] https://human-interpretable-ai.github.io/assets/pdf/5_Genera...

    [1] https://www.sciencedirect.com/science/article/pii/S089360802...
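
To make the gradient point in comment 17 concrete, here’s a toy PyTorch snippet (not from either linked paper; just an illustration of why a discrete op like rounding starves gradient descent of signal, plus the straight-through trick often used to “burn in” such rules):

    import torch

    # Continuous op: the gradient flows, so gradient descent gets a signal.
    x = torch.tensor(2.7, requires_grad=True)
    (x ** 2).backward()
    print(x.grad)    # tensor(5.4000)

    # Discrete op: rounding is flat almost everywhere, so the gradient is
    # zero and gradient descent learns nothing about how to change y.
    y = torch.tensor(2.7, requires_grad=True)
    torch.round(y).backward()
    print(y.grad)    # tensor(0.)

    # Straight-through estimator: the forward pass uses the rounded value,
    # the backward pass pretends the rounding was the identity, so a fake
    # but useful gradient still flows through the discrete step.
    z = torch.tensor(2.7, requires_grad=True)
    z_hard = z + (torch.round(z) - z).detach()    # forward value is 3.0
    (z_hard ** 2).backward()
    print(z.grad)    # tensor(6.) -- gradient of the continuous surrogate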
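
And for the capture setup in comment 12, a rough sketch of how I’d wire it up (from memory, so the exact calls and parameters may differ from what I actually ran; pynput here stands in for the raw Windows hook, and the frame rate and duration are arbitrary):

    import time
    import dxcam                   # pip install dxcam (Windows only)
    from pynput import keyboard    # wraps the low-level Windows hook

    pressed = set()                # whatever keys are held down right now

    def on_press(key):
        pressed.add(str(key))

    def on_release(key):
        pressed.discard(str(key))

    def record(seconds=5, fps=30):
        """Collect (frame, held-keys) pairs for a few seconds."""
        samples = []
        listener = keyboard.Listener(on_press=on_press, on_release=on_release)
        listener.start()
        camera = dxcam.create()    # primary monitor by default
        camera.start(target_fps=fps)
        try:
            end = time.time() + seconds
            while time.time() < end:
                frame = camera.get_latest_frame()   # numpy array (H, W, 3)
                samples.append((frame, sorted(pressed)))
        finally:
            camera.stop()
            listener.stop()
        return samples

    if __name__ == "__main__":
        data = record(seconds=2)
        print(f"captured {len(data)} frames")

Mouse movement would be logged the same way, with pynput’s mouse.Listener alongside the keyboard one.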

This user hasn’t submitted anything.