inciampati
Joined 1,541 karma

  1. To have a competitive code writer with n-grams, you need more than to "scale up the n-grams": you need a corpus that includes all possible code that someone would want to write. At that point you'd be better off with a lossless full-text index like an r-index. But the lack of any generalizability in this approach, coupled with its Markovian features, will make this kind of model extremely brittle, although it would be efficient. You just need to somehow compute all possible language beforehand. tl;dr: language models really are reasoning and generalizing over the domain they're trained on.
  2. Very beautiful. Love this.

    If it helps, AFAIK (I do atomic force microscopy of DNA), DNA's height is closer to 2nm than 4.

  3. The authors study a bunch of wild low rank fine tunes and discover that they share a common... low rank! ... substructure which is itself base model dependent. Humans are (genetically) the same. You need only a handful of PCs to represent the vast majority of variation. But that's because of our shared ancestry. And maybe the same thing is going on here.
  4. > The entire reason for skin color variations is a genetic optimization for UV absorption at specific latitudes vs sunburn risks.

    This seems obvious but was not confirmed by genetic evidence. The rate of adaptation turns out to be much higher than can be explained by skin cancer.

    The real cause appears to be fertility. UV radiation breaks down folate (vitamin B9) in the bloodstream, and folate is critical for DNA synthesis and repair. Folate deficiency causes serious problems in pregnancy, neural tube defects like spina bifida, and may impair sperm production. So darker skin in high-UV equatorial regions likely evolved partly to protect reproductive capacity.

    In the other direction, lower melanin production helps with vitamin D synthesis in lower sunlight environments. Vitamin D requires UV-B radiation to be synthesized in the skin, and melanin inhibits this. Vitamin D is also linked to fertility. It's involved in sex hormone production and has been associated with successful implantation and pregnancy outcomes.

    If you're curious, check out Nina Jablonski and George Chaplin's work. Their hypothesis is that skin color evolution is fundamentally about reproductive fitness: dark enough to protect folate, light enough to synthesize vitamin D. Both nutrients affect fertility, fetal development, and offspring survival. They have an immediate primary impact on fertility and success, while skin cancer, even in the most extreme environment/phenotype mismatch, has an onset after reproductive age.

  5. Markov chains have exponential falloff in correlations between tokens over time. That's dramatically different than real text which contains extremely long range correlations. They simply can't model long range correlations. As such, they can't be guided. They can memorize, but not generalize.
  6. Just had a fantastic experience applying agentic coding to CAD. I needed to add some threads to a few blanks in a 3d print. I used computational geometry to give the agent a way to "feel" around the model. I had it convolve a sphere of the radius of the connector across the entire model. It was able to use this technique to find the precise positions of the existing ports and then add threads to them. It took a few tries to get right, but if I had the technique in mind before it would be very quick. The lesson for me is that the models need to have a way to feel. In the end, the implementation of the 3d model had to be written in code, where it's auditable. Perhaps if the agent were able to see images directly and perfectly, I never would have made this discovery.
  7. Love that a term from Vinge has almost entered our lexicon. The author is a "programmer archeologist".
  8. And if you're alone it's worth running with a chair into the street to do it as visibly as possible.
  9. Effective coding is not code first think later.

    LLMs aren't effective when used this way.

    You still have to think.

    IMO a vibe coder who is speaking their ideas to an agent which implements them is going to have way more time to think than a hand coder who is spending 80% of their time editing text.

  10. It turns out you can use a fused triton kernel for a true RNN GRU and run just as fast as the minGRU model in training. Yeah, it doesn't work for very long context but neither does minGRU (activation memory...)
  11. Matrix doesn't get used "in the wild" by normies because it's not marketed for anything. However, it does get used "in the wild" by groups who need an IRC/Slack/Discord system that's open source and truly federated.
  12. I wish I had radar eyes
  13. Alewife parking and the T in?
  14. Kindness would be plotting all of this with a log scale. The plots could be drawn on napkins for how much they explain.
  15. Yes! Strudel and Tidal Cycles are amazing livecoding systems. I'd like to match the expressiveness of Bitwig though, so I'm working on a merge of Glicol (web based livecoding with signal processing) and Strudel.
  16. Now do mayonnaise.
  17. Isn't webgpu 32-bit?
  18. You're correct, the distinction matters. Autoregressive models have no hidden state between tokens, just the visible sequence. Every forward pass starts fresh from the tokens alone. But that's precisely why they need chain-of-thought: they're using the output sequence itself as their working memory. It's computationally universal but absurdly inefficient, like having amnesia between every word and needing to re-read everything you've written. https://thinks.lol/2025/01/memory-makes-computation-universa...
  19. The setup in the post plus a speech to text system and aidertmux and aider would be sufficient for a very wide range of tasks. Or a multi screen setup with the phone as a multimodal input system and the AR as the screen.
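
A minimal sketch of the exponential falloff mentioned in comment 5, assuming the simplest possible case: a stationary two-state Markov chain. The transition probabilities `a` and `b` and the function name `lag_correlation` are illustrative choices, not anything from the comment. For this chain the correlation between the state now and the state `lag` steps later works out to `(1 - a - b) ** lag`, which shrinks geometrically with distance.

```python
import numpy as np


def lag_correlation(a: float, b: float, lag: int) -> float:
    """Correlation between X_t and X_{t+lag} for a stationary 2-state chain.

    a = P(0 -> 1), b = P(1 -> 0). States take the numeric values 0 and 1.
    """
    # Transition matrix: row i holds the probabilities of leaving state i.
    P = np.array([[1.0 - a, a], [b, 1.0 - b]])
    # Stationary distribution of a two-state chain.
    pi = np.array([b, a]) / (a + b)
    # lag-step transition probabilities.
    Pk = np.linalg.matrix_power(P, lag)
    # E[X_t * X_{t+lag}]: only the (1, 1) pair contributes with 0/1 states.
    joint_11 = pi[1] * Pk[1, 1]
    mean = pi[1]
    var = mean * (1.0 - mean)
    return (joint_11 - mean**2) / var


if __name__ == "__main__":
    # Illustrative parameters; the correlation shrinks by a constant factor per step.
    for k in (1, 2, 5, 10, 50):
        print(k, lag_correlation(0.1, 0.2, k))
```

Higher-order chains (n-grams) have more states, but the same argument applies: correlations are bounded by powers of the second-largest eigenvalue magnitude of the transition matrix, so they still fall off exponentially.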
