HarHarVeryFunny
5,330 karma

  1. > LLMs are just seemingly intelligent autocomplete engines

    Well, no, they are training set statistical predictors, not individual training sample predictors (autocomplete).

    The best mental model of what they are doing might be that you are talking to a football stadium full of people, where everyone in the stadium gets to vote on the next word of the response being generated. You are not getting an "autocomplete" answer from any one coherent source, but instead a strange composite response where each word is the result of different people trying to steer the response in different directions.

    An LLM will naturally generate responses that were not in the training set, even if ultimately limited by what was in the training set. The best way to think of this is perhaps that they are limited to the "generative closure" (cf mathematical set closure) of the training data - they can generate "novel" (relative to the training set) combinations of words and partial samples from the training data, by combining statistical patterns from different sources that never occurred together in the training data.
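
    As a rough, purely illustrative sketch of that "voting" step (the candidate words and probabilities below are made up), each generated word is sampled from a probability distribution that aggregates everything the model has learned, rather than copied from any single training example:

    ```cpp
    #include <iostream>
    #include <random>
    #include <string>
    #include <vector>

    int main() {
        // Hypothetical candidate next words and their probabilities at one
        // generation step - the model's aggregate "vote", not a lookup of
        // any single training sample.
        std::vector<std::string> candidates = {"cat", "dog", "engine", "stadium"};
        std::vector<double> weights = {0.45, 0.30, 0.15, 0.10};

        std::mt19937 rng(std::random_device{}());
        std::discrete_distribution<std::size_t> vote(weights.begin(), weights.end());

        // Sample one word according to the weights above.
        std::cout << "next word: " << candidates[vote(rng)] << "\n";
    }
    ```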

  2. I'm curious how you are testing/trying these latest models? Do you have specific test/benchmark tasks that they struggle with that you are trying, and/or are you working on a real project and just trying alternatives where another model is not performing well?
  3. Presumably that would reflect Gemini 3.0 Flash having had more extensive RL training for coding than Pro? Maybe we can expect a "Gemini 3 Pro Coding" model in the future?

    Opus 4.5 seems different - Anthropic's best coding model, but also their frontier general purpose model.

  4. Surely Gemini 3.0 Pro would be the appropriate comparison.

    If you want to compare the weakest models from both companies then Gemini Flash vs GPT Instant would seem to be the best comparison, although Claude Opus 4.5 is by all accounts the most powerful for coding.

    In any case, it will take a few weeks for any meaningful test comparisons to be made, and in the meantime it's hard not to see any release from OpenAI, coming just days after they announced "Code Red" (aka "we're behind the competition"), as more marketing than anything else.

  5. I would say it more goes back to the Google Brain + DeepMind merger, creating Google DeepMind headed by Demis Hassabis.

    The merger happened in April 2023.

    Gemini 1.0 was released in Dec 2023, and the progress since then has been rapid and impressive.

  6. Does automation use generally pay taxes?

    Should Amazon pay taxes for using factory robots in lieu of people?

    Should fabric manufacturers pay taxes for using automated looms instead of hand weaving?

    Even if lawmakers wanted to tax AI, how would they do it? How do you measure the AI usage level at a company, or the number of workers it has displaced?

  7. Sure, but ANNs are at least connectionist, learning connections/strengths and representations, etc - close enough at that level of abstraction that I think ANNs can suggest how the brain may be learning certain things.
  8. Having a deprecated API just randomly return failures is an awful idea!

    Better to give an actual timeline (future version & date) for when deprecated functionality / functions will be removed, and in the meantime, if the language supports it, mark those functions as deprecated (e.g. the C++ [[deprecated]] attribute) so that developers see compilation warnings if they fail to read the release notes.
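
    A minimal sketch of that approach (the function names and version are invented for illustration): old call sites keep working, but every use of the deprecated function gets a compile-time warning carrying the removal timeline.

    ```cpp
    // Old entry point: still works, but every call site gets a deprecation
    // warning with the removal timeline in its message.
    [[deprecated("connect_v1() will be removed in v5.0; use connect_v2() instead")]]
    void connect_v1() {}

    // Replacement entry point.
    void connect_v2() {}

    int main() {
        connect_v1();   // compiler warns here, quoting the message above
        connect_v2();   // no warning
    }
    ```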

  9. I'd like to pre-register my complaint! :-)

    I think it depends on what they use it for. For fantasy stuff like cartoons, aliens and (not fantasy) dinosaurs it may be ok, and I guess they could train on old hand-animated cartoons to retain that charm (and cartoon tropes like running in place but not moving) if they wanted to. If they use it to generate photo-realistic humans then it's going to be uncanny valley and just feel fake.

    It would be interesting to see a best effort at an AI dinosaur walking - a sauropod using the trained motion of an elephant perhaps, which may well be more animal-like than CGI attempts to do the same.

  10. Yes, but animal/human brains (cortex) appear to have evolved to be prediction machines, originally mostly predicting evolving sensory inputs (how external objects behave), and predicting real-world responses to the animal's actions.

    Language seems to be taking advantage of this pre-existing predictive architecture, and would have again learnt by predicting sensory inputs (heard language), which as we have seen is enough to induce the ability to generate it too.

  11. Yes, but at least now we're comparing artificial to real neural networks, so the way it works at least has a chance of being similar.

    I do think that a transformer, a somewhat generic hierarchical/parallel predictive architecture, learning from prediction failure, has to be at least somewhat similar to how we learn language, as opposed to a specialized Chomskyan "language organ".

    The main difference is perhaps that the LLM is only predicting based on the preceding sequence, while our brain is driving language generation by a combination of sequence prediction and the thoughts being expressed. You can think of the thoughts being a bias to the language generation process, a bit like language being a bias to a diffusion based image generator.

    What would be cool would be if we could do some "mechanistic interpretability" work on the brain's language generation circuits, and perhaps discover something similar to induction heads.

  12. No - there was no separation of AST and interpreter. You could consider it just as a directly executable AST.
  13. So we're soon going to see Sora-generated Disney movies?
  14. You can count me as an AGI sceptic to the extent that I don't think LLMs are the approach that is going to get us there, but I'm equally confident that we will get there, and that predictive neural nets are the core of the right approach.

    The article is a bit rambling, but the main claims seem to be:

    1) Computers can't emulate brains due to architecture (locality, caching, etc) and power consumption

    2) GPUs are maxing out in terms of performance (and implicitly AGI has to use GPUs)

    3) Scaling is not enough, since due to 2) scaling is close to maxing out

    4) AGI won't happen because he defines AGI as requiring robotics, and sees scaling of robotic experience as a limiting factor

    5) Superintelligence (which he associates with self-improving AGI) won't happen because it'll again require more compute

    It's a strange set of arguments, most of which don't hold up, and which both miss what is actually wrong with the current approach and fail to conceive of what different approach will get us to AGI.

    1) Brains don't have some exotic architecture that somehow gives them an advantage over computers in terms of locality, etc. The cortex is in fact basically a 2-D structure - a sheet of cortical columns, with a combination of local and long distance connections.

    Where brains differ from a von Neumann architecture is that compute & memory are one and the same, but if we're comparing communication speed between different cortical areas, or between TPU/etc chips, then the speed advantage goes to the computer.

    2) Even if AGI had to use matmul and systolic arrays, and GPUs are maxing out in terms of FLOPs, we could still scale compute, if needed, just by having more GPUs and faster and/or wider interconnect.

    3) As above, it seems we can scale compute just by adding more GPUs and faster interconnect if needed, but in any case I don't think inability to scale is why AGI isn't about to emerge from LLMs.

    4) Robotics and AGI are two separate things. A person lying in a hospital bed still has a brain and human-level AGI. Robots will eventually learn individually on-the-job, just as non-embodied AGI instances will, so size of pre-training datasets/experience will become irrelevant.

    5) You need to define intelligence before supposing what super-human intelligence is and how it may come about, but Dettmers just talks about superintelligence in hand-wavy fashion as something that AGI may design, and assumes that whatever it is will require more compute than AGI. In reality intelligence is prediction and is limited in domain by your predictive inputs, and in quality/degree by the sophistication of your predictive algorithms, neither of which necessarily need more compute.

    What is REALLY wrong with the GPT LLM approach, and why it can't just be scaled to achieve AGI, is that it is missing key architectural and algorithmic components (such as incremental learning, and a half dozen others), and perhaps more fundamentally that auto-regressive self-prediction is just the wrong approach. AGI needs to learn to act and predict the consequences of its own actions - it needs to predict external inputs, not its own generated sequences.

  15. > The process of breaking a complex problem down into the right primitives requires great understanding of the original problem in the first place.

    Yes, but with experience that just becomes a matter of recognizing problem and design patterns. When you see a parsing problem, you know that the simplest/best design pattern is just to define a Token class representing the units of the language (keywords, operators, etc), write a NextToken() function to parse characters to tokens, then write a recursive descent parser using that.

    Any language may have its own gotchas and edge cases, but knowing that recursive descent is pretty much always going to be a viable design pattern (for any language you are likely to care about), you can tackle those when you come to them.
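
    A minimal sketch of that pattern for a toy grammar of integers, '+' and '*' (the names - Token, NextToken(), ParseExpr() - and the grammar itself are just illustrative, not from any particular codebase):

    ```cpp
    #include <cctype>
    #include <iostream>
    #include <stdexcept>
    #include <string>

    enum class TokenKind { Number, Plus, Star, End };

    // A token is a unit of the language: a number, an operator, or end of input.
    struct Token {
        TokenKind kind = TokenKind::End;
        long value = 0;   // only meaningful for Number
    };

    class Lexer {
    public:
        explicit Lexer(std::string text) : text_(std::move(text)) {}

        // Turn the next span of characters into a token.
        Token NextToken() {
            while (pos_ < text_.size() && std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
            if (pos_ >= text_.size()) return {TokenKind::End};
            char c = text_[pos_];
            if (std::isdigit(static_cast<unsigned char>(c))) {
                long v = 0;
                while (pos_ < text_.size() && std::isdigit(static_cast<unsigned char>(text_[pos_])))
                    v = v * 10 + (text_[pos_++] - '0');
                return {TokenKind::Number, v};
            }
            ++pos_;
            if (c == '+') return {TokenKind::Plus};
            if (c == '*') return {TokenKind::Star};
            throw std::runtime_error("unexpected character");
        }

    private:
        std::string text_;
        std::size_t pos_ = 0;
    };

    // One function per grammar rule:
    //   Expr -> Term ('+' Term)*     Term -> Number ('*' Number)*
    class Parser {
    public:
        explicit Parser(std::string text) : lexer_(std::move(text)) { Advance(); }

        long ParseExpr() {
            long result = ParseTerm();
            while (current_.kind == TokenKind::Plus) { Advance(); result += ParseTerm(); }
            return result;
        }

    private:
        long ParseTerm() {
            long result = ParseNumber();
            while (current_.kind == TokenKind::Star) { Advance(); result *= ParseNumber(); }
            return result;
        }

        long ParseNumber() {
            if (current_.kind != TokenKind::Number) throw std::runtime_error("expected number");
            long v = current_.value;
            Advance();
            return v;
        }

        void Advance() { current_ = lexer_.NextToken(); }

        Lexer lexer_;
        Token current_;
    };

    int main() {
        std::cout << Parser("2 + 3 * 4").ParseExpr() << "\n";   // prints 14
    }
    ```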

  16. That's a good point - recursive descent as a general lesson in program design, in addition to being a good way to write a parser.

    Table-driven parsers (using yacc/etc) used to be emphasized in old compiler writing books such as Aho & Ullman's famous "dragon (front cover) book". I'm not sure why - maybe partly efficiency for the slower computers of the day, and partly because in the infancy of computing a more theoretical/algorithmic approach seemed more sophisticated and preferable (the canonical table-driven parser construction algorithm was one of Knuth's).

    Nowadays it seems that recursive descent is the preferred approach for compilers because it's ultimately more practical and flexible. Table driven can still be a good option for small DSLs and simple parsing tasks, but recursive descent is so easy that it's hard to justify anything else, and LLM code generation now makes that truer than ever!

    There is a huge difference in complexity between building a full-blown commercial quality optimizing compiler and a toy one built as a learning exercise. Using something like LLVM as a starting point for a learning exercise doesn't seem very useful (unless your goal is to build real compilers) since it's doing all the heavy lifting for you.

    I guess you can argue about how much can be cut out of a toy compiler for it still to be a useful learning exercise in both compilers and tackling complex problems, but I don't see any harm in going straight from parsing to code generation, cutting out AST building and of course any IR and optimization. The problems this direct approach causes for code generation, and optimization, can be a learning lesson for why a non-toy compiler uses those!

    A fun approach I used at work once, wanting to support a pretty major C subset as the language for a programmable regression test tool, was even simpler... Rather than having the recursive descent parser generate code, I just had it generate executable data structures - subclasses of Statement and Expression base classes, with virtual Execute() and Value() methods respectively, so that the parsed program could be run by calling program->Execute() on the top level object. The recursive descent functions just returned these Statement or Expression objects directly. To give a flavor of it, the ForLoopStatement subclass held the initialization, test and increment expression class pointers, and then the ForLoopStatement::Execute() method could just call testExpression->Value() etc.
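
    To give a more concrete flavour of those executable data structures - a purely illustrative sketch, where the class names echo the description above but everything else is invented - a stripped-down version might look like:

    ```cpp
    #include <iostream>

    // Base classes: expressions produce values, statements have side effects.
    struct Expression {
        virtual ~Expression() = default;
        virtual long Value() = 0;
    };

    struct Statement {
        virtual ~Statement() = default;
        virtual void Execute() = 0;
    };

    // A mutable variable, standing in for a symbol table entry.
    struct Variable : Expression {
        long current = 0;
        long Value() override { return current; }
    };

    struct LessThan : Expression {
        Expression* lhs; long rhs;
        LessThan(Expression* l, long r) : lhs(l), rhs(r) {}
        long Value() override { return lhs->Value() < rhs ? 1 : 0; }
    };

    struct AssignStatement : Statement {
        Variable* var; long value;
        AssignStatement(Variable* v, long val) : var(v), value(val) {}
        void Execute() override { var->current = value; }
    };

    struct IncrementStatement : Statement {
        Variable* var;
        explicit IncrementStatement(Variable* v) : var(v) {}
        void Execute() override { ++var->current; }
    };

    struct PrintStatement : Statement {
        Expression* expr;
        explicit PrintStatement(Expression* e) : expr(e) {}
        void Execute() override { std::cout << expr->Value() << "\n"; }
    };

    // The for-loop node just holds pointers to its parts and runs them directly.
    struct ForLoopStatement : Statement {
        Statement* init; Expression* test; Statement* increment; Statement* body;
        ForLoopStatement(Statement* i, Expression* t, Statement* inc, Statement* b)
            : init(i), test(t), increment(inc), body(b) {}
        void Execute() override {
            for (init->Execute(); test->Value(); increment->Execute())
                body->Execute();
        }
    };

    int main() {
        // Equivalent of: for (i = 0; i < 3; ++i) print(i);
        Variable i;
        AssignStatement init(&i, 0);
        LessThan test(&i, 3);
        IncrementStatement inc(&i);
        PrintStatement body(&i);
        ForLoopStatement loop(&init, &test, &inc, &body);
        loop.Execute();   // prints 0, 1, 2
    }
    ```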

  17. Peer coding?

    Maybe common usage is shifting, but Karpathy's "vibe coding" was definitely meant to be a never look at the code, just feel the AI vibes thing.

  18. I think that's what makes it funny - the future turns out to be just as dismal and predictable as we expect it to be. Google kills Gemini, etc.

    Humor isn't exactly a strong point of LLMs, but here it's tapped into the formulaic hive mind of HN, and it works as humor!

  19. Because the LLM has presumably been trained on more React than WASM, and will do a better job of it.

    ya filthy animal!

  20. Obviously right now the best language to use LLMs for, vibe coding or not, is whatever they are most familiar with, although I'm not sure what that actually is! Java?

    Going forwards, when LLMs / coding tools are able to learn new languages, then designing languages for machines rather than humans certainly makes sense.

    Languages designed for robust error detection and checking, etc. Prefer verbosity where it adds information rather than succinctness. Static typing vs dynamic. Contractual specification of function input/output guarantees. Modular/localized design.

    These are largely the same considerations that make a language good for large-team, large-codebase projects (the opposite end of the spectrum from scripting languages), except that if the code is machine generated you can really go to town on adding as much verbosity as is needed to tighten the specification and catch bugs at compile time rather than at runtime.
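
    As a hedged sketch of what "verbosity that adds information" might look like in today's C++ (the types and function are invented for illustration, and asserts stand in for real contract support):

    ```cpp
    #include <cassert>

    // Strong types instead of raw doubles: the signature alone says what goes in and out.
    struct Celsius { double value; };
    struct Fahrenheit { double value; };

    [[nodiscard]] Fahrenheit ToFahrenheit(Celsius c) {
        assert(c.value >= -273.15 && "precondition: not below absolute zero");
        Fahrenheit f{c.value * 9.0 / 5.0 + 32.0};
        assert(f.value >= -459.67 && "postcondition: result not below absolute zero");
        return f;
    }

    int main() {
        Fahrenheit f = ToFahrenheit(Celsius{100.0});
        (void)f;   // passing a Fahrenheit where a Celsius is expected would fail to compile
    }
    ```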
