
glomgril
112 karma

  1. one man's exfiltration is another man's distillation `¯\_(ツ)_/¯`

    you could say they're playing by a different set of rules, but distilling from the best available model is the current meta across the industry. only they know what fraction of their post-training data is generated from openai models, but personally i'd bet my ass it's greater than zero because they are clearly competent and in their position it would have been dumb to not do this.

    however you want to frame it, they have pushed the field forward -- especially in the realm of open-weight models.

  2. Check out this recent benchmark MTOB (Machine Translation from One Book) -- relevant to your comment, though the book does have parallel passages so not exactly what you have in mind: https://arxiv.org/pdf/2309.16575

    In the case of non-human communication, I know there has been some fairly well-motivated theorizing about the semantics of individual whale vocalizations. You could imagine a first pass at something like this if the meaning of (say) a couple dozen vocalizations could be characterized with a reasonable degree of confidence.

    Super interesting domain that's ripe for some fresh perspectives imo. Feels like at this stage, all people can really do is throw stuff at the wall. The interesting part will begin when someone can get something to stick!

    > that's basically a science-fiction babelfish or universal translator

    Ten years ago I would have laughed at this notion, but today it doesn't feel that crazy.

    I'd conjecture that over the next ten years, this general line of research will yield some non-obvious insights into the structure of non-human communication systems.

    Increasingly feels like the sci-fi era has begun -- what a time to be alive.

  3. Very cool. Got a silly sci-fi question for you. IIUC, with current technology it would take on the order of tens of thousands of years for a vessel to physically travel to the closest known Earth-like planet (correct me if I'm wrong).

    So any thoughts on what kinds of hypothetical breakthroughs would be needed to make the trip doable in (say) less than a human lifetime?

    And related, what do you think about the plausibility of the [Breakthrough Starshot](https://en.wikipedia.org/wiki/Breakthrough_Starshot) initiative? Aware of any alternative approaches?
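    To put rough numbers on the "tens of thousands of years" claim, here's a back-of-the-envelope sketch (my own approximate figures for distance and speed, not anything from the comment above):

    ```python
    # Travel time to Proxima Centauri b at current-technology speeds.
    # Figures are approximate: ~4.24 light-years away, Voyager 1 at ~17 km/s.
    LIGHT_YEAR_KM = 9.461e12   # kilometers in one light-year
    DISTANCE_LY = 4.24         # Proxima Centauri b, nearest known Earth-like candidate
    VOYAGER_SPEED_KM_S = 17.0  # Voyager 1's heliocentric speed, roughly
    SECONDS_PER_YEAR = 3.156e7

    distance_km = DISTANCE_LY * LIGHT_YEAR_KM
    travel_years = distance_km / VOYAGER_SPEED_KM_S / SECONDS_PER_YEAR
    print(f"~{travel_years:,.0f} years at Voyager 1 speed")

    # For a trip under a human lifetime (say 50 years), ignoring acceleration
    # and relativistic effects, you'd need this fraction of light speed:
    required_fraction_c = DISTANCE_LY / 50
    print(f"~{required_fraction_c:.1%} of c for a 50-year trip")
    ```

    This lands around 75,000 years at Voyager-class speeds, so "tens of thousands of years" checks out, and a 50-year trip needs ~8.5% of light speed -- which is why Breakthrough Starshot's target of ~15-20% c is such a radical jump.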

  4. looks like it's there now
  5. Models like this are experimentally pretrained or tuned hundreds of times over many months to optimize the data mix, hyperparameters, architecture, etc. When they say "ran parallel trainings" they are probably referring to parity tests performed along the way (possibly also for the final training runs). Different hardware means different lower-level libraries, which can introduce unanticipated numerical differences. Good to know what they are so they can be ironed out.

    Part of it could also be that they'd prefer to move all operations to the in-house trn chips, but don't have full confidence in the hardware yet.

    Def ambiguous though. In general, reporting of infra characteristics for LLM training is left pretty vague in most of the reports I've seen.
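
    For concreteness, here's a toy illustration of what a numerical parity test might look like (entirely hypothetical, not from any actual training report). In practice you'd run the same forward pass on two accelerator stacks; here a float32 matmul versus a float64 reference stands in for two backends whose low-level libraries accumulate differently:

    ```python
    import numpy as np

    # Same computation on "backend A" (float32) and "backend B" (float64
    # reference, cast back), standing in for two hardware/kernel stacks.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((64, 512)).astype(np.float32)
    w = rng.standard_normal((512, 512)).astype(np.float32)

    out_a = x @ w
    out_b = (x.astype(np.float64) @ w.astype(np.float64)).astype(np.float32)

    max_abs_diff = np.max(np.abs(out_a - out_b))
    rel_diff = max_abs_diff / np.max(np.abs(out_b))
    print(f"max abs diff: {max_abs_diff:.2e}, rel diff: {rel_diff:.2e}")

    # A parity gate might fail the run if divergence exceeds a tolerance:
    assert rel_diff < 1e-4, "backends disagree beyond tolerance -- investigate kernels"
    ```

    The real thing would compare losses/logits across full training steps, but the shape is the same: run identical work on both stacks, quantify the drift, decide whether it's within expected floating-point noise.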

  6. He is coming from the perspective of a long-running debate on symbolic versus statistical/data-driven approaches to modeling language structure and use. It seems in recent years he has had trouble coming to terms with the fact that, at least for real-world applications of language technology, the statistical approach has simply won the war (or at worst, forms the core foundation on top of which symbolic approaches can have some utility).

    I come from the same academic tradition, and have colleagues in common with him. He has been advocating for a quasi-Chomskyan perspective on language science for many years -- as have many others working at the intersection of linguistics and psychology/cog sci.

    TBH I suspect he himself is a large part of his target audience. A lot of older school academics raised in the symbolic tradition are pretty unsettled by the incredible achievements of the data-driven approach.

    Personally I saw the writing on the wall years ago and have transitioned to working in statistical NLP (or "AI" I suppose). Feeling pretty good about that decision these days.

    FWIW I do think symbolic approaches will start to shine in the next several years, as a way to control the behavior of modern statistical LMs. But doubtful they will ever produce anything comparable to current systems without a strong base model trained on troves of data.

    edit: Worth noting that Marcus has produced plenty of high-quality research in his career. I think his main problem here is that he seems to believe that AI systems should function analogously to how human language/cognition functions. But from an engineering/product perspective, how a system works is just not that important compared to how well it works. There's probably a performance ceiling for purely statistical models, and it seems likely that some form of symbolic machinery can raise that ceiling a bit. Techniques that work will eventually make their way into products, no matter which intellectual tradition they come from. But framing things in this way is just not his style.

  7. Savor it while you can. As a former academic, for me the lack of intrinsic motivation to "create value for shareholders" is the hardest part of working in industry.
  8. As painful as it can be at times, it is a truly beautiful phase of life during which your main obligations are to become an expert in something that interests you and to make enough money to not starve and have a place to live. If you are single, coming directly from the "broke college student" lifestyle, and end up at a university with a good stipend, it won't even feel like you are "poor" and the money is mostly enough. But the life of a grad student in a large public university can come with much more financial instability and heavier teaching loads from day one, with less time for slacking off and letting ideas marinate. Less so if you are in a field/have an advisor with good/consistent funding. The devil is in the details.

    Wouldn't change it for the world though, and anecdotally most people I know who ended up finishing the PhD feel the same way.

    Main shortcoming of the (American) grad school experience imo is lack of preparation to join the corporate workforce (in my field, there are easily >10x as many graduating PhDs each year as there are available university jobs). Academia has done a terrible job preparing grad students for the harsh reality of a non-academic career. Keeping this in mind throughout grad school will help a lot -- you can see the difference in non-academic career trajectory between people who had a backup plan and those who didn't.

  9. This is just brilliant. Brings back memories, some fond, others less so. Only addition I'd suggest is a subplot involving teaching/TAing duties and/or money problems.

    Good to be occasionally reminded that slacking off is a legitimately important part of the scientific process. Wish this view was more popular in the industry.

  10. That's an insane story. As much as I hate flying, modern aviation infrastructure is one of mankind's most impressive feats.
  11. For those interested, there is a podcast about this book and some cases it's been relevant in: https://podcasts.apple.com/us/podcast/hit-man/id1449636432
  12. Yeah that is consistent with my experience for sure. Probably true for lots of others too. I'm pretty neurotic about dumb shit on a day-to-day basis, but the few times something actually extremely serious has gone wrong in/around my life, it's rarely felt particularly scary or panic-inducing -- maybe even less so than the usual "oh my god I probably left the stove on and the building will burn to the ground and it'll be all my fault!"
  13. That's kinda how the pandemic felt to a degree. A happier than average time in my personal life, despite (or perhaps due in part to?!) all the carnage and pandemonium throughout the world. Feels wrong to say, but I will look back on my experience of the pandemic fondly.
  14. Big shouts to visidata -- very underrated tool imo.
  15. The consistency of descriptions is particularly surprising to me. Like you got a roughly circular collection of seemingly random scribbles, but they can tell you exactly which parts of it correspond to the person's nose, hair, arms, eyes, etc. And the descriptions seem to stay the same if you ask about the same picture on different days. Still not sure what to make of this phenomenon but it is fascinating.
  16. I have a very similar plan involving a Raspberry Pi, I agree that it's a great middle ground.

    IMO screentime that only involves video calls with family is perfectly fine, especially if it is a group activity. In general (timeboxed) social activities that involve multiple people watching/interacting with a single screen don't seem as potentially consuming (e.g. a Super Bowl party). That's our arms-length strategy at this point. Seems to have worked quite well so far w.r.t. not getting obsessed with electronics. We'll see what happens when peers start getting phones/tablets though...

  17. Is migrating domains from Gandi to somewhere else easy/possible? I've had the same domain for years, I used to pay like $20 for five years, but then I forgot to renew, and because I let it expire for a few days, now they are charging me $100/year! Sour taste in my mouth about them even before this...
  18. Incredible post. I laughed when I saw the title, snickered at the first paragraph, and then proceeded to be blown away by the rest of it. Thought I was in for a joke and instead I'm thinking about the nature of ML Ops and what it's become.
  19. Training code won't get you much if you don't have the infra/money to gather a suitable dataset or actually execute training. Plus if your goal is to "steal" or riff on the base model, it's already there in the weights. Also probably not difficult to figure out how to fine tune it once you have the weights and tokenizer.
  20. Hopefully so, would really like to know what else is lost by nerfing potentially offensive responses. Can't imagine a project I'd rather work on.

    I think open-assistant.io has a chance to do exactly this. We'll see what kind of moves they make in coming months though, wouldn't be surprised if they go the safer route.

  21. This is the best advice on the topic that I've seen.
  22. Yeah I think iPhone is a very apt analogy: certainly not the first product of its kind, but definitely the first wildly successful one, and definitely the one people will point to as the beginning of the smartphone era. I suspect we'll look back on ChatGPT in a similar light ten years from now.
  23. I'm shocked that I've never heard this argument before. Very interesting and indeed, disturbing as well.
  24. Yeah this is pretty much my line of thinking. Not making any grand claims, just questioning the conventional wisdom. No doubt that the material conditions, health, and conveniences of today are unparalleled in history, that's just not what I'm talking about right now.
  25. Yeah that's pretty much where I end up: what it means to "thrive" is subjective and there's just no fact of the matter about which lifestyle is "best" for humans. Nevertheless, there are some scenarios that I think would be close to universally viewed as worse or better than others. I think the era we're living in now is close to being universally viewed as better than the past. Grass is always greener though I spose
  26. Right -- medical advances are the major exception to this line of thinking, as I noted.
  27. Limitations and methodological questions about OP aside, I really question whether our modern lifestyle and all the technologies that enable it have led to increased human thriving. Medical advances have obviously been huge in making life longer and less painful for the (lucky subsets of the) masses. But setting that aside, is life today (in a rich western country) more "enjoyable" or "fulfilling" now than it was, say, 500 years ago? What about 200 or 100? Or 5000? IDK, but increasingly it feels like a simple life in a small insular community with limited access to information is a better setup for "human thriving" than what we're living in today.

    Obviously life is much "easier" today from the perspective of material conditions, but why assume that this is a "good" thing?

    Or maybe I'm just a delusional Teddy K fanboy who's sick of typing on a computer all day, day after day, year after year.

  28. If you're at Amazon you'll walk away with 5% after one year, and only 20% after two years.
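
    Those figures follow from the backloaded 5/15/40/40 yearly split commonly attributed to Amazon RSU grants (my assumption here -- check your own grant terms). A quick sketch of the cumulative math:

    ```python
    # Cumulative vesting under a hypothetical backloaded 5/15/40/40 schedule.
    yearly_pct = [5, 15, 40, 40]

    cumulative = 0
    for year, pct in enumerate(yearly_pct, start=1):
        cumulative += pct
        print(f"after year {year}: {cumulative}% vested")
    # after year 1: 5% vested
    # after year 2: 20% vested
    # after year 3: 60% vested
    # after year 4: 100% vested
    ```

    Compare that to the more typical 25/25/25/25, where leaving after two years means walking away with half the grant instead of a fifth.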
  29. Interesting perspective, agreed on some of the points.

    I'd like to hear some opinions from people who have worked as both "an engineer" and "a software engineer" (maybe OP is such a person, idk) -- what kind of corner-cutting is there in (non-software) engineering fields? Is it at all comparable to tech debt? What kind of compromises in quality/design are made in the service of profit or career advancement? etc.

    I think a lot of people end up with an impression that in e.g. civil engineering, everything is perfect and precise and elegant because it has to be (otherwise crumbling infrastructure, accidents, etc.). But understanding that humans in general are always looking to cut corners and be lazy, I wonder how realistic that impression really is... Wouldn't be surprised to hear about comical inefficiencies and poor practices that have become normalized over decades of designing/building physical stuff.

