mcqueenjordan (Jordan McQueen) · Tokyo, Japan · 1,412 karma

site: https://jm.dev
email: j+hn@jm.dev
Twitter (eng): @jmq_en

follow:

security: tptacek, moxie, nickpsecurity, strcat, agl, SwellJoe, drewcrawford, schoen, mirimir, secfirstmd, mjg59, userbinator, gorhill, rgovostes, lallysingh, malandrew, mikewest, jedberg, wtarreau, michaelaiello, segmondy, whitequark_, jsnell, salgernon, geofft, ericb

startup: pg, sama, garry, mbesto, coffeemug, davidu, rms, xal, malgorithms, Alex3917, jacquesm, jl, AndrewWarner, emmett, ig1, anateus, lpolovets, ivankirigin, sahillavingia, Joshua, ericflo, immad, rdl, joshfraser, gdb, grellas, gatsby, mayop100, bryanh, josephsunny, ayw, pbiggar, sytse, csallen, joshu, jhuckesteinm, pclark, whockey, sjtgraham, jenthoven, aresant, cmdrtaco

code: aphyr, KiranDave, peterwaller, saosebastiao, amirmc, lhorie, judofyr, nikita, huhtenberg, pron, Animats, dmbaggett, chris_wot, aaronbrethorst, grey-area, drewg123, dom96, jashkenas, rich_harris, jordwalke, Homunculiheaded, mark_l_watson, kibwen, Sir_Cmpwn, KenoFischer, ahoyhere, scrollaway, trishume, dbaupp, skrebbel, cryptica, peterhunt, rauchg, TazeTSchnitzel

systems: bcantrill, brendangregg, DannyBee, steveklabnik, fsk, pcwalton, jbk, ajross, yosefk, netguy, minimax, munificent, ColinWright, beat, zwischenzug, derefr, jandrewrogers, shykes, lallysingh, dochtman, SamReidHughes, hnkimb3558, rurban

mods: dang

linux: rwmj, pdkl95

oth: phkahler, davidw, antirez, gwern, patio11, jgrahamc, darksaints, jamwt, nostrademons, plinkplonk, mikekchar, holman, mikeash, edw519, jrockway, noonespecial, staunch, petercooper, jmathai, tzs, jacques_chester, coldtea, peteretep, happy-go-lucky, aaronbrethorst, mtgx

os: vezzy-fnord, rbehrends, vardump, amirmc, pjmlp, rsync, waddlesplash

db: craigkerstiens, teraflop, ifcologne

graphics: pcolton

net: zx2c4, keithwinstein, bsder, walrus01

aws: colmmacc, _msw_, aligouri, illumin8, jcrites, socttlegrand2, openasocket, otterley, mslot

ai: jph00


  1. I guess to follow up slightly more:

    - I think the "if you use another model" rebuttal is becoming the No True Scotsman of the LLM world. We can get concrete and discuss a specific model if need be.

    - If the use case is "generate this function body for me", I agree that that's a pretty good use case. I've specifically seen problematic behavior in the other ways I see it /often/ used: "write this feature for me", or trying to one-shot too much functionality, where the LLM gets to touch data structures, abstractions, interface boundaries, etc.

    - To analogize it to writing: They shouldn't/cannot write the whole book, they shouldn't/cannot write the table of contents, they cannot write a chapter, IMO even a paragraph is too much -- but if you write the first sentence and the last sentence of a paragraph, I think the interpolation can be a pretty reasonable starting point. Bringing it back to code for me means: function bodies are OK. Everything else gets questionable fast IME.
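
    To make "function bodies are OK" concrete, here's a minimal sketch (the names and the toy task are hypothetical): you pin the interface -- signature, docstring, and a test, the first and last sentence of the paragraph -- and let the model interpolate only the body.

      def dedupe_preserving_order(items):
          """Return items with duplicates removed, keeping first occurrences."""
          # The body is the part an LLM can fill with low blast radius: from
          # here it can't touch data structures or interface boundaries.
          seen = set()
          out = []
          for item in items:
              if item not in seen:
                  seen.add(item)
                  out.append(item)
          return out

      # The human-written "last sentence": a check that pins the behavior.
      assert dedupe_preserving_order([3, 1, 3, 2, 1]) == [3, 1, 2]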

  2. As usual with Oxide's RFDs, I found myself vigorously head-nodding while reading. More rarely, I found a part I disagreed with:

    > Unlike prose, however (which really should be handed in a polished form to an LLM to maximize the LLM’s efficacy), LLMs can be quite effective writing code de novo.

    Don't the same arguments against using LLMs to write one's prose also apply to code? Was the structure of the code, and the ideas within it, the engineer's? Or the LLM's? And so on.

    Before I'm misunderstood as an LLM minimalist: I think they're incredibly good at solving blank-page syndrome -- just getting a starting point on the page is useful. But the code you actually want to ship is so far from what LLMs write that I think of them more as a crutch for blank-page syndrome than as "good at writing code de novo".

    I'm open to being wrong and want to hear any discussion on the matter. My worry is that this is another one of the "illusion of progress" traps, similar to the one that currently fools people with the prose side of things.

  3. The jankiness of the original had a lot of charm -- the janky voice and the slightly desync'd audio and animation almost sold the dystopian absurdity of trying to deploy a service. I don't think it's just nostalgia, because I felt the same way watching it the first time all those years ago.

    I think AI slop is decidedly different, because it just doesn't have the charm. I don't know if I can yet decompose exactly why that is.

  4. Presumably cargo clippy --fix was the intention. Not everything is auto-fixable, though, and that's where LLMs are reasonable: the squishy, hard-to-autofix things.
  5. If you want to do great work, that usually happens in environments with minimized politics.

    It's probably bad career advice to completely avoid politics (most places aren't doing great work) but it depends on what you're optimizing for.

    The problem with everyone getting into the political game is that then we have everyone talking and no one building.

  6. One of my favorite LLM uses is to feed it this essay, then ask it to assume the persona of the grug-brained developer and comment on $ISSUE_IM_CURRENTLY_DEALING_WITH. Good stress relief.
  7. I haven't read all the comments and I'm sure someone else made a similar point, but my first thought was to flip the direction of the statement: "Waymo rides cost more than Uber or Lyft /because/ people are willing to pay more".
  8. Usually if you’re using it, it’s because you’re forced to.

    In my experience, the best strategy is to minimize your use of it — call out to binaries or shell scripts and minimize your dependence on any of the GHA world. Makes it easier to test locally too.

  9. This is a silly extreme case, but it's kind of an absurd example of what happens when you live a life devoid of the principle of charity[1].

    I think tons of interpersonal engineering issues boil down to a failure to apply this principle.

    [1]: https://en.wikipedia.org/wiki/Principle_of_charity

  10. Yeah, I more or less agree about the closed-loop part and the broader point the article was making in this context -- that it may be a useful use case. I think it's likely that the process lets a lot of horseshit through, but that might still be better than nothing for semgrep rules.

    I only came down hard on that quote out of context because it felt somewhat standalone and I want to broadcast this “fluency paradox” point a bit louder because I keep running into people who really need to hear it.

    I know you know what’s up.

  11. > But I just checked and, unsurprisingly, 4o seems to do reasonably well at generating Semgrep rules? Like: I have no idea if this rule is actually any good. But it looks like a Semgrep rule?

    This is the thing with LLMs. When you’re not an expert, the output always looks incredible.

    It's similar to the fluency paradox -- if you're not native in a language, anyone you hear speak it at a higher level than yourself appears fluent to you, even if they're actually just a beginner.

    The problem with LLMs is that they’re very good at appearing to speak “a language” at a higher level than you, even if they totally aren’t.

  12. It’s just not that big of a mystery. It’s not an excuse; it’s just true. Also, they’re not especially selling reliability as much as they’re selling small geo-distributed deployments.
  13. Reliability is hard when your volume is (presumably) scaling geometrically.
  14. Based on the ol' joke about outfitting custom planes, "If you want to do anything to a plane... /anything/..., it's 250k. New coffee machine? 250k. Rotate the sofa? 250k." -- $149,072 for a soap dispenser might well be a screaming deal.
  15. The topic came up again and maybe this has been changing lately. I downgrade my above comment. I still think that it got popular in the U.S. first and then propagated back to Japan but ¯\_(ツ)_/¯.
  16. Most Japanese people do not use this term, and I'm fairly certain most Japanese people don't even really know the word. This is one of those "Big in Japan" things, except, uh, "Big outside Japan".

    Source: live in Japan, have asked Japanese people around me if they know about this concept (that is popular in the USA). Usually hear: へ〜、全然知らない。 ("Huh, I've never heard of it.")

  17. I think it's a mix of:

    1. Queues are actually used a lot, esp. at high scale, and you just don't hear about it.

    2. Hardware/compute advances are outpacing user growth (1 billion users was unicorn scale 10 years ago, and it still is today), so serving (for the sake of argument) 100 million users on a single large box is much more plausible now than it was 10 years ago. (These numbers are made up; keep the proportions and adjust as you see fit.)

    3. Given (2), if you can get away with stuffing your queue into e.g. Redis or an RDBMS, you probably should. It simplifies deployment and architecture, centralizes queries across systems, etc. However, depending on your requirements for scale, reliability, and failure (in)dependence, it may not be advisable. I think this is also correlated with a broader understanding that (1) if you can get away with out-of-order task processing, you should, (2) architectural simplicity was underrated in the 2010s industry-wide, and (3) YAGNI.

  18. S3 is not optimized to serve websites directly, but to durably store and retrieve ~unlimited data.
  19. I prefer AuthN and AuthZ.

    I don't think sharing a prefix/root implies that they're the same thing.

    Also, I don't think the suggested "permissions" and "login" terminology would work for all AuthN/Z schemes. For example, when exactly do you "login" when calling an API with a bearer token? Doesn't work for me.
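
    A minimal sketch of how that split can look in code (the token store, principal, and permission names are all made up for illustration):

      import hmac

      TOKENS = {"tok_abc123": "user_42"}          # bearer token -> principal
      PERMISSIONS = {("user_42", "orders:read")}  # (principal, permission)

      def authenticate(bearer_token):
          # AuthN: who is calling? With a bearer token there's no "login"
          # step -- just verification on every request.
          for token, principal in TOKENS.items():
              if hmac.compare_digest(token, bearer_token):
                  return principal
          return None

      def authorize(principal, permission):
          # AuthZ: is this already-identified caller allowed to do this?
          return (principal, permission) in PERMISSIONS

      def handle_request(bearer_token):
          principal = authenticate(bearer_token)
          if principal is None:
              return 401  # AuthN failure
          if not authorize(principal, "orders:read"):
              return 403  # AuthZ failure
          return 200

    Keeping the two concepts separate is also what makes the 401 vs. 403 distinction fall out naturally.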

  20. Yeah, you can avoid holding the xact using the means you mentioned, e.g. SKIP LOCKED plus setting some value to PROCESSING, then doing your processing, then updating to DONE at the end. Or, as you mentioned, timestamps.

    I think the SKIP LOCKED part is really only useful to avoid contention between two workers querying for new work simultaneously.
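
    For concreteness, a sketch of that claim step with psycopg2, against a hypothetical jobs table (say, CREATE TABLE jobs (id bigserial PRIMARY KEY, state text, payload text)):

      import psycopg2

      CLAIM = """
          UPDATE jobs
             SET state = 'PROCESSING'
           WHERE id = (SELECT id FROM jobs
                        WHERE state = 'ready'
                        ORDER BY id
                        LIMIT 1
                          FOR UPDATE SKIP LOCKED)
          RETURNING id, payload
      """

      conn = psycopg2.connect("dbname=app")  # connection string is made up

      # The claiming transaction stays short: lock one row, flip it to
      # PROCESSING, commit. The slow work happens outside any transaction.
      with conn, conn.cursor() as cur:
          cur.execute(CLAIM)
          job = cur.fetchone()  # None if there's no ready work

      if job is not None:
          job_id, payload = job
          # ... do the actual processing here ...
          with conn, conn.cursor() as cur:
              cur.execute("UPDATE jobs SET state = 'DONE' WHERE id = %s",
                          (job_id,))

    Without SKIP LOCKED, two workers running that inner SELECT at the same time would block on each other's row lock; with it, the second worker just skips ahead to the next ready row.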

  21. I personally think polling the queue/table via queries is a very sensible pattern and not something I have a desire to remove. In theory, you could go at it via a push approach by wiring into the WAL or something, but that comes with its own rat's nest of issues.
  22. Got it, thanks for the reply. The feature-flag cache reload use case seems like a reasonable one to me.
  23. Other than the space for past notifications and/or having to issue a DELETE, are there significant reasons to prefer this over the typical table-based approach with SKIP LOCKED queries to poll the queue?

    It seems to me that if the listener dies, notifications in the meantime will be dropped until a listener resubscribes, right? That seems prone to data loss.

    In the SKIP LOCKED topic-poller-style pattern (for example, query a table for rows with state = 'ready' on some interval and use SKIP LOCKED), you can have arbitrary readers, and if they all die, inserts into the table still go through and the backlog can later be processed.

  24. And just to add a small clarification, since I had to double take: this isn't exactly-once delivery (which isn't possible), it's exactly-once processing. But even exactly-once processing generally has issues, so it's better to design for at-least-once processing and try to make everything within your processing ~idempotent.
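
    One common way to get that ~idempotence is a dedup table keyed by message id; here's a sketch against a hypothetical processed table (CREATE TABLE processed (msg_id text PRIMARY KEY)):

      import psycopg2

      def process_once(conn, msg_id, handler):
          # At-least-once delivery means duplicates will arrive: record the
          # message id first, and skip work that's already been done.
          with conn, conn.cursor() as cur:
              cur.execute(
                  "INSERT INTO processed (msg_id) VALUES (%s) "
                  "ON CONFLICT (msg_id) DO NOTHING",
                  (msg_id,),
              )
              if cur.rowcount == 0:
                  return  # duplicate delivery; someone already handled it
              handler(cur)  # same xact: dedup row + effects commit atomically

    This only amounts to exactly-once effects when the handler writes to the same database; side effects elsewhere (emails, external API calls) still need their own idempotence story.
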
  25. I find the posturing as a thought leader and industry leader (on this topic especially) a bit ironic. A cloud provider licensing ARM Neoverse and throwing an ARM chip into their cloud compute boxes is not exactly a novel business practice.

    I'm happy to see this, and it should be all goodness, but... the posturing... I don't want to be negative for the sake of being negative, but I don't understand how anyone can write that first paragraph with a straight face and publish it when you're announcing ARM chips for cloud in 2024(?, maybe 2025?).

  26. Exploring the files > app concept deeper: it would be interesting if we could foster a culture of web apps writing to local storage as files (in a similar manner to Obsidian), with a common format for doing so and an open-source daemon that sync'd writes to and from that directory to e.g. some other folder. That would unlock ownership of data even in web apps. The daemon could be app-agnostic and just dutifully sync all the things.
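
    A minimal sketch of what that daemon could look like, stdlib-only and with made-up directory names (a real one would watch filesystem events rather than poll):

      import shutil
      import time
      from pathlib import Path

      # The web app writes its "local storage as files" into SRC; the daemon
      # dutifully mirrors everything into DST (some user-owned folder).
      SRC, DST = Path("app-data"), Path("synced")

      def sync_once():
          for src_file in SRC.rglob("*"):
              if not src_file.is_file():
                  continue
              dst_file = DST / src_file.relative_to(SRC)
              # Copy only new or modified files so each pass stays cheap.
              if (not dst_file.exists()
                      or src_file.stat().st_mtime > dst_file.stat().st_mtime):
                  dst_file.parent.mkdir(parents=True, exist_ok=True)
                  shutil.copy2(src_file, dst_file)

      while True:
          sync_once()
          time.sleep(5)
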
  27. We can’t have LLMs giving footguns to our children. ;)
