mccoyb
783 karma
5th year MIT PhD student. I work on programming languages and systems. Currently on parental leave. https://github.com/femtomc

  1. No, you weren't clear, nor are you correct: you shared FUD about something you do not seem to have tried -- testing your claims with a recent agentic system would dispel them.

    I've had great success teaching Claude Code to use DSLs I've created in my research. Trivially, it has never seen exactly these DSLs before -- yet it has correctly created complex programs using those DSLs, and indeed -- they work!

    Have you had frontier agents work on programs in "esoteric" (unpopular) languages (pick: Zig, Haskell, Lisp, Elixir, etc)?

    I don't see clarity, and I'm not sure if you've tried any of your claims for real.

  2. If by "just breaks" you mean "refuses to write code / gives up or reverts what it does" -- yes, I've experienced that.

    Experiencing that repeatedly motivated me to use it as a reviewer (as another commenter noted), a role it is (in my experience) very good at.

    I basically use it to drive Claude Code, which will nuke the codebase with abandon.

  3. I'm happy to pay the same for less right now (on the Max plan, or whatever) -- because I never run into limits, and I'm running these models nearly all day, every day (as a single user working on my own personal projects).

    I consistently run into limits with Claude Code (Opus 4.5) -- yet even though Codex seems to spend significantly more tokens, its quota limit just seems much higher?

  4. If anyone from OpenAI is reading this -- a plea to not screw with the reasoning capabilities!

    Codex is so, so good at finding bugs and little inconsistencies, it's astounding to me. Where Claude Code is good at "raw coding", Codex/GPT-5.x are unbeatable at careful, methodical discovery of "problems" (be it in code, or in math).

    Yes, it takes longer (quality, not speed please!) -- but the things that it finds consistently astound me.

  5. If you're maxing out the plans across the platforms, that's $600 -- but if you think about your usage and optimize, I'm guessing somewhere between $200 and $600 per month.
  6. Claude Code was a big jump for me. Another large-ish jump was going multi-agent and following the tips from Anthropic’s long-running harnesses post.

    I don’t go into Claude without everything already set up. Codex helps me curate the plan and the issue tracker (one instance). Claude gets a command to fire up into context, grab an issue, and implement it; then Codex and Gemini review independently.

    I’ve instructed Claude to go back and forth for as many rounds as it takes. Then I close the session (\new) and do it again. These are all the latest frontier models.

    This is incredibly expensive, but it's also the most reliable method I've found to get high-quality progress -- I suspect it has something to do with ameliorating self-bias and improving the diversity of viewpoints on the code.

    I suspect rigorous static tooling is yet another layer that improves the distribution over program changes. But there is already a big gap in folk knowledge between "vanilla agents" and something fancy built with just raw agents, and I'm not sure whether adding more rigorous static tooling (beyond the compiler) closes it.
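
    A minimal sketch of that loop, assuming the non-interactive entry points of each CLI (claude -p, codex exec, gemini -p); the prompts, the issue text, and the "approved" convention are hypothetical placeholders, not a fixed protocol:

      # Hypothetical orchestration of the plan -> implement -> cross-review loop.
      # Assumes the `claude`, `codex`, and `gemini` CLIs are installed, and that
      # each of these invocations runs a single non-interactive turn.
      function run_issue(issue::String; max_rounds::Int = 5)
          # One Codex instance curates the plan for this issue.
          plan = read(`codex exec "Write an implementation plan for: $issue"`, String)
          for round in 1:max_rounds
              # Claude implements (or revises) against the plan plus prior reviews.
              read(`claude -p "Implement this plan, addressing any review comments: $plan"`, String)
              # Two independent reviewers, to dilute any single model's self-bias.
              r1 = read(`codex exec "Review the latest diff for correctness"`, String)
              r2 = read(`gemini -p "Review the latest diff for correctness"`, String)
              # "approved" is a made-up convention for this sketch.
              if occursin("approved", lowercase(r1)) && occursin("approved", lowercase(r2))
                  return round
              end
              plan *= "\n\nReviewer feedback:\n" * r1 * "\n" * r2
          end
          return max_rounds
      end

      run_issue("Fix the flaky CI test")  # hypothetical issue text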

  7. Nada is the best! Don't forget the mind-bending https://dl.acm.org/doi/10.1145/3158140 (not quite on topic, but in the multi-stage rabbit hole)
  8. That's only part of the reason that this type of content is used in academic papers. The other part is that you never know which PhD student / postdoc / researcher will be reviewing your paper, which means you are incentivized to be liberal with citations (however tangential) just in case someone reading your paper has the reaction "why didn't they cite this work, in which I had some role?"

    Papers with a fake air of authority are easily dispatched. What is not so easily dispatched is the politics of the submission process.

    This type of content is fundamentally about emotions (in the reviewer of your paper), and emotion is undeniably a large factor in acceptance / rejection.

  9. Abstract interpretation is also at the heart of Julia’s type inference algorithm (amongst other analyses that Julia performs).

    A very useful framework, both practically and theoretically!
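
    To give a flavor of what that looks like, here is a toy abstract interpreter in Julia that evaluates expression trees over type tags instead of values. The two-element join and the Expr-walking are my own simplifications for illustration, not Julia's actual inference lattice or algorithm:

      # Toy abstract interpretation over types: evaluate an expression tree in
      # the abstract domain {Int64, Float64, Number} rather than over values.
      # Mirrors the idea behind Julia's type inference, not its implementation.

      # Join two abstract elements: least common supertype in our tiny lattice.
      join_types(a::Type, b::Type) = a === b ? a : Number

      # Literals map to their type; `+` joins its arguments' abstract types;
      # `ifelse` joins both branches, since we don't know which one is taken.
      abstract_eval(x::Int64) = Int64
      abstract_eval(x::Float64) = Float64
      function abstract_eval(e::Expr)
          if e.head === :call && e.args[1] === :+
              return join_types(abstract_eval(e.args[2]), abstract_eval(e.args[3]))
          elseif e.head === :call && e.args[1] === :ifelse
              return join_types(abstract_eval(e.args[3]), abstract_eval(e.args[4]))
          end
          error("unsupported expression")
      end

      abstract_eval(:(1 + 2))                 # Int64
      abstract_eval(:(ifelse(true, 1, 2.0)))  # Number -- branches disagree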

  10. Boggles the mind.
  11. There is no better time than now to try something brash and perpendicular to the mainstream.
  12. This is excellent: thank you for pursuing these wonderful ideas.
  13. Nope, I did this today to try and see if it would work.

    It does not.

  14. I see -- but does this allow me to use the models within "Antigravity" with the same subscription?

    I poked around and couldn't figure this out.

  15. Got it -- thanks both.
  16. I'm asking about Gemini, not Copilot.
  17. I see -- so this is the "paid" AI studio plan?

    Does that have any relation to the Gemini plan thing: https://one.google.com/explore-plan/gemini-advanced?utm_sour... ?

  18. I truly do not understand what plan to use so I can use this model for longer than ~2 minutes.

    Using Anthropic's or OpenAI's models is incredibly straightforward -- pay us per month, here's the button you press, great.

    Where do I go for this for these Google models?

  19. > Bajillions of dollars invested in the development of some of the most powerful computational artifacts to date.

    > Fork VS Code, add a few workflow / management ideas on top.

    > "Agentic development platform"

    I'm Jack's depressed lack of surprise.

    Please someone, make me feel something with software again.

  20. I used Julia for 4 years. I'm not a moron: I'm familiar with how it works, I've written several packages in it, including some speculative compiler ones.

    You claimed:

    > Allowing invocation of the compiler at runtime is definitely not something that is done for performance, but for dynamism, to allow some code to run that could not otherwise be run.

    I asked:

    > why not just compile a static but generic version of the method with branches based on the tags of values? ("Can't figure out the types, wait until runtime and then just branch to the specialized method instances which I do know the types for")

    Which can be done completely ahead of time, before runtime, and doesn't rely on re-invoking the compiler, thereby making this whole "ahead of time compilation only works for a subset of Julia code" problem disappear.

    Do you understand now?

    My original comment:

    > The problem (which the author didn't focus on, but which I believe to be the case) that Julia willingly hoisted on itself in the pursuit of maximum performance is _invoking the compiler at runtime_ to specialize methods when type information is finally known.

    is NOT a claim about the overall architecture of Julia -- it's a point about one specific problem: Julia's static ahead-of-time compilation, which is currently highly limited.
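
    To make that concrete, here is roughly the shape I was proposing, written by hand as a toy (this illustrates the dispatch strategy, not anything Julia's compiler actually emits):

      # Specialized method instances whose types ARE known ahead of time.
      step(x::Int64) = x + 1
      step(x::Float64) = x + 0.5

      # A static but generic fallback: instead of invoking the compiler at
      # runtime to specialize `step` for a newly observed type, branch on the
      # value's type tag and jump to a precompiled instance.
      function step_generic(x)
          if x isa Int64
              return step(x::Int64)    # branch to the known specialization
          elseif x isa Float64
              return step(x::Float64)
          else
              error("no precompiled instance for $(typeof(x))")
          end
      end

      # Everything above can be emitted fully ahead of time: no runtime codegen.
      step_generic(3)      # 4
      step_generic(3.0)    # 3.5

    The cost is a slow or failing path for type tags you didn't anticipate, but the compiler never has to run after the program starts.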
