mccoyb
783 karma
5th year MIT PhD student. I work on programming languages and systems. Currently on parental leave. https://github.com/femtomc

  1. No, you weren't clear, nor are you correct: you shared FUD about something you do not seem to have tried -- testing your claims with a recent agentic system would dispel them.

    I've had great success teaching Claude Code to use DSLs I've created in my research. Trivially, it has never seen exactly these DSLs before -- yet it has correctly created complex programs using those DSLs, and indeed -- they work!

    Have you had frontier agents work on programs in "esoteric" (unpopular) languages (pick: Zig, Haskell, Lisp, Elixir, etc)?

    I don't see clarity, and I'm not sure if you've tried any of your claims for real.

  2. If by "just breaks" you mean "refuses to write code / gives up or reverts what it does" -- yes, I've experienced that.

    Experiencing that repeatedly motivated me to use it as a reviewer (as another commenter noted), a role it is (in my experience) very good at.

    I basically use it to drive Claude Code, which will nuke the codebase with abandon.

  3. I'm happy to pay the same for less right now (on the Max plan, or whatever) -- because I never run into limits, and I'm running these models nearly all day, every day (as a single user working on my own personal projects).

    I consistently run into limits with Claude Code (Opus 4.5) -- yet even though Codex seems to spend significantly more tokens, its quota limit just seems much higher?

  4. If anyone from OpenAI is reading this -- a plea to not screw with the reasoning capabilities!

    Codex is so, so good at finding bugs and little inconsistencies, it's astounding to me. Where Claude Code is good at "raw coding", Codex/GPT-5.x are unbeatable at careful, methodical discovery of "problems" (be it in code, or in math).

    Yes, it takes longer (quality, not speed please!) -- but the things that it finds consistently astound me.

  5. If you're maxing out the plans across the platforms, that's $600 -- but if you think about your usage and optimize, I'm guessing somewhere between $200 and $600 per month.
  6. Claude Code was a big jump for me. Another large-ish jump was going multi-agent and following the tips from Anthropic’s long-running harnesses post.

    I don’t go into Claude without everything already set up. Codex helps me curate the plan and the issue tracker (one instance). Claude gets a command to fire up into context, grab an issue, and implement it; then Codex and Gemini review independently.

    I’ve instructed Claude to go back and forth for as many rounds as it takes. Then I close the session (\new) and do it again. These are all the latest frontier models.

    This is incredibly expensive, but it's also the most reliable method I've found to get high-quality progress -- I suspect it has something to do with ameliorating self-bias and improving the diversity of viewpoints on the code.

    I suspect rigorous static tooling is yet another layer that improves the distribution over program changes. But there is already a big gap in folk knowledge between "vanilla agents" and something fancy built with just raw agents, and I'm not sure whether adding more rigorous static tooling (beyond the compiler) closes it.
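
    A minimal sketch of that loop, assuming the non-interactive entry points of each CLI (claude -p, codex exec, gemini -p); the prompts, the issue text, and the "approved" convention are hypothetical placeholders, not a fixed protocol:

      # Hypothetical orchestration of the plan -> implement -> cross-review loop.
      # Assumes the `claude`, `codex`, and `gemini` CLIs are installed, and that
      # each of these invocations runs a single non-interactive turn.
      function run_issue(issue::String; max_rounds::Int = 5)
          # One Codex instance curates the plan for this issue.
          plan = read(`codex exec "Write an implementation plan for: $issue"`, String)
          for round in 1:max_rounds
              # Claude implements (or revises) against the plan plus prior reviews.
              read(`claude -p "Implement this plan, addressing any review comments: $plan"`, String)
              # Two independent reviewers, to dilute any single model's self-bias.
              r1 = read(`codex exec "Review the latest diff for correctness"`, String)
              r2 = read(`gemini -p "Review the latest diff for correctness"`, String)
              # "approved" is a made-up convention for this sketch.
              if occursin("approved", lowercase(r1)) && occursin("approved", lowercase(r2))
                  return round
              end
              plan *= "\n\nReviewer feedback:\n" * r1 * "\n" * r2
          end
          return max_rounds
      end

      run_issue("Fix the flaky CI test")  # hypothetical issue text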

  7. Nada is the best! Don't forget the mind-bending https://dl.acm.org/doi/10.1145/3158140 (not quite on topic, but in the multi-stage rabbit hole)
  8. That's only part of the reason that this type of content is used in academic papers. The other part is that you never know which PhD student / postdoc / researcher will be reviewing your paper, which means you are incentivized to be liberal with citations (however tangential) just in case someone reading your paper has the reaction "why didn't they cite this work, in which I had some role?"

    Papers with a fake air of authority are easily dispatched. What is not so easily dispatched is the politics of the submission process.

    This type of content is fundamentally about emotions (in the reviewer of your paper), and emotion is undeniably a large factor in acceptance / rejection.

  9. Abstract interpretation is also at the heart of Julia’s type inference algorithm (amongst other analyses that Julia performs).

    A very useful framework, both practically and theoretically!
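
    To give a flavor of what that looks like, here is a toy abstract interpreter in Julia that evaluates expression trees over type tags instead of values. The two-element join and the Expr-walking are my own simplifications for illustration, not Julia's actual inference lattice or algorithm:

      # Toy abstract interpretation over types: evaluate an expression tree in
      # the abstract domain {Int64, Float64, Number} rather than over values.
      # Mirrors the idea behind Julia's type inference, not its implementation.

      # Join two abstract elements: least common supertype in our tiny lattice.
      join_types(a::Type, b::Type) = a === b ? a : Number

      # Literals map to their type; `+` joins its arguments' abstract types;
      # `ifelse` joins both branches, since we don't know which one is taken.
      abstract_eval(x::Int64) = Int64
      abstract_eval(x::Float64) = Float64
      function abstract_eval(e::Expr)
          if e.head === :call && e.args[1] === :+
              return join_types(abstract_eval(e.args[2]), abstract_eval(e.args[3]))
          elseif e.head === :call && e.args[1] === :ifelse
              return join_types(abstract_eval(e.args[3]), abstract_eval(e.args[4]))
          end
          error("unsupported expression")
      end

      abstract_eval(:(1 + 2))                 # Int64
      abstract_eval(:(ifelse(true, 1, 2.0)))  # Number -- branches disagree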

  10. Boggles the mind.
  11. There is no better time than now to try something brash and perpendicular to the mainstream.
  12. This is excellent: thank you for pursuing these wonderful ideas.
  13. Nope, I did this today to try and see if it would work.

    It does not.

  14. I see -- but does this allow me to use the models within "Antigravity" with the same subscription?

    I poked around and couldn't figure this out.

  15. Got it -- thanks both.
  16. I'm asking about Gemini, not Copilot.
  17. I see -- so this is the "paid" AI studio plan?

    Does that have any relation to the Gemini plan thing: https://one.google.com/explore-plan/gemini-advanced?utm_sour... ?

  18. I truly do not understand what plan to use so I can use this model for longer than ~2 minutes.

    Using Anthropic's or OpenAI's models is incredibly straightforward -- pay us per month, here's the button you press, great.

    Where do I go for this for these Google models?

  19. > Bajillions of dollars invested in the development of some of the most powerful computational artifacts to date.

    > Fork VS Code, add a few workflow / management ideas on top.

    > "Agentic development platform"

    I'm Jack's depressed lack of surprise.

    Please someone, make me feel something with software again.

  20. I used Julia for 4 years. I'm not a moron: I'm familiar with how it works, I've written several packages in it, including some speculative compiler ones.

    You claimed:

    > Allowing invocation of the compiler at runtime is definitely not something that is done for performance, but for dynamism, to allow some code to run that could not otherwise be run.

    I asked:

    > why not just compile a static but generic version of the method with branches based on the tags of values? ("Can't figure out the types, wait until runtime and then just branch to the specialized method instances which I do know the types for")

    Which can be done completely ahead of time, before runtime, and doesn't rely on re-invoking the compiler, thereby making this whole "ahead of time compilation only works for a subset of Julia code" problem disappear.

    Do you understand now?

    My original comment:

    > The problem (which the author didn't focus on, but which I believe to be the case) that Julia willingly hoisted on itself in the pursuit of maximum performance is _invoking the compiler at runtime_ to specialize methods when type information is finally known.

    is NOT a claim about the overall architecture of Julia -- it's a point about one specific problem: Julia's static ahead-of-time compilation, which is currently highly limited.
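
    To make that concrete, here is roughly the shape I was proposing, written by hand as a toy (this illustrates the dispatch strategy, not anything Julia's compiler actually emits):

      # Specialized method instances whose types ARE known ahead of time.
      step(x::Int64) = x + 1
      step(x::Float64) = x + 0.5

      # A static but generic fallback: instead of invoking the compiler at
      # runtime to specialize `step` for a newly observed type, branch on the
      # value's type tag and jump to a precompiled instance.
      function step_generic(x)
          if x isa Int64
              return step(x::Int64)    # branch to the known specialization
          elseif x isa Float64
              return step(x::Float64)
          else
              error("no precompiled instance for $(typeof(x))")
          end
      end

      # Everything above can be emitted fully ahead of time: no runtime codegen.
      step_generic(3)      # 4
      step_generic(3.0)    # 3.5

    The cost is a slow or failing path for type tags you didn't anticipate, but the compiler never has to run after the program starts.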
