- If I may make a suggestion, many problems folks face with MCP would be solved if their agents were JIT compiled, not run in a static while loop.
We've been developing this in case folks are interested: https://github.com/stanford-mast/a1
- A system that does the following given a task_description:
    while LLM("is <task_description> not done?"):
        Browser.run(LLM("what should the browser do next given <Browser.get_state()>"))
This simple loop turns out to be very powerful, achieving the highest performance on some of OpenAI's latest benchmarks. But it's also heavily unoptimized compared to a system that is just LLM("<task_description>"), for which we already have engines like vllm. BLAST is a first step towards optimizing this while loop.
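A minimal runnable sketch of that loop in Python; `llm` and `browser` here are stand-ins for whatever model and browser driver get plugged in, not BLAST's actual internals:

    # Hypothetical stand-ins: `llm(prompt) -> str`, `browser.get_state() -> str`,
    # and `browser.run(action)`; not BLAST's actual internals.
    def run_task(task_description: str, llm, browser, max_steps: int = 50):
        for _ in range(max_steps):  # cap iterations so the loop always terminates
            done = llm(f"Answer yes/no: is this task done? {task_description}")
            if done.strip().lower().startswith("yes"):
                return
            # Ask the LLM for the next action given the current browser state
            action = llm(
                f"Task: {task_description}\n"
                f"Browser state: {browser.get_state()}\n"
                "What should the browser do next?"
            )
            browser.run(action)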
- Do you build agents that interface with web browsers? BLAST is sort of like vllm for browser+LLM. The motivation is that browser+LLM is slow, and an engine that manages browser+LLM together can do a lot of optimization: prefix caching, auto-parallelism, data parallelism, request hedging, scheduling policy, and more coming soon.
Now, the API is what may be throwing folks off. Right now it's an OpenAI-compatible API; we will also implement MCP. But really the core thing is abstracting away the optimizations required to efficiently run browser+LLM.
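To make the OpenAI-compatible part concrete, submitting a browsing task looks like an ordinary chat completion; the base_url, port, and model name below are illustrative assumptions about a local deployment, not guaranteed defaults:

    from openai import OpenAI

    # Point the standard OpenAI client at a locally running BLAST server.
    # The base_url and model name are assumptions for illustration.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Find the cheapest SFO->JFK flight next Friday"}],
    )
    print(resp.choices[0].message.content)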
- I would really think about it as a serving engine like vllm, but for browsers+LLMs. It handles caching, parallelism, scheduling, and budget constraints on LLM cost and browser memory usage. Yes, it currently has an OpenAI-compatible API, but we will also implement MCP (though we're working on something that will be way better than "MCP for web browsers").
- Ah, you're right, my bad. Hope I didn't sound dismissive, because I think some sort of robots.txt needs to exist for AI that's scraping the web at both train and test time.
I'm really not excited at all about the "scrape other people's data" use case for BLAST, and if we can prevent it, then awesome. I'm excited about BLAST automating science, legacy web apps, internal tools, adding AI automation to your own app, etc.
- Thank you! It's currently based on task lineage, exact match of task descriptions, and an optional user-provided cache_control argument that can control whether results or plans are cached.
One use case for this is conversations. For example, if I invoke /chat/completions with [{"role": "user", "content": "Go to google.com"}] and later with [{"role": "user", "content": "Go to google.com"}, {"role": "user", "content": "Search for gorilla vs 100 human"}], then we cache the browser state from the first invocation so it can be quickly restored (or we reuse the browser itself if it hasn't been evicted).
Caching will get much more sophisticated in a future version; it's the piece we're most actively working on.
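A client-side sketch of those two invocations; the cache_control value and the extra_body plumbing shown here are assumptions for illustration, not documented BLAST behavior:

    # Assumes `client` is an OpenAI client pointed at a running BLAST server.
    # First call: BLAST opens a browser, navigates, and caches the state.
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Go to google.com"}],
    )

    # Second call: the shared message prefix lets BLAST restore (or directly
    # reuse) the browser from the first call instead of starting over.
    # `cache_control` is the optional argument mentioned above; the value and
    # the extra_body plumbing are assumptions for illustration.
    client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "Go to google.com"},
            {"role": "user", "content": "Search for gorilla vs 100 human"},
        ],
        extra_body={"cache_control": "cache-plans-and-results"},
    )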
- IMO it depends on how this tech is deployed. One way I see this being extremely useful is for developers to quickly build AI automation for their own sites.
E.g. if I'm the developer of a workforce management app (say, https://WhenIWork.com), I could deploy BLAST to quickly provide automation for users of my app.
- There are definitely opportunities to parallelize. BLAST exploits these with an LLM planner and tool calls to dynamically spawn/join subtasks (there's also data parallelism and request hedging, which further reduce latency).
Now, you are right that at some point you'll get throttled, either by LLM rate limits or by a set budget for browser memory usage or LLM cost. BLAST's scheduler is aware of these constraints and uses them to effectively map tasks to resources (resource = browser+LLM).
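For intuition, the spawn/join pattern looks roughly like the sketch below; the function names and the decomposition step are illustrative, not BLAST's actual planner API:

    import asyncio

    # Illustrative spawn/join: a planner LLM decomposes the task, subtasks run
    # in parallel (e.g. one browser per website), and results are joined.
    # `plan_llm` is a sync callable; `run_subtask` is an async coroutine function.
    async def run_parallel(task: str, plan_llm, run_subtask):
        subtasks = plan_llm(f"Split into independent subtasks: {task}")
        results = await asyncio.gather(*(run_subtask(s) for s in subtasks))
        return plan_llm(f"Combine these results for '{task}': {results}")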
- Great point. We are working on an MCP server implementation, which should address this. The main benefit of having a serving engine here is abstracting away browser+LLM-specific optimizations like parallelism, caching, and browser memory management. It's closer to vllm, but I agree an MCP server implementation will make integration easier.
Though ultimately I think the web needs something better than MCP and we're actively working on that as well.
- Yes! And browser-use is great, though I'm hoping at some point we can swap it out for something leaner; maybe one day it'll just be a vision language model. All we'd have to do within BLAST is implement a new Executor, and all the scheduling/planning/resource management stays the same.
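To illustrate that seam, a hypothetical Executor interface might look like this; it's a guess for illustration, not BLAST's actual class hierarchy:

    from abc import ABC, abstractmethod

    # Hypothetical Executor seam; not BLAST's actual interface.
    class Executor(ABC):
        @abstractmethod
        async def run(self, task_description: str) -> str:
            """Complete the task against a live browser and return the result."""

    class BrowserUseExecutor(Executor):
        async def run(self, task_description: str) -> str:
            ...  # today: delegate to browser-use

    class VLMExecutor(Executor):
        async def run(self, task_description: str) -> str:
            ...  # one day: a single vision-language model drives the browser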
- Maybe more of a legal than an ethical consideration, but web-browsing AI makes scraping trivial. You could use it for surveillance, for profiling (getting a full picture of a user's whole online life before they even hit Sign Up), or for cutting egress costs in certain cases. Right now CAPTCHAs are actually holding up pretty well against web-browsing AI on sites that really want to protect their IP, but it will be interesting to see whether that devolves into yet another AI-vs-AI "arms race".
- The main sort of parallelism we exploit is across distinct websites: for example, "find me the cheapest rental" spawns tasks to look at many different rental sites. There is another level of parallelism that could be exploited within a single website/app, and yes, we would have to make our planner rate-limit aware for that.
Absolutely agree there are ethical considerations with web-browsing AI in general (and with the broader ongoing shift from using websites to using chatgpt/perplexity).
Let us know what you think, thanks!