- I really enjoyed this post and love seeing more lightweight approaches! The deep dive on tradeoffs between different durable-execution approaches was great. For me, the most interesting part is Persistasaurus's (cool name btw) use of bytecode generation via ByteBuddy, which is a clever way to improve DX: it can transparently intercept step functions and capture execution state without requiring explicit API calls.
(Disclosure: I work on DBOS [1].) The author's point about the friction of explicit step wrappers is fair: we don't use bytecode generation today, but we're actively exploring it to improve DX.
- I'm excited about this because durable workflows are really important for making AI applications production-ready :) Disclosure: I'm working on DBOS, a durable workflow library built on Postgres, which looks complementary to this.
I asked their main developer Dillon about the data/durability layer and the compilation step. I wonder if adding a "DBOS World" would be feasible. That way, you'd get Postgres-backed durable workflows, queues, messaging, streams, etc. all in one package, while the "use workflow" interface remains the same.
Here is the response from Dillon, and I hope it's useful for the discussion here:
> "The primary datastore is dynamodb and is designed to scale to support tens of thousands of v0 size tenants running hundreds of thousands of concurrent workflows and steps."
> "That being said, you don't need to use Vercel as a backend to use the workflow SDK - we have created a interface for anyone to implements called 'World' that you can use any tech stack for https://github.com/vercel/workflow/blob/main/packages/world/..."
> "you will require a compiler step as that's what picks up 'use workflow' and 'use step` and applies source transformations. The node.js run time limitations only apply to the outer wrapper function w/ `use workflow`"
- We're seeing issues with multiple AWS services https://health.aws.amazon.com/health/status
- In DBOS, workflows can be invoked directly as normal function calls or enqueued. Direct calls don't require any polling. For queued workflows, each process runs a lightweight polling thread that checks for new work using `SELECT ... FOR UPDATE SKIP LOCKED` with exponential backoff to prevent contention, so many concurrent workers can poll efficiently. We recently wrote a blog post on durable workflows, queues, and optimizations: https://www.dbos.dev/blog/why-postgres-durable-execution
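To make the polling model concrete, here's a minimal sketch of a skip-locked dequeue loop with exponential backoff. This is not DBOS's actual implementation; the table and column names are made up, and it assumes the psycopg2 driver:

```python
import time
import psycopg2  # assumes the psycopg2 Postgres driver is installed

def poll_queue(dsn: str, max_backoff: float = 5.0):
    """Claim enqueued workflows with SKIP LOCKED, backing off when the queue is empty."""
    conn = psycopg2.connect(dsn)
    backoff = 0.1
    while True:
        with conn:  # one transaction per claim attempt
            with conn.cursor() as cur:
                cur.execute(
                    """
                    SELECT workflow_id FROM workflow_queue
                    WHERE status = 'ENQUEUED'
                    ORDER BY created_at
                    LIMIT 1
                    FOR UPDATE SKIP LOCKED
                    """
                )
                row = cur.fetchone()
                if row:
                    cur.execute(
                        "UPDATE workflow_queue SET status = 'PENDING' WHERE workflow_id = %s",
                        (row[0],),
                    )
        if row:
            backoff = 0.1  # found work: reset the backoff
            print(f"claimed workflow {row[0]}")  # hand off to a worker here
        else:
            time.sleep(backoff)  # empty queue: back off exponentially
            backoff = min(backoff * 2, max_backoff)
```

Because `SKIP LOCKED` makes each worker skip rows another worker has already locked, many pollers can run against the same queue table without blocking each other.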
Throughput mainly comes down to database writes: executing a workflow = 2 writes (input + output), each step = 1 write. A single Postgres instance can typically handle thousands of writes per second, and a larger one can handle tens of thousands (or even more, depending on your workload size). If you need more capacity, you can shard your app across multiple Postgres servers.
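As a rough back-of-the-envelope example using those numbers: a workflow with 5 steps costs 7 writes (2 for the workflow plus 1 per step), so a Postgres instance sustaining 10,000 writes per second could in principle checkpoint on the order of 1,400 such workflows per second.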
- Good questions!
DBOS naturally scales to distributed environments, with many processes/servers per application and many applications running together. The key idea is to use the database's concurrency control to coordinate multiple processes. [1]
When a DBOS workflow starts, it's tagged with the version of the application process that launched it. This way, you can safely change workflow code without breaking existing workflows: they'll continue running on the older version. As a result, rolling updates become easy and safe. [2]
[1] https://docs.dbos.dev/architecture#using-dbos-in-a-distribut...
[2] https://docs.dbos.dev/architecture#application-and-workflow-...
- I think one potential concern with "checkpoint execution state at every interaction with the outside world" is the size of the checkpoints. Allowing users to control the granularity by explicitly specifying the scope of each step seems like a more flexible model. For example, you can group multiple external interactions into a single step and only checkpoint the final result, avoiding the overhead of saving intermediate data. If you want finer granularity, you can instead declare each external interaction as its own step.
Plus, if the crash happens in the outside world (where you have no control), then checkpointing at finer granularity won't help.
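Here's a rough sketch of what that choice looks like with step decorators (illustrative endpoints and function names; it assumes the `requests` library and the Python decorator API shown elsewhere in this thread):

```python
import requests
from dbos import DBOS

@DBOS.step()
def enrich_user(user_id: str) -> dict:
    # Coarse granularity: two external calls grouped into one step, so only
    # the combined result is checkpointed, never the intermediate profile.
    profile = requests.get(f"https://api.example.com/users/{user_id}").json()
    score = requests.get(f"https://api.example.com/scores/{user_id}").json()
    return {"profile": profile, "score": score}

@DBOS.step()
def fetch_profile(user_id: str) -> dict:
    # Fine granularity: this single external call is its own step and checkpoint.
    return requests.get(f"https://api.example.com/users/{user_id}").json()
```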
- I think a clearer way to think about this is: "at least once" message delivery plus idempotent workflow execution is effectively exactly-once event processing.
The DBOS workflow execution itself is idempotent (assuming each step is idempotent). When DBOS starts a workflow, the "start" (workflow inputs) is durably logged first. If the app crashes, on restart DBOS reloads from Postgres and resumes from the last completed step. Steps are checkpointed so they don't re-run once recorded.
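A minimal sketch (not DBOS's actual internals) of why checkpointed steps make replay effectively exactly-once: a recorded step is returned from the log instead of being re-executed. Table and column names are made up, and placeholders are psycopg2-style:

```python
import json

def run_step(cur, workflow_id: str, step_id: int, fn, *args):
    # 1. If this step already has a recorded output, return it instead of re-running.
    cur.execute(
        "SELECT output FROM step_outputs WHERE workflow_id = %s AND step_id = %s",
        (workflow_id, step_id),
    )
    row = cur.fetchone()
    if row is not None:
        return json.loads(row[0])
    # 2. Otherwise run the step and durably record its output before moving on.
    result = fn(*args)
    cur.execute(
        "INSERT INTO step_outputs (workflow_id, step_id, output) VALUES (%s, %s, %s)",
        (workflow_id, step_id, json.dumps(result)),
    )
    return result
```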
- That password is only used by the GHA to start a local Postgres Docker container (https://github.com/dbos-inc/dbos-transact-golang/blob/main/c...), which is not accessible from outside.
- I think it was likely caused by the cache trying to compare the tag with Docker Hub: https://docs.docker.com/docker-hub/image-library/mirror/#wha...
> "When a pull is attempted with a tag, the Registry checks the remote to ensure if it has the latest version of the requested content. Otherwise, it fetches and caches the latest content."
So if the authentication service is down, it might also affect the caching service.
- The main advantage is the same architectural benefit DBOS provides in other languages: you only need to deploy your application, so there's no separate coordinator to run. All functionality (checkpointing, durable queues, notification/signaling, etc) is built directly into the Go package on top of the database.
- Those are great questions!
For versioning, we recommend keeping each version running until all workflows on that version are done. It's similar to a blue-green deployment: each process is tagged with one version, and all workflows in it share that version. You can list pending/enqueued workflows on the old version (via the UI or the list_workflow programmatic API), and once that list drains, you can shut down the old processes. DBOS Cloud automates this, and we'll add more guidance for self-hosting.
For bugfixes, DBOS supports programmatic forking and other workflow management tools [1]. We deliberately don't support code patching because it's fragile and hard to test. For example, patches can pile up on long-running workflows and make debugging painful.
The main limit is the database itself (whose size you control). DBOS writes workflow inputs, step outputs, and workflow outputs to it. There's no step limit beyond disk space. Postgres/SQLite allow up to 1 GB per field, but keeping inputs/outputs under ~2 MB helps performance. We'll add clearer guidelines to the docs.
Thanks again for all the thoughtful questions!
[1] https://docs.dbos.dev/python/reference/contexts#fork_workflo...
- Thanks for sharing your insights! You nailed the key tradeoffs of most durable workflow systems. The callback-style programming model is exactly the pain point we aim to solve with DBOS.
Instead of forcing you into a custom async runtime, DBOS lets you keep writing normal functions (this is an example in Python):

```python
@DBOS.workflow()
def do_thing(foo):
    return bar

# You can still call the workflow function like this:
result = do_thing(fooInput)
```

Under the hood, DBOS checkpoints inputs/outputs so it can recover after failure, but you don't have to restructure your code around callbacks. In Python and Java we use decorators/annotations so registration feels natural, while in Go/TypeScript there's a lightweight one-time registration step. Either way, you keep the synchronous call style you'd expect.

On top of that, DBOS also supports running workflows asynchronously or through queues, so you can start with a simple function call and later scale out to async/queued execution without changing your code. That's what the article was leading into.
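For a sense of what that looks like, here's a rough sketch of asynchronous and queued invocation of the same function (going from memory of the Python API's `DBOS.start_workflow` and `Queue` helpers; check the docs for exact signatures, and the queue name is made up):

```python
from dbos import DBOS, Queue

# Start the workflow in the background and fetch its result later.
handle = DBOS.start_workflow(do_thing, fooInput)
result = handle.get_result()

# Or enqueue it, letting any worker polling this queue pick it up durably.
queue = Queue("example_queue")  # hypothetical queue name
handle = queue.enqueue(do_thing, fooInput)
result = handle.get_result()
```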
- I've been building an integration [1] with Pydantic AI and the experience has been great. Questions usually get answered within a few hours, and the team is super responsive and supportive of external contributors. The public API is easy to extend with new functionality (in my case, durable agents).
Its agent model feels similar to OpenAI's: flexible and dynamic without needing to predefine a DAG. Execution is automatically traced and can be exported to Logfire, which makes observability pretty smooth too. Looking forward to their upcoming V1 release.
Shameless plug: I've been working on a DBOS [2] integration for Pydantic AI as a lightweight durable agent solution.
- Yeah, we plan to add more languages. DBOS currently supports Python and TypeScript, and Go and Java will be released soon. We're having a preview of DBOS Java at our user group meeting on August 28: https://lu.ma/8rqv5o5z You're welcome to join us! We'd love to hear your feedback.
We welcome community contributions to the open source repos.
- Managing complex scheduled workflows at scale comes with a lot of nuances. This is exactly why we're building DBOS (shameless plug! https://github.com/dbos-inc), which provides durable cron jobs and exactly-once workflow triggering. Since it's just a library on top of Postgres, it doesn't require a centralized scheduler (well, think of Postgres as the coordinator).
One challenge is guaranteeing exactly-once processing across software upgrades. DBOS uses the cron-scheduled time as an idempotency key and tags each workflow execution with a version. We also use database transactions to guard against conflicting concurrent updates.
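For reference, a minimal sketch of a durable cron job in the Python library (the crontab string and body are illustrative, and this is from memory of the `@DBOS.scheduled` decorator, whose handler receives the scheduled and actual firing times):

```python
from datetime import datetime
from dbos import DBOS

@DBOS.scheduled("*/5 * * * *")  # hypothetical schedule: every 5 minutes
@DBOS.workflow()
def cleanup_job(scheduled_time: datetime, actual_time: datetime):
    # scheduled_time doubles as the idempotency key, so each firing is
    # processed exactly once even across restarts and version upgrades.
    DBOS.logger.info(f"Running cleanup scheduled for {scheduled_time}")
```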
> During replay, your code runs from the beginning but skips over completed checkpoints, using stored results instead of re-executing completed operations. This replay mechanism ensures consistency while enabling long-running executions. > > ... During replay, your code runs from the beginning but skips over completed checkpoints, using stored results instead of re-executing completed operations. This replay mechanism ensures consistency while enabling long-running executions.