Today we want to share our brand new Python library providing ultra-lightweight durable execution.
https://github.com/dbos-inc/dbos-transact-py
Durable execution means your program is resilient to any failure. If it is ever interrupted or crashes, all your workflows will automatically resume from the last completed step. If you want to see durable execution in action, check out this demo app:
https://demo-widget-store.cloud.dbos.dev/
Or if you’re like me and want to skip straight to the Python decorators in action, here’s the demo app’s backend – an online store with reliability and correctness in just 200 LOC:
https://github.com/dbos-inc/dbos-demo-apps/blob/main/python/...
Don't want to keep reading and just want to try it out? Launch it here:
https://console.dbos.dev/launch
No matter how many times you try to crash it, it always resumes from exactly where it left off! And yes, that button really does crash the app.
Under the hood, this works by storing your program's execution state (which workflows are currently executing and which steps they've completed) in a Postgres database. So all you need to use it is a Postgres database to connect to—there's no need for a "workflow server." This approach is also very fast: for example, 25x faster than AWS Step Functions.
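To give a sense of what that looks like in code, here's a minimal sketch of a durable workflow using the library's decorators. It's based on the quickstart as I understand it; the step and workflow names are invented for illustration, and database configuration is omitted (see the docs for setup details):

```python
from dbos import DBOS

DBOS()  # initialize the library; database connection config comes from your app setup

@DBOS.step()
def reserve_inventory():
    # Each completed step is checkpointed in Postgres.
    print("Inventory reserved")

@DBOS.step()
def charge_customer():
    print("Customer charged")

@DBOS.workflow()
def checkout_workflow():
    reserve_inventory()
    # If the process crashes here, a restart resumes the workflow at charge_customer()
    # without re-running reserve_inventory().
    charge_customer()

if __name__ == "__main__":
    DBOS.launch()
    checkout_workflow()
```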
Some more cool features include:
* Scheduled jobs—run your workflows exactly-once per time interval, no more need for cron (there's a short sketch after this list).
* Exactly-once event processing—use workflows to process incoming events (for example, from a Kafka topic) exactly-once. No more need for complex code to avoid repeated processing.
* Observability—all workflows automatically emit OpenTelemetry traces.
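As a quick illustration of scheduled jobs, a cron-style workflow might look roughly like this (decorator name and signature are as I recall them from the docs, so treat this as a sketch and double-check there):

```python
from datetime import datetime
from dbos import DBOS

# Runs exactly once per minute, using crontab syntax; DBOS passes in the scheduled
# and actual execution times.
@DBOS.scheduled("* * * * *")
@DBOS.workflow()
def run_every_minute(scheduled_time: datetime, actual_time: datetime):
    print(f"Scheduled for {scheduled_time}, started at {actual_time}")
```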
Docs: https://docs.dbos.dev/
Examples: https://docs.dbos.dev/examples
We also have a webinar on Thursday where we'll walk through the new library; you can sign up here: https://www.dbos.dev/webcast/dbos-transact-python
We'd love to hear what you think! We’ll be in the comments for the rest of the day to answer any questions you may have.
Ask me anything!
I do a lot of consulting on Kafka-related architectures and really like the concept of DBOS.
Customers tend to hit a wall of complexity when they want to actually use their streaming data (as distinct from simply piping it into a DWH). Being able to delegate a lot of that complexity to the lower layers is very appealing.
Would DBOS align with / complement these types of Kafka streaming pipelines or are you addressing a different need?
In fact, one of our first customers used DBOS to build an event processing pipeline from Kafka. They hit the "wall of complexity" you described trying to persist events from Kafka to multiple backend data stores and services. DBOS made it much simpler because they could just write (and serverlessly deploy) durable workflows that ran exactly-once per Kafka message.
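For anyone curious what that pattern looks like, here's a rough sketch of exactly-once processing per Kafka message. It assumes the SetWorkflowID context manager from the DBOS docs and uses confluent-kafka for the consumer loop; the step functions (store_order, notify_fulfillment) are hypothetical:

```python
from confluent_kafka import Consumer
from dbos import DBOS, SetWorkflowID

# DBOS() / DBOS.launch() setup omitted for brevity; see the quickstart sketch above.

@DBOS.step()
def store_order(value: str):
    ...  # persist the event to a backend data store

@DBOS.step()
def notify_fulfillment(value: str):
    ...  # call a downstream service

@DBOS.workflow()
def process_message(value: str):
    # If the process crashes mid-workflow, recovery resumes from the last completed step.
    store_order(value)
    notify_fulfillment(value)

consumer = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "orders"})
consumer.subscribe(["orders"])

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    # Derive the workflow's idempotency key from the message coordinates, so a
    # redelivered message maps to the same workflow and is not processed twice.
    key = f"{msg.topic()}-{msg.partition()}-{msg.offset()}"
    with SetWorkflowID(key):
        process_message(msg.value().decode())
```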
How would you compare DBOS with that?
There isn't a lot of public information about how they are built, but from what I can tell you're right -- their architecture is more oriented toward small projects.
It looks like they store the entire JS heap in a SQLite database. We store schematized state checkpoints in a Postgres-compatible database, which lets us scale up and enables interesting things like querying previous states and time-travel debugging, where you can actually step through previously run workflows.
From what I can tell, the programming model seems to be pretty similar but DBOS doesn't require a centralized workflow server, just serverless functions?
Great question! Yeah, the biggest difference is that DBOS doesn't require a centralized workflow server; instead, it does all orchestration directly in your functions (through the decorators), storing your program's execution state in Postgres. The implications are:
1. Performance. A state transition in DBOS requires only a database write (~1 ms), whereas in Temporal it requires a roundtrip and dispatch from the workflow servers (tens of ms -- https://community.temporal.io/t/low-latency-stateless-workfl...).
2. Simplicity. All you need to run DBOS is Postgres. You can run locally or serverlessly deploy your app to our hosted cloud offering.
Definitely a fan of what these types of systems can do in replaying/recovering and retrying steps, etc., as well as centralizing a lot of different workloads onto a common execution engine.
More info here: https://docs.dbos.dev/explanations/how-workflows-work
https://github.com/electric-sql/pglite
And how the heck are you maintaining TypeScript and Python copies? lol
https://docs.dbos.dev/cloud-tutorials/timetravel-debugging
One neat thing about starting a child workflow is you can assign an idempotency ID, which can be intentionally calculated such that multiple parents will only start one run of the child workflow.
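A rough sketch of that pattern (again assuming the SetWorkflowID context manager; the workflow names and key derivation are just for illustration): any parent that computes the same key gets the same single run of the child.

```python
import hashlib
from dbos import DBOS, SetWorkflowID

@DBOS.workflow()
def child_workflow(order_id: str):
    ...  # work that should run at most once per order

@DBOS.workflow()
def parent_workflow(order_id: str):
    # Every parent derives the same idempotency ID for a given order, so the first
    # invocation starts the child and later invocations are deduplicated against it.
    key = hashlib.sha256(f"fulfill-order-{order_id}".encode()).hexdigest()
    with SetWorkflowID(key):
        child_workflow(order_id)
```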
Then the DBOS cloud platform optimizes those interactions between the database and code so that you get a superior experience to running locally.
We know of users running Puppeteer to scrape data.