- TCO is less of an optimization (optimizations are typically best-effort on the part of the compiler) and more of an actual semantic change that expands the set of valid programs. It's like a new control flow construct that lives alongside `while` loops.
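A sketch of what that semantic change means in practice. `become` is Rust's reserved keyword for explicit tail calls, still hypothetical/unstable here; the names are illustrative:

```rust
fn count_down(n: u64) {
    if n == 0 {
        return;
    }
    // With guaranteed TCO this would be `become count_down(n - 1)` and be
    // *required* to run in constant stack, making the program valid at any
    // depth. As plain recursion, whether it works depends on the optimizer
    // (a debug build can overflow the stack).
    count_down(n - 1)
}

fn main() {
    count_down(1_000_000);
}
```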
- Giving special treatment to code that "explicitly wants" to handle unwinding means two things:
* You have to know when an API can unwind, and you have to make it an error to unwind when the caller isn't expecting it. If this is done statically, you are getting into effect annotation territory. If this is done dynamically, you are essentially just injecting drop bombs into code that doesn't expect unwinding (see the sketch after this list). Either way, you are multiplying complexity for generic code. (Not to mention you have to invent a whole new set of idioms for panic-free code.)
* You still have to be able to clean up the resources held by a caller that does expect unwinding. So all your vocabulary/glue/library code (the stuff that can't just assume panic=abort) still needs these "scoped panic hooks" in all the same places it has any level of panic awareness in Drop today.
So for anyone to actually benefit from this, they would have to be writing panic-free code with whatever new static or dynamic tools come with it, and their code would have to be narrowly scoped and purpose-specific enough that it could already afford panic=abort today. Who is this even for?
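A minimal sketch of a drop bomb in the sense used above (`NoUnwind` and `defuse` are illustrative names, not a real API): a guard that turns an unexpected unwind into an abort. Dynamic "no-unwind" enforcement amounts to scattering these through code that never opted into panics.

```rust
struct NoUnwind;

impl NoUnwind {
    fn defuse(self) {
        // Normal path: skip the Drop check entirely.
        std::mem::forget(self);
    }
}

impl Drop for NoUnwind {
    fn drop(&mut self) {
        // If we got here by unwinding rather than via `defuse`,
        // escalate the panic to an abort.
        if std::thread::panicking() {
            std::process::abort();
        }
    }
}

fn must_not_unwind() {
    let bomb = NoUnwind;
    // ... work that is supposed to be panic-free ...
    bomb.defuse();
}
```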
- > Suppose, instead, we had a mechanism that allowed registering arbitrary panic hooks, and unregistering them when no longer needed, in any order. Then, we could do RAII-style resource handling: you could have a `CursesTerminal` type, which is responsible for cleaning up the terminal, and it cleans up the terminal on `Drop` and on panic. To do the latter, it would register a panic hook, and deregister that hook on `Drop`.
This doesn't get rid of unwinding at all; it's an inefficient reimplementation of it. There's a reason language implementations have switched away from having the main execution path register and unregister destructors and finally blocks, to storing them in a side table and recovering them at the time of the throw.
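For illustration, here is what the quoted `CursesTerminal` already looks like with plain `Drop` (a sketch with the cleanup bodies elided): `Drop` runs on both normal scope exit and unwinding, precisely because unwinding recovers the live destructors from side tables at throw time instead of making the hot path maintain a registry.

```rust
struct CursesTerminal;

impl CursesTerminal {
    fn new() -> Self {
        // ... put the terminal into curses mode ...
        CursesTerminal
    }
}

impl Drop for CursesTerminal {
    fn drop(&mut self) {
        // Runs on normal scope exit *and* during a panic's unwind,
        // with no register/unregister traffic on the happy path.
        // ... restore the terminal state ...
    }
}

fn main() {
    let _term = CursesTerminal::new();
    // ... use the terminal; cleanup happens on return or panic ...
}
```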
- There are also currently the unstable `rustc_layout_scalar_valid_range_start` and `rustc_layout_scalar_valid_range_end` attributes (used in the definition of `NonNull`, etc.), which could be used for some bit patterns.
And there are aspirations to use pattern types for this sort of thing: https://github.com/rust-lang/rust/issues/135996
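A sketch of the attribute in action. It is a perma-unstable rustc internal, so this only builds on a nightly toolchain; `MyNonZero` is an illustrative stand-in for how core defines its `NonZero` types:

```rust
#![feature(rustc_attrs)]

// Tells layout that 0 is not a valid bit pattern for this type...
#[rustc_layout_scalar_valid_range_start(1)]
struct MyNonZero(usize);

fn main() {
    // ...so `Option` can use the forbidden pattern as its `None` niche:
    assert_eq!(
        std::mem::size_of::<Option<MyNonZero>>(),
        std::mem::size_of::<usize>()
    );
}
```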
- Hypothetically Rust could make `Mutex<InnerBlah>` work with just two bits in the same way it makes `Option<&T>` the same size as `&T`. Annotate `InnerBlah` with the information about which bits are available and let `Mutex` use them.
- You don't need any of that, and you can keep cancellation too.
The core of an eager cooperative multitasking system does not even need the concept of an executor. You can spawn a new task by giving it some stack space and running its body to its first suspension point, right there on the current thread. When it suspends, the leaf API (e.g. `lock`) grabs the current top of the stack and stashes it somewhere; when it's time to resume, it just runs the next part of the task right there on the current thread.
You can build different kinds of schedulers on top of this first-class ability to resume a particular leaf call in a task. For example, a `lock` integrated with a particular scheduler might queue up the resume somewhere instead of invoking it immediately. Or, a generic `lock` might be wrapped with an adapter that re-suspends and queues that up. None of this requires that the language know anything about the scheduler at all.
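A sketch of the leaf-API shape this describes. `suspend` and `ResumeHandle` are hypothetical primitives (nothing like this exists in std); the point is only to show where the resume handle goes:

```rust
use std::cell::RefCell;

// Hypothetical: resumes one specific suspension point of one task.
struct ResumeHandle;

// Hypothetical primitive: capture the current suspension point, hand it
// to the closure, and return to whoever spawned or resumed this task.
fn suspend(stash: impl FnOnce(ResumeHandle)) {
    let _ = stash;
    unimplemented!("stands in for compiler/runtime support")
}

struct Lock {
    waiters: RefCell<Vec<ResumeHandle>>,
}

impl Lock {
    fn lock(&self) {
        // Fast path elided; on contention, park this task:
        suspend(|resume| {
            // `unlock` can invoke this handle directly, or a scheduler
            // integration can queue it up somewhere instead.
            self.waiters.borrow_mut().push(resume);
        });
        // Execution continues right here once the handle is invoked.
    }
}
```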
This is all typical of how higher level languages implement both stackful and stackless coroutines. The difference is that we want control over the "give it some stack space" part: we want the compiler to compute a maximum size and have us specify where to store it, whether that's on the heap (e.g. `tokio::spawn`) or nested in some other task's stack (e.g. `join`, `select`) or some statically-allocated storage (e.g. on a microcontroller).
(Of course the question then becomes, how do you ensure `lock` can't resume the task after it's been freed, either due to normal resumption or cancellation? Rust answers this with `Waker`, but this conflates the unit of stack ownership with the unit of scheduling, and in the process enables intermediate futures to route a given wakeup incorrectly. These must be decoupled so that `lock` can hold onto both the overall stack and the exact leaf suspension point it will eventually resume.)
Cancellation doesn't change much here. Given a task held from the "caller end" (as opposed to the leaf callee resume handles above), the language needs to provide a way to tear down the stack and let the decoupled `Waker` mechanism respond. This still propagates naturally to nested tasks like join/select arms, though there is now an additional wrinkle that a nested task may be actively running (and may even be the thing that indirectly provoked the cancellation).
- > in principle the exact same optimization could be done for stackful coroutines.
Yes, I totally agree, and this is sort of what I imagine a better design would look like.
> One of the reasons Rust does it the way it currently does is because the implementation avoids requiring support from, e.g., LLVM
This, I would argue, is simply a failure of imagination. All you need from the LLVM layer is tail calls, and then you can manage the stack layout yourself in essentially the same way Rust manages `Future` layout.
You don't even need arbitrary tail calls. The compiler can limit itself to the sorts of things LLVM asks for (specific calling convention, matching function signatures, etc.) when transferring control between tasks, because it can store most of the state in the stack that it laid out itself.
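Since explicit tail calls aren't available in stable Rust, here is the restricted form emulated with a trampoline (illustrative names throughout). The point is just that every control transfer uses one fixed signature, with all coroutine state living in an explicitly laid-out frame:

```rust
struct Frame {
    counter: u32,
}

// One fixed signature for every handler; a newtype, because Rust type
// aliases can't be recursive.
struct State(fn(&mut Frame) -> Option<State>);

fn step_a(frame: &mut Frame) -> Option<State> {
    frame.counter += 1;
    // With real tail calls this would be `become step_b(frame)`.
    Some(State(step_b))
}

fn step_b(frame: &mut Frame) -> Option<State> {
    if frame.counter < 3 { Some(State(step_a)) } else { None }
}

fn main() {
    let mut frame = Frame { counter: 0 };
    let mut state = State(step_a);
    // The trampoline stands in for the LLVM-level transfer; each step is
    // a single indirect call through the one shared signature.
    while let Some(next) = (state.0)(&mut frame) {
        state = next;
    }
    assert_eq!(frame.counter, 3);
}
```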
- "Not inert" does not at all imply "a single runtime within std+compiler." You've jumped way too far in the opposite direction there.
The problem is that the particular interface Rust chose for controlling dispatch is not granular enough. When you are doing your own dispatch, you only get access to separate tasks, but for individual futures you are at the mercy of combinators like `select!` or `FuturesUnordered` that only have a narrow view of the system.
A better design would continue to avoid heap allocations and allow you to do your own dispatch, but operate in terms of individual suspended leaf futures. Combinators like `join!`/`select!`/etc. would be implemented more like they are in thread-based systems, waiting for sub-tasks to complete, rather than being responsible for driving them.
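The thread-based analogue, for concreteness: `std::thread`'s `join` is pure synchronization. The OS scheduler runs the children, and `join` only waits. The argument is that async combinators could sit at this same level of abstraction:

```rust
use std::thread;

fn main() {
    let a = thread::spawn(|| 1 + 1);
    let b = thread::spawn(|| 2 + 2);
    // Unlike `join!` polling futures, `join` doesn't drive its children;
    // it just blocks until the already-running sub-tasks complete.
    let (x, y) = (a.join().unwrap(), b.join().unwrap());
    assert_eq!((x, y), (2, 4));
}
```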
- This one is relevant because it avoids heap allocation while running the iterator and for loop body concurrently. Which is exactly the kind of thing that `async` does.
- The requirement is that the futures are not separate heap allocations, not that they are inert.
It's not at all obvious that Rust's is the only possible design that would work here. I strongly suspect it is not.
In fact, early Rust did some experimentation with exactly the sort of stack layout tricks you would need to approach this differently. For example, see Graydon's post here about the original implementation of iterators as lightweight coroutines: https://old.reddit.com/r/ProgrammingLanguages/comments/141qm...
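A sketch of that style in today's Rust (illustrative only, not the old syntax): the producer owns the loop and invokes the consumer's body as a closure, so both run interleaved on one stack with no heap allocation.

```rust
// "Internal iteration": the for-loop body becomes a closure the
// producer calls at each "yield" point.
fn pairs(limit: u32, mut body: impl FnMut(u32, u32)) {
    for a in 0..limit {
        for b in 0..a {
            body(a, b); // "yield": run the loop body right here
        }
    }
}

fn main() {
    let mut count = 0;
    pairs(4, |a, b| {
        println!("({a}, {b})");
        count += 1;
    });
    assert_eq!(count, 6);
}
```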
- If that is what profiles were actually doing, it would probably make sense. But it's not what profiles are doing.
Instead, for example, the lifetime safety profile (https://github.com/isocpp/CppCoreGuidelines/blob/master/docs...) is a Rust-like compile time borrow checker that relies on annotations like [[clang::lifetimebound]], yet they also repeatedly insist that profiles will not require this kind of annotation (see the papers linked from https://www.circle-lang.org/draft-profiles.html#abstract).
Their messaging is just not consistent with the concrete proposals they have described, let alone actually implemented.
- There is work coming from the "academic pedantism" sphere for exploiting single-resumability. For example: https://dl.acm.org/doi/pdf/10.1145/3632896
- Yes, that's how it should work. It is not how it works in today's rustc.
- > I'd be fine writing `.into()` or `.trunc()`
Yes, this is specifically what I'm disagreeing with.
> I fully expect that such methods will be inlined, likely even in debug mode (e.g. `#[inline(always)]`), and compile down to the same minimal instructions.
That's the cost to compile time I mentioned.
- A method call like `.trunc()` is still going to be drastically less ergonomic than `as`. It relies on inference or turbofish to pick a type, and it has all the syntactic noise of a function call on top of that.
Not to mention this sort of proliferation of micro-calls for what should be <= 1 instruction has a cost to debug performance and/or compile times (though this is something that should be fixed regardless).
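To make the comparison concrete (`.trunc()` here is the hypothetical method from this thread, not a real std API; `as` and `u8::try_from` are real):

```rust
fn main() {
    let x: u64 = 300;

    let a = x as u8;         // today: one keyword, target type right there
    let b = u8::try_from(x); // method form: still needs the explicit type
    // let c = x.trunc::<u8>(); // hypothetical: inference or turbofish

    assert_eq!(a, 44); // truncation: 300 mod 256
    assert!(b.is_err());
}
```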
- It is context-free, just ambiguous.
- I've been using https://messages.google.com to get something like the desktop iMessage experience with Android. Does that work for your use case? (I don't use iMessage, so I could just be missing some killer feature it has, or something.)
- The tokenizer is not really a good demonstration of the differences between these styles. A more representative comparison would be the later stages that build, traverse, and manipulate tree and graph data structures.
- "Hand-rolled assembly" was one item in a list that also included DoD. You're reading way more into that sentence than they wrote- the claim is that DoD itself also impacts the maintainability of the codebase.
- Isn't stack overflow made safe via guard pages and probes (on sufficiently high-tier target platforms)? That is, you should get a guaranteed error (even if that error is a segfault), not memory corruption.
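A quick way to see that behavior (on targets where rustc emits stack probes and the runtime sets up guard pages): this program deliberately blows the stack, and should die with a clean stack-overflow error rather than corrupting adjacent memory.

```rust
#[allow(unconditional_recursion)]
fn recurse(depth: usize) -> usize {
    // A large local makes each frame big enough that the probes matter.
    let buf = std::hint::black_box([0u8; 1 << 12]);
    1 + recurse(depth + buf[0] as usize)
}

fn main() {
    // Expected to abort with: thread 'main' has overflowed its stack
    println!("{}", recurse(0));
}
```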