raphlinus
Joined 13,295 karma
I do research on fundamental UI technology and 2D graphics, with a focus on Rust and fonts. Currently doing open source work in my personal capacity until the end of the year, then in January I start an exciting new role.

@raph@mastodon.online


  1. My reading is that there aren't really a lot of addressing modes on 286, as there are on 68000 and friends; rather, every address is generated by summing an optional immediate 8- or 16-bit value and from zero to two registers. There aren't modes where you do one memory fetch, then use that as the base address for a second fetch, which is arguably a vaguely RISC-flavored choice. There is a one-cycle penalty for summing 3 elements ("based indexed mode").
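
    A minimal sketch of that address calculation, assuming the usual 16-bit wraparound and a sign-extended 8-bit displacement. The `MemOperand` type and its field names are hypothetical, made up for illustration, and segment handling is left out entirely.

    ```rust
    /// Offset part of a 286 memory operand: an optional displacement plus
    /// zero, one, or two registers. (Hypothetical type, for illustration.)
    #[derive(Clone, Copy, Debug, Default)]
    pub struct MemOperand {
        /// Value of the base register, if one is used.
        pub base: Option<u16>,
        /// Value of the index register, if one is used.
        pub index: Option<u16>,
        /// Displacement, already sign-extended if it was encoded in 8 bits.
        pub disp: Option<i16>,
    }

    impl MemOperand {
        /// Sum whichever terms are present, wrapping at 16 bits.
        pub fn effective_address(&self) -> u16 {
            self.base
                .unwrap_or(0)
                .wrapping_add(self.index.unwrap_or(0))
                // `as u16` reinterprets the bits, so a negative displacement
                // subtracts under wrapping arithmetic.
                .wrapping_add(self.disp.unwrap_or(0) as u16)
        }

        /// Number of terms being summed; 3 is the case with the extra cycle.
        pub fn terms(&self) -> u32 {
            self.base.is_some() as u32
                + self.index.is_some() as u32
                + self.disp.is_some() as u32
        }
    }
    ```

    With base 0x1000, index 0x0020 and displacement -2, this gives 0x101E from three terms. There is no term that is itself loaded from memory, which is the point about the missing indirect modes.
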
  2. Memory safety in particular, actually UB in general (got to watch out for integer overflows, among other things). But one could prove arbitrary properties, including lack of panics (would have been helpful for a recent Cloudflare outage), etc.

    In order to prove lack of UB, you have to be able to reason about other things. For example, to safely call qsort, you have to prove that the comparison is a total order. That's not easy, especially if comparing larger and more complicated structures with pointers.

    And of course, proving the lack of pointer aliasing in C is extremely difficult, even more so if pointer arithmetic is employed.
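
    To make the total order requirement concrete, here is a small sketch in Rust rather than C. It brute-forces the order laws over a finite sample, which is a test and not a proof; `find_order_violation` and `naive_float_cmp` are hypothetical names made up for this example.

    ```rust
    use std::cmp::Ordering;

    /// Brute-force check that `cmp` behaves like a total order on `items`.
    /// Returns the name of the first law found to be violated, if any.
    pub fn find_order_violation<T>(
        items: &[T],
        cmp: impl Fn(&T, &T) -> Ordering,
    ) -> Option<&'static str> {
        for a in items {
            if cmp(a, a) != Ordering::Equal {
                return Some("reflexivity");
            }
            for b in items {
                // Swapping the arguments must flip the result.
                if cmp(a, b) != cmp(b, a).reverse() {
                    return Some("antisymmetry");
                }
                for c in items {
                    // a <= b and b <= c must imply a <= c.
                    if cmp(a, b) != Ordering::Greater
                        && cmp(b, c) != Ordering::Greater
                        && cmp(a, c) == Ordering::Greater
                    {
                        return Some("transitivity");
                    }
                }
            }
        }
        None
    }

    /// A tempting but broken comparator: incomparable floats (NaN) are
    /// treated as equal to everything.
    pub fn naive_float_cmp(a: &f64, b: &f64) -> Ordering {
        a.partial_cmp(b).unwrap_or(Ordering::Equal)
    }
    ```

    On the sample `[1.0, NaN, 2.0]` the naive comparator fails transitivity (2 "equals" NaN, NaN "equals" 1, but 2 > 1), while `f64::total_cmp` passes. A proof would have to establish the same laws for every possible input, not just a sample.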

  3. There's a straightforward answer to the "why not" question: because it will result in codebases with the same kind of memory unsafety and vulnerability as existing C code.

    If an LLM is in fact capable of generating code free of memory safety errors, then it's certainly also capable of writing the Rust types that guarantee this and are checkable. We could go even further and have automated generation of proofs, either in C using tools similar to CompCert, or perhaps something like ATS2. The reason we don't do these at scale is that they're tedious and verbose, and that's presumably something AI can solve.

    Similar points were also made in Martin Kleppmann's recent blog post [1].

    [1]: https://martin.kleppmann.com/2025/12/08/ai-formal-verificati...

  4. That's because the one-instruction variant may read past the end of the array. Let's say s is a single null byte at 0x2000fff, for example (and that memory is only mapped up to 0x2001000); the function as written is fine, but the optimized version may page fault.
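
    The arithmetic behind that example can be sketched as follows, assuming 4 KiB pages and assuming the optimized version reads more than one byte at once. `PAGE_SIZE` and `read_crosses_page` are hypothetical names for illustration.

    ```rust
    /// Page size assumed for this illustration (4 KiB).
    pub const PAGE_SIZE: u64 = 0x1000;

    /// True if reading `len` bytes starting at `addr` touches more than one page.
    pub fn read_crosses_page(addr: u64, len: u64) -> bool {
        if len == 0 {
            return false;
        }
        let first_page = addr / PAGE_SIZE;
        let last_page = (addr + len - 1) / PAGE_SIZE;
        first_page != last_page
    }
    ```

    A 1-byte read at 0x2000fff stays inside its page, but a 2-byte read at the same address also touches 0x2001000, which in the example is not mapped.
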
  5. Unfortunately graphics APIs suck pretty hard when it comes to actually sharing memory between CPU and GPU. A copy is definitely required when using WebGPU, and also on discrete cards (which is what these APIs were originally designed for). It's possible that using native APIs directly would let us avoid copies, but we haven't done that.
  6. It's analogous, but vertex shaders are just triangles, and in 2D graphics you have a lot of other stuff going on.

    The actual process of fine rasterization happens in quads, so there's a simple vertex shader that runs on GPU, sampling from the geometry buffers that are produced on CPU and uploaded.

  7. Thanks for the pointer, we were not actually aware of this, and the claimed benchmark numbers look really impressive.
  8. The output of this renderer is a bitmap, so you have to do an upload to GPU if that's what your environment is. As part of the larger work, we also have Vello Hybrid which does the geometry on CPU but the pixel painting on GPU.

    We have definitely thought about using the CPU renderer while the shaders are being compiled (shader compilation is a problem) but haven't implemented it.

  9. Another deep dive is in https://www.copetti.org/writings/consoles/master-system/

    I've got a mostly-written emulator (in Rust). It's very easy to emulate, possibly the best gameplay bang for the emulator coding effort buck aside from NES. My main intent in writing this emulator is getting it running on an RP2350 board, like Adafruit Fruit Jam or Olimex RP2350pc.

    It should also be possible to get the next generation (SNES, Genesis) on such hardware, but it's a much tighter fit and more effort.

  10. I almost mentioned it in the talk, as an example of a language that's deployed very successfully and expresses parallelism at scale. Ultimately I didn't, as the core of what I'm talking about is control over dynamic allocation and scheduling, and that's not the strength of VHDL.
  11. Right. This is the binary tree version of the algorithm, and is nice and concise, very readable. What would take it to the next level for me is the version in the stack monoid paper, which chunks things up into workgroups. I haven't done benchmarks against the Pareas version (unfortunately it's not that easy), but I would expect the workgroup optimized version to be quite a bit faster.
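
    As a rough CPU-side illustration of the two reduction shapes, here is a count-only simplification for bracket matching. It is an assumption-laden sketch, not the formulation from the paper or from Pareas: it tracks only counts of unmatched brackets, not the stack contents, and all names are hypothetical.

    ```rust
    /// Count-only summary of a run of brackets: unmatched closes on the
    /// left, unmatched opens on the right.
    #[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
    pub struct Unmatched {
        pub closes: u32,
        pub opens: u32,
    }

    impl Unmatched {
        pub fn from_byte(b: u8) -> Self {
            match b {
                b'(' => Unmatched { closes: 0, opens: 1 },
                b')' => Unmatched { closes: 1, opens: 0 },
                _ => Unmatched::default(),
            }
        }

        /// Associative combine: opens on the left cancel closes on the right.
        pub fn combine(self, rhs: Self) -> Self {
            let matched = self.opens.min(rhs.closes);
            Unmatched {
                closes: self.closes + rhs.closes - matched,
                opens: self.opens + rhs.opens - matched,
            }
        }
    }

    /// Binary-tree shape: split in half and combine recursively.
    pub fn reduce_tree(s: &[u8]) -> Unmatched {
        match s.len() {
            0 => Unmatched::default(),
            1 => Unmatched::from_byte(s[0]),
            n => {
                let (left, right) = s.split_at(n / 2);
                reduce_tree(left).combine(reduce_tree(right))
            }
        }
    }

    /// Chunked shape: fold each chunk sequentially, then combine the chunk
    /// summaries. `chunk` must be nonzero.
    pub fn reduce_chunked(s: &[u8], chunk: usize) -> Unmatched {
        s.chunks(chunk)
            .map(|c| {
                c.iter().fold(Unmatched::default(), |acc, &b| {
                    acc.combine(Unmatched::from_byte(b))
                })
            })
            .fold(Unmatched::default(), Unmatched::combine)
    }
    ```

    Because the combine is associative, both shapes give the same answer; the chunked one just does more of the work in sequential runs, which is loosely the role a workgroup plays on GPU.
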
  12. Yes, sorry about that. We had tech issues, and did the best we could with the audio that was captured.
  13. It's not strictly x86 either; the other case you care about is fp16 support on ARM. But that is included in the M1 target, so it really only matters on other ARM chips.
  14. I'm extremely curious what those basic methods are. We're in the process of replacing the higher order rootfinding in kurbo with a new solver based on Yuksel's method[1]. If you know of simpler, faster techniques, that would be quite interesting.

    [1]: https://crates.io/crates/polycool

  15. I have very high hopes for this board, and have been playing with RP2350 with DVI out for a while (I have one of these on order and it hasn't arrived yet, but other boards[1] exist).

    Emulation is a sweet spot because if you race the beam, there is no compositor latency. Basically every retro computer with less than a quarter meg of VRAM is fair game (whether a framebuffer or not).

    I have a bit of time off this fall and intend to do some fun things.

    [1]: https://github.com/DusterTheFirst/pico-dvi-rs/wiki/RP2350-DV...
