Comment by flakiness - Hacker Neue

flakiness Jun 19, 2025 parent

This project is from CMU. Hazy Research at Stanford talked about the megakernel too: https://hazyresearch.stanford.edu/blog/2025-05-27-no-bubbles

Good to see the competition in this area.

(Edited): Related paper covering the larger "mirage" project, but this doesn't cover the "megakernel" approach: https://arxiv.org/abs/2405.05751

zhihaojia Jun 19, 2025

This is the writer of the blog post. You are right that Stanford's work is a parallel effort. The main difference is that our focus is on compilation: making it easier to generate megakernels automatically.

zhihaojia Jun 20, 2025

Oops, I missed one sentence in my previous response. Stanford's MegaKernel project tackles a similar challenge but focuses on manual CUDA implementation. MPK instead takes a compiler-driven approach: users express their LLMs at the PyTorch level, and MPK automatically compiles them into optimized megakernels. Our goal is to make programming megakernels much more accessible.

zekrioca Jun 20, 2025

And their focus is..?

sigbottle Jun 19, 2025

Hazy Research also has ThunderKittens, a pretty cool library. There seems to be a lot of effort going into formalizing, pipelining, and dividing and conquering within the current NVIDIA GPU model to maximize GPU efficiency, and into writing compilers/DSLs for it.
