Comment by mynti - Hacker Neue

mynti Oct 23, 2025 parent

Super interesting blogpost. I just wonder how this is actually different to LORA, since LORA also adds some parameters and freezes the rest of the model. This seems like a sparse, memory efficient LORA with a couple of extra steps, since it uses attention again to make the sparsity work. All while making it a lot more effective compared to LORA (performance drop of only 11% compared to 71%).

sva_ Oct 29, 2025

> LORA

I think you meant LoRA (not to be confused with LoRa)

This item has no comments currently.

Preferences

Keyboard Shortcuts

Story Lists

Navigation

Miscellaneous