Note that while UMA is great in the sense that it allows these models to be run at all, M-series chips aren't faster[1] when the model fits in VRAM.
The problem is you're limited to 24 GB of VRAM unless you pay through the nose for datacenter GPUs, whereas you can get an M-series chip with 128 GB or 192 GB of unified memory.
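For a sense of scale, here's a rough back-of-the-envelope sketch of weight memory alone (the model sizes and bit widths are just illustrative, and KV cache plus activations add more on top):

```python
# Rough weight-memory estimate: parameter count times bytes per weight.
# Weights only; ignores KV cache, activations, and runtime overhead.
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (7, 70, 120):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit ~= {weight_memory_gb(params, bits):.0f} GB")
```

Even a 70B model at 4-bit is around 35 GB of weights, which already overflows a 24 GB card but sits comfortably in 128 GB of unified memory.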
Sure! The point is that they're not magic chips a million times faster that will make NVIDIA go bankrupt tomorrow. That's all. A laptop with up to 128 GB of "VRAM" is a great option, absolutely no doubt about that.
They are powerful, but I agree with you: it's nice to be able to run Goliath locally, yet it's a lot slower than my 4070.
That's OpenCL compute; LLMs ideally should be hitting the neural accelerator, not running on generalized GPU compute shaders.
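If it helps, here's a minimal sketch of what "hitting the neural accelerator" looks like from Python via coremltools (the model path is hypothetical, and whether layers actually land on the ANE is up to Core ML's scheduler and per-op support):

```python
import coremltools as ct

# Load an existing Core ML model package and ask the runtime to schedule it
# on the Apple Neural Engine (plus CPU) rather than generic GPU compute.
# "MyLLM.mlpackage" is a placeholder path, not a real model.
model = ct.models.MLModel(
    "MyLLM.mlpackage",
    compute_units=ct.ComputeUnit.CPU_AND_NE,  # use ct.ComputeUnit.ALL to also allow the GPU
)
```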