You can easily run models like Mistral and Stable Diffusion in Ollama and Draw Things, and with a little effort you can run newer models like Devstral (the MLX version) and Z Image Turbo using LM Studio and ComfyUI. It isn't as fast as a good Nvidia GPU or a cloud GPU, but it's certainly good enough to play around with and learn from. I've written a bunch of apps that give me a browser UI talking to the API exposed by an app running a model locally, and it works perfectly well. I did that on an 8GB M1 for 18 months and recently upgraded to a 24GB M4 Pro. I still keep the M1 on my network for doing AI things in the background.
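To give a rough idea of what I mean by a browser UI talking to a local API: it's really just HTTP against whatever the model runner exposes. Here's a minimal sketch assuming Ollama's default endpoint on localhost:11434 and a pulled mistral model (the port and model name are just the defaults, nothing special about my setup):

```typescript
// Minimal sketch: ask a locally running Ollama server for a completion.
// Assumes the default endpoint (http://localhost:11434) and that
// `ollama pull mistral` has already been run; adjust model/port as needed.
async function askLocalModel(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "mistral", prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama returned ${res.status}`);
  const data = await res.json();
  return data.response; // the generated text
}

// Wire this to a button handler or whatever your UI uses.
askLocalModel("Explain what MLX is in two sentences.")
  .then(console.log)
  .catch(console.error);
```

The same pattern works with LM Studio, which exposes an OpenAI-compatible endpoint instead.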
Yes, the models it can run don't perform like ChatGPT or Claude 4.5, but they're still very useful.
If you try to get them to compose text, you'll see a lot less variety than you would with ChatGPT, for instance. That said, ask them to analyze a CSV file you don't want to hand to ChatGPT, or ask them to write code, and they're generally competent at it. The high-end Codex/GPT-5.2-type models are smarter, may find better solutions, and may track down bugs more quickly, but the local models are getting better all the time.
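As a concrete example of the CSV case, this is the kind of request I mean (again just a sketch, assuming LM Studio's OpenAI-compatible server on its default port 1234; sales.csv and the model identifier are placeholders):

```typescript
import { readFileSync } from "node:fs";

// Sketch: hand a CSV you'd rather not upload anywhere to a local model.
// Assumes LM Studio's OpenAI-compatible server on its default port (1234);
// "sales.csv" and the model name are placeholders for illustration.
const csv = readFileSync("sales.csv", "utf8");

const res = await fetch("http://localhost:1234/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "devstral", // whatever model LM Studio currently has loaded
    messages: [
      {
        role: "user",
        content: `Here is a CSV file:\n\n${csv}\n\nSummarize the main trends in a few bullet points.`,
      },
    ],
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content);
```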
You need 96GB or 128GB to do non-trivial things. That doesn't come for $749 yet.
From a look at the Apple website, the cheapest MacBook with 64 GB of RAM is the MacBook Pro with the M4 Max and 40-core GPU, which starts at $3,899, i.e. more than five times the price quoted above.
Anyway, I'm on a mission to have no subscriptions in the New Year. Plus it feels wrong to be contributing towards my own irrelevance (GAI).