What does the GPD Win 4 do in this scenario? Is there a step w/ Agent Organizer that decides if a task can go to a smaller model on the Win 4 vs a larger model on your Mac?
What sort of tokens/sec are you getting with each model?
Model performance summary:
**openai/gpt-oss-120b** — MLX (MXFP4), ~66 tokens/sec @ Hugging Face: `lmstudio-community/gpt-oss-120b-MLX-8bit`
**google/gemma-3-27b** — MLX (4-bit), ~27 tokens/sec @ Hugging Face: `mlx-community/gemma-3-27b-it-qat-4bit`
**qwen/qwen3-coder-30b** — MLX (8-bit), ~78 tokens/sec @ Hugging Face: `Qwen/Qwen3-Coder-30B-A3B-Instruct`
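In case anyone wants to reproduce these numbers, here's a minimal sketch using the mlx-lm Python API (assuming the MLX models are run through mlx-lm or LM Studio's MLX backend; the repo and prompt below are just examples):

```python
# Rough throughput check with mlx-lm (pip install mlx-lm).
# Model repo and prompt are placeholders; verbose=True prints prompt and
# generation tokens/sec, which you can compare against the figures above.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gemma-3-27b-it-qat-4bit")

messages = [{"role": "user", "content": "Summarize the MLX project in two sentences."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
```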
I'll reply back and add Meta Llama performance shortly.

What is the Agent Organizer you use?
It’s a Claude agent prompt. I don’t recall who originally shared it, so I can’t yet attribute the source, but I’ll track that down shortly and add proper attribution here.
Here’s the Claude agent markdown:
https://github.com/lst97/claude-code-sub-agents/blob/main/ag...
Edit: Updated from the old Pastebin link to the GitHub version. Attribution found: lst97 on GitHub
Ha, looks like the Claude agent was itself written by Claude...