
Micro benchmarking Apple M1 Max - MLX vs GGUF - LLM QWEN 2.5

MLX is an ML framework targeted at Apple Silicon. It provides noticeable ML performance gains compared to the standard GGUF-based approach running on the same hardware. The MLX project describes itself as:

"MLX is an array framework for machine learning on Apple silicon, brought to you by Apple machine learning research. A notable difference from MLX and other frameworks is the unified memory model. Arrays in MLX live in shared memory. Operations on MLX arrays can be performed on any of the supported device types without transferring data."

LM Studio added support for Apple Silicon MLX models in 2024. I totally ignored it until I saw a 2025/02 Reddit post in the /r/LocalLLaMA subreddit. I wanted to run their microbenchmark on my Mac to get a feel for the possible performance difference. The performance improvement is exciting. I am holding off on really jumping into MLX until Ollama supports it, something they are working on as of 2025/0...
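To make the unified memory idea quoted above concrete, here is a minimal sketch in Python using the mlx package (pip install mlx). The shapes are arbitrary; the point is that the same arrays are usable from the CPU and the GPU, with the device picked per operation rather than by copying data between devices.

import mlx.core as mx

# Arrays live in unified (shared) memory; no device is chosen yet.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# The device is selected per operation via the `stream` argument.
# No explicit transfer of `a` or `b` is needed between the two calls.
c_gpu = mx.matmul(a, b, stream=mx.gpu)
c_cpu = mx.add(a, b, stream=mx.cpu)

# MLX evaluates lazily; force both computations to actually run.
mx.eval(c_gpu, c_cpu)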
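And for the microbenchmark itself, a rough sketch of a tokens-per-second measurement using the mlx-lm package (pip install mlx-lm), which exposes load and generate helpers. The model name below is an assumption for illustration; substitute whatever MLX-converted Qwen 2.5 checkpoint you are testing, and note that this measures end-to-end generation rather than isolating prompt processing from decoding.

import time
from mlx_lm import load, generate

# Hypothetical model choice; any MLX-converted checkpoint works here.
model, tokenizer = load("mlx-community/Qwen2.5-7B-Instruct-4bit")

prompt = "Explain unified memory on Apple Silicon in one paragraph."
start = time.perf_counter()
text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
elapsed = time.perf_counter() - start

# Crude throughput estimate: generated tokens divided by wall time.
n_tokens = len(tokenizer.encode(text))
print(f"{n_tokens / elapsed:.1f} tokens/sec")

Running the same prompt against the GGUF build of the model (for example in LM Studio or Ollama) and comparing the tokens/sec figures is essentially what the Reddit microbenchmark does.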