
Ars Technica | Big Tech | Mar 31, 2026 at 23:00

Running local models on Macs gets faster with Ollama's MLX support

Apple Silicon Macs get a performance boost thanks to better unified memory usage.

By Samuel Axon

Ollama, a runtime for running large language models on a local machine, has introduced support for MLX, Apple's open source machine learning framework. Ollama also says it has improved caching performance and added support for Nvidia's NVFP4 model compression format, which makes memory usage much more efficient for certain models. Combined, these changes promise significantly better performance on Macs with Apple Silicon chips (M1 or later).

The timing couldn't be better: local models are gaining traction beyond researcher and hobbyist communities in ways they haven't before. The recent runaway success of OpenClaw, which raced to over 300,000 stars on GitHub, made headlines with experiments like Moltbook, and became an obsession in China in particular, has many people experimenting with running models on their own machines.
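For context on what running a local model through Ollama looks like in practice, here is a minimal sketch using Ollama's official Python client. This example is not from the article: the model name is a placeholder for whatever you have pulled locally, it assumes the `ollama` Python package is installed (`pip install ollama`) and the Ollama server is running, and backend choices, such as whether MLX is used on a given Mac, are made by the runtime rather than exposed through this API.

```python
# Minimal sketch: chat with a locally hosted model via Ollama's Python client.
# Assumes the Ollama server is running and a model has been pulled, e.g. with
# `ollama pull llama3.2` (placeholder name; substitute any local model).
import ollama

response = ollama.chat(
    model="llama3.2",  # placeholder model name
    messages=[
        {"role": "user", "content": "In one sentence, what is unified memory?"}
    ],
)

# The assistant's reply text is in the message content of the response.
print(response["message"]["content"])
```

As illustrative arithmetic (not a figure from the article): a 7-billion-parameter model stored at 16 bits per weight needs roughly 14 GB for its weights alone, while a 4-bit format like NVFP4 cuts that to about 3.5 GB plus a small overhead for scaling metadata. On a Mac with 16 GB of unified memory, that difference separates a model that fits comfortably from one that doesn't.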
