βš™οΈ Prerequisites

  • If you haven’t already, install the Nexa SDK by following the installation guide.
  • MLX models run only on macOS. Verify that your machine has at least 16 GB of RAM.
  • Below are the MLX-compatible model types you can experiment with right away.

LLM - Language Models

πŸ“ Language models in MLX format. Try out this quick example: Try it out:
bash
nexa infer NexaAI/Qwen3-0.6B-bf16-MLX
⌨️ Once model loads, type or paste multi-line text directly into the CLI to chat with the model.

LMM - Multimodal Models

πŸ–ΌοΈ Language models that also accept vision and/or audio inputs. LMM in MLX formats. Try out this quick example:
bash
nexa infer NexaAI/gemma-3-4b-it-8bit-MLX
⌨️ Drag photos or audio clips directly into the CLI β€” you can even drop multiple images at once!

Supported Model List

We have curated a list of high-quality models in MLX format.
Many MLX models in the Hugging Face mlx-community have quality issues and may not run locally. We recommend using models from our collection for the best results.
For more advanced models, visit the Nexa Model Hub. An access token is required to download and use these models. To get an access token:
  • Create an account at sdk.nexa.ai
  • Generate a token: Go to Deployment β†’ Create Token
  • Activate your SDK: Run the following command in your terminal (replace `<your_token_here>` with your token):

```bash
nexa config set license '<your_token_here>'
```
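
Putting the steps together, a first run after activation looks like this (both commands appear in this guide; substitute your own token):

```bash
# One-time activation with your Nexa Model Hub token
nexa config set license '<your_token_here>'

# Then pull and chat with any supported model
nexa infer NexaAI/Qwen3-0.6B-bf16-MLX
```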

πŸ™‹ Request New Models

Want a specific model? Submit an issue on the nexa-sdk GitHub or request one in our Discord/Slack community!