Model Name Mapping
All NPU models use an internal name mapping. Set the model name and plugin ID according to the table below.

| Model Name | Plugin ID | Hugging Face repository name |
|---|---|---|
| omni-neural | npu | NexaAI/OmniNeural-4B-mobile |
| phi3.5 | npu | NexaAI/phi3.5-mini-npu-mobile |
| phi4 | npu | NexaAI/phi4-mini-npu-mobile |
| granite4 | npu | NexaAI/Granite-4-Micro-NPU-mobile |
| embed-gemma | npu | NexaAI/embeddinggemma-300m-npu-mobile |
| qwen3-4b | npu | NexaAI/Qwen3-4B-Instruct-2507-npu-mobile |
| llama3-3b | npu | NexaAI/Llama3.2-3B-NPU-Turbo-NPU-mobile |
| liquid-v2 | npu | NexaAI/LFM2-1.2B-npu-mobile |
| paddleocr | npu | NexaAI/paddleocr-npu-mobile |
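
The table can be read as a simple lookup from model name to plugin ID and repository. The following is a minimal Python sketch of that lookup, written only for illustration. `NPU_MODELS`, `PLUGIN_ID`, and `resolve_model` are hypothetical names introduced here and are not part of the SDK.

```python
# Hypothetical lookup table mirroring the mapping above; not an SDK API.
NPU_MODELS = {
    "omni-neural": "NexaAI/OmniNeural-4B-mobile",
    "phi3.5": "NexaAI/phi3.5-mini-npu-mobile",
    "phi4": "NexaAI/phi4-mini-npu-mobile",
    "granite4": "NexaAI/Granite-4-Micro-NPU-mobile",
    "embed-gemma": "NexaAI/embeddinggemma-300m-npu-mobile",
    "qwen3-4b": "NexaAI/Qwen3-4B-Instruct-2507-npu-mobile",
    "llama3-3b": "NexaAI/Llama3.2-3B-NPU-Turbo-NPU-mobile",
    "liquid-v2": "NexaAI/LFM2-1.2B-npu-mobile",
    "paddleocr": "NexaAI/paddleocr-npu-mobile",
}

PLUGIN_ID = "npu"  # every model in the table uses the same plugin ID


def resolve_model(model_name: str) -> tuple:
    """Return (plugin_id, repository) for a model name from the table."""
    try:
        return PLUGIN_ID, NPU_MODELS[model_name]
    except KeyError:
        raise ValueError(f"unknown NPU model name: {model_name!r}") from None
```

Failing early on an unknown name keeps a typo in the model name from surfacing later as a load error.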
Two Ways to Run on NPU
You can run models on the Qualcomm Hexagon NPU in two different ways:

1) NEXA models via the “npu” plugin
- Use the `npu` plugin
- Pick a supported NEXA model from the table above and set `model_name` accordingly

2) GGUF models via the GGML Hexagon backend
- Load a GGUF model
- Use the `cpu_gpu` plugin
- Set `device_id` to `HTP0`
- Set `nGpuLayers > 0` in `ModelConfig`
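
The two setups differ only in a handful of settings. As a rough illustration, the Python sketch below captures them as plain data and checks the rules listed above. `RunConfig` and `validate` are hypothetical stand-ins, not the SDK's `ModelConfig`. The field names mirror the settings named in this section, and the layer count of 32 is an arbitrary example value.

```python
from dataclasses import dataclass


@dataclass
class RunConfig:
    """Hypothetical container for the settings named above (not the SDK's ModelConfig)."""
    plugin_id: str
    model_name: str = ""
    device_id: str = ""
    n_gpu_layers: int = 0  # stands in for nGpuLayers


def validate(cfg: RunConfig) -> str:
    """Return which of the two NPU paths a config describes, or raise ValueError."""
    if cfg.plugin_id == "npu":
        if not cfg.model_name:
            raise ValueError("the npu plugin needs a model_name from the mapping table")
        return "nexa-npu"
    if cfg.plugin_id == "cpu_gpu":
        # Assumption: device IDs are comma-separated and start with "HTP".
        devices = [d for d in cfg.device_id.split(",") if d]
        if not devices or not all(d.startswith("HTP") for d in devices):
            raise ValueError("device_id must name Hexagon devices such as HTP0")
        if cfg.n_gpu_layers <= 0:
            raise ValueError("nGpuLayers must be greater than 0")
        return "gguf-hexagon"
    raise ValueError(f"unsupported plugin_id: {cfg.plugin_id!r}")


nexa = RunConfig(plugin_id="npu", model_name="qwen3-4b")
gguf = RunConfig(plugin_id="cpu_gpu", device_id="HTP0", n_gpu_layers=32)
print(validate(nexa))  # nexa-npu
print(validate(gguf))  # gguf-hexagon
```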
LLM Usage
Large Language Models for text generation and chat applications.

1) NEXA Models (“npu” plugin)
We support NPU inference for NEXA format models.

2) GGUF Models on Hexagon NPU (GGML Hexagon backend)
Run a GGUF model on Hexagon NPU by using the `cpu_gpu` plugin with `device_id = "HTP0,HTP1,HTP2,HTP3"` and setting `nGpuLayers > 0`.
Multimodal Usage
Vision-Language Models for image understanding and multimodal applications.

1) NEXA Models (“npu” plugin)
We support NPU inference for NEXA format models.

2) GGUF Models on Hexagon NPU (GGML Hexagon backend)
Run a GGUF VLM on Hexagon NPU by using the `cpu_gpu` plugin with `device_id = "HTP0"` and setting `nGpuLayers > 0`.
Embeddings Usage
Generate vector embeddings for semantic search and RAG applications.

Basic Usage
ASR Usage
Automatic Speech Recognition for audio transcription.

Basic Usage
Rerank Usage
Improve search relevance by reranking documents based on query relevance.

Basic Usage
CV Usage
Computer Vision models for OCR, object detection, and image classification.

Basic Usage
Need Help?
Join our community to get support, share your projects, and connect with other developers.

Discord Community
Get real-time support and chat with the Nexa AI community
Slack Community
Collaborate with developers and access community resources