Plugin Selection
Choose the appropriate backend plugin for your model type and format.
- "cpu_gpu" - GGUF backend for CPU/GPU/Hexagon NPU (LLM, VLM). Device is selected via device_id + nGpuLayers.
- "npu" - NPU backend for NEXA format models (LLM, VLM, Embeddings, ASR, CV, Rerank)
- "whisper_cpp" - Whisper.cpp backend for ASR
- "tts_cpp" - TTS backend for text-to-speech
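The plugin list above can be read as a small routing rule. The following is a minimal sketch, not part of the SDK: the Task enum, the nexaFormat flag, and the choosePluginId function are hypothetical names introduced here, and the routing order (NEXA format models go to "npu", everything else by task) is an assumption drawn only from the descriptions in the list.

```kotlin
// Hypothetical helper, NOT an SDK API: maps a task and model format
// to one of the plugin_id strings listed in this section.
enum class Task { LLM, VLM, EMBEDDINGS, ASR, CV, RERANK, TTS }

fun choosePluginId(task: Task, nexaFormat: Boolean): String = when {
    // Text-to-speech has its own backend in the list.
    task == Task.TTS -> "tts_cpp"
    // NEXA format models are served by the NPU backend (assumed routing).
    nexaFormat -> "npu"
    // Non-NEXA ASR models use the Whisper.cpp backend.
    task == Task.ASR -> "whisper_cpp"
    // GGUF LLM and VLM models use the CPU/GPU/Hexagon backend.
    task == Task.LLM || task == Task.VLM -> "cpu_gpu"
    // The list names no non-NEXA backend for the remaining tasks.
    else -> throw IllegalArgumentException("No plugin listed for task $task")
}
```

Under this mapping, choosePluginId(Task.ASR, nexaFormat = false) yields "whisper_cpp", while a non-NEXA embeddings model is rejected because the list names no backend for it.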
Device Selection
Control which hardware device processes your model.
- null - CPU (default)
- "GPUOpenCL" - GPU acceleration via OpenCL
- "HTP0" - Qualcomm Hexagon NPU acceleration
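For the "cpu_gpu" plugin, the device that actually runs the model depends on both device_id and nGpuLayers: GPU or Hexagon NPU offload requires nGpuLayers > 0, and otherwise the model runs on CPU. A minimal sketch of that rule follows. The ModelConfig class here is a hypothetical stand-in that carries only nGpuLayers, not the SDK's real class, and EffectiveDevice and effectiveDevice are illustrative names. Rejecting negative layer counts is an assumption of this sketch, since the text does not define them.

```kotlin
// Hypothetical stand-in for the SDK's ModelConfig; only the field used here.
data class ModelConfig(val nGpuLayers: Int = 0)

// Illustrative result type, not an SDK type.
enum class EffectiveDevice { CPU, GPU_OPENCL, HEXAGON_NPU }

fun effectiveDevice(deviceId: String?, config: ModelConfig): EffectiveDevice {
    // Assumption of this sketch: negative values are rejected as undefined.
    require(config.nGpuLayers >= 0) { "nGpuLayers must be non-negative" }
    // A null device_id or zero offloaded layers means the model runs on CPU.
    if (deviceId == null || config.nGpuLayers == 0) return EffectiveDevice.CPU
    return when (deviceId) {
        "GPUOpenCL" -> EffectiveDevice.GPU_OPENCL
        "HTP0" -> EffectiveDevice.HEXAGON_NPU
        else -> throw IllegalArgumentException("Unknown device_id: $deviceId")
    }
}
```

A check like this could run before model creation to catch a configuration that requests "GPUOpenCL" but leaves nGpuLayers at 0, which would otherwise fall back to CPU without any error.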
Hardware Acceleration with GGUF (plugin_id = "cpu_gpu"):
- GPU: set device_id = "GPUOpenCL" and set nGpuLayers > 0 in ModelConfig
- Hexagon NPU (GGML backend): set device_id = "HTP0" and set nGpuLayers > 0 in ModelConfig
If nGpuLayers = 0 (or device_id = null), the model runs on CPU.
LLM Data Structures
LlmCreateInput
ChatMessage
GenerationConfig
LlmStreamResult
Multimodal Data Structures
VlmCreateInput
VlmChatMessage
VlmContent
Embeddings Data Structures
EmbedderCreateInput
EmbeddingConfig
ASR Data Structures
AsrCreateInput
AsrTranscribeInput
AsrTranscriptionResult
TTS Data Structures
TtsCreateInput
TtsSynthesizeInput
TtsConfig
TtsSynthesizeOutput
Rerank Data Structures
RerankerCreateInput
RerankConfig
RerankerResult
Computer Vision Data Structures
CVCreateInput
CVModelConfig
CVCapability
CVResult
Need Help?
Join our community to get support, share your projects, and connect with other developers.
Discord Community
Get real-time support and chat with the Nexa AI community
Slack Community
Collaborate with developers and access community resources