Local OpenAI-compatible API for text generation, embeddings, and more
http://127.0.0.1:18181
by default.
endpoint
: type of interaction, for example, http://127.0.0.1:18181/v1/completions
/completions
/chat/completions
/embeddings
/reranking
model-name
: for example, NexaAI/Qwen3-0.6B
stream
, multimodal input, and function tools.
Example: