
Point at any endpoint that implements OpenAI's `/v1/chat/completions` with `tool_calls`. Covers Groq's sub-second inference, Together's open-weight catalog, self-hosted Ollama on your LAN, and vLLM clusters. Our streaming adapter handles the incremental `tool_calls` chunk format.
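In the streaming response format, each chunk carries partial tool calls keyed by `index`: the `id` and `function.name` typically arrive once, while `function.arguments` arrives as string fragments that must be concatenated. A minimal sketch of that merge logic (illustrative names; not the adapter's actual internals):

```typescript
// Shape of the incremental `tool_calls` entries found in each
// streamed chunk's `choices[0].delta` (per the OpenAI streaming format).
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

// Fold a sequence of deltas into complete tool calls:
// set `id` when it appears, concatenate name/argument fragments.
function mergeToolCallDeltas(deltas: ToolCallDelta[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const d of deltas) {
    const call = (calls[d.index] ??= {
      id: "",
      function: { name: "", arguments: "" },
    });
    if (d.id) call.id = d.id;
    if (d.function?.name) call.function.name += d.function.name;
    if (d.function?.arguments) call.function.arguments += d.function.arguments;
  }
  return calls;
}
```

For example, three chunks carrying `{name: "get_weather"}`, `{arguments: '{"city":'}`, and `{arguments: '"Oslo"}'}` at index 0 merge into one call with complete, parseable arguments.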

Install

1 · Config

Append `openai-compatible` to `ENABLED_PLUGINS` in `wrangler.toml`:

openai-compatible
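In `wrangler.toml` this might look like the following sketch. It assumes `ENABLED_PLUGINS` is a comma-separated string under `[vars]`; match whatever shape your existing config already uses:

```toml
[vars]
# Append the plugin id to the existing list (comma-separated string assumed).
ENABLED_PLUGINS = "…,openai-compatible"
```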

2 · Secrets

Run each from your Worker project:

  • `wrangler secret put OPENAI_COMPATIBLE_URL`: base URL of the endpoint, e.g. `https://api.groq.com/openai/v1`
  • `wrangler secret put OPENAI_COMPATIBLE_KEY`: Bearer token for the endpoint; optional for local Ollama.
Copy-paste .dev.vars template:
OPENAI_COMPATIBLE_URL= # base URL, e.g. https://api.groq.com/openai/v1
OPENAI_COMPATIBLE_KEY= # Bearer token for the endpoint; optional for local Ollama

3 · Steps

  1. Add the endpoint's hostname to `ALLOWED_HOSTS` before your first call.
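For the Groq example above, that step might look like the following sketch. It assumes `ALLOWED_HOSTS` is a comma-separated `[vars]` entry in `wrangler.toml`; adjust to however your project defines it:

```toml
[vars]
# Hostname only — no scheme or path. Use your endpoint's host
# (e.g. a LAN address for self-hosted Ollama).
ALLOWED_HOSTS = "api.groq.com"
```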

Tags

byok · self-host-friendly · tool-use · streaming