OpenAI-compatible API

Chat completions and model listing at https://ai.hep.gg/v1, authenticated with an sk-hyd- API key. Drop-in for any OpenAI SDK.

https://ai.hep.gg/v1 speaks the OpenAI Chat Completions wire format. Point any OpenAI SDK at it by overriding the base URL and passing your sk-hyd- key as the API key. No custom client is needed.
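As a rough illustration of the base-URL override, the sketch below configures the official OpenAI Python SDK. It assumes the `openai` package (v1 or later) is installed and that the key lives in an environment variable named HYD_API_KEY, as in the curl examples on this page. The helper names `client_kwargs`, `make_client`, and `ask` are illustrative, not part of any hep.gg library.

```python
import os

BASE_URL = "https://ai.hep.gg/v1"


def client_kwargs(api_key=None):
    """Return the two settings an OpenAI SDK client needs for this gateway."""
    key = api_key or os.environ["HYD_API_KEY"]
    return {"base_url": BASE_URL, "api_key": key}


def make_client(api_key=None):
    """Build an SDK client. The import is lazy so client_kwargs has no dependency."""
    from openai import OpenAI  # assumption: `pip install openai` (v1+)
    return OpenAI(**client_kwargs(api_key))


def ask(client, prompt, model="qwen3-8b"):
    """Send one user message and return the assistant's text."""
    reply = client.chat.completions.create(
        model=model,  # sent for SDK compatibility; routing follows the key's pinned model
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,
    )
    return reply.choices[0].message.content
```

Typical use would be `print(ask(make_client(), "Name three primary colors."))`. Other OpenAI SDKs expose equivalent base-URL and API-key options under their own names.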

Authentication

Send your API key as a Bearer token.

Authorization header
Authorization: Bearer sk-hyd-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The key is SHA-256 hashed and matched against your active keys. A missing key, a value that does not start with sk-hyd-, or a disabled key all return 401:

{ "error": "Invalid API key" }
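Without an SDK, the same header can be attached by hand. The sketch below uses only the Python standard library and prepares a request without sending it. The local prefix check mirrors the 401 rule described above so that a malformed key fails before any network call. The function names are hypothetical.

```python
import urllib.request


def auth_headers(api_key):
    """Return the Authorization header for an sk-hyd- key."""
    if not api_key or not api_key.startswith("sk-hyd-"):
        # The gateway would answer 401 for such a key; fail early instead.
        raise ValueError("API key is missing or does not start with sk-hyd-")
    return {"Authorization": f"Bearer {api_key}"}


def build_models_request(api_key):
    """Prepare (but do not send) GET /v1/models with the Bearer token attached."""
    return urllib.request.Request(
        "https://ai.hep.gg/v1/models",
        headers=auth_headers(api_key),
        method="GET",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it. A 401 then surfaces as `urllib.error.HTTPError`.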

Mint keys with your master token via POST /keys, or from the dashboard. Each key is pinned to one model at mint time.

Model selection

Each sk-hyd- key is bound to exactly one model (its mint-time slug). The endpoints route to that model regardless of the model field in the request body. Pass the slug anyway for SDK compatibility; in the response, the model field is rewritten to the hep.gg slug (for example cf-gpt-oss-20b).

Two models exist: qwen3-8b (Qwen3 8B), generally available to anyone with AI access, and cf-gpt-oss-20b (GPT-OSS 20B on Cloudflare), admin only. List the slugs you can mint a key for at GET https://ai.hep.gg/models. See Usage and quotas for plan access and the per-account allowance.

POST https://ai.hep.gg/v1/chat/completions (auth required)
OpenAI-compatible chat completion. Supports streaming.

Accepts a standard OpenAI Chat Completions JSON body and returns an OpenAI-shaped completion. Content-Type: application/json, body limit 10 MB.

Body fields
messages
array, required
The conversation as an array of { role, content } objects, exactly as OpenAI expects.
model
string, optional
The model slug. Accepted for SDK compatibility but does not change routing: the key's pinned model is always used. Pass your key's slug (for example qwen3-8b).
stream
boolean, optional, default: false
When true, the response is streamed as Server-Sent Events (text/event-stream).
max_tokens
integer, optional
Maximum tokens to generate. If omitted, the gateway injects 16384 upstream. Any value you pass is honored as-is.
temperature
number, optional
Standard OpenAI sampling parameter. Other standard fields are forwarded upstream unchanged.
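The field list above can be sketched as a small body builder. This is an illustration only: `build_chat_body` is a hypothetical helper, and the size check reads "10 MB" conservatively as 10,000,000 bytes, which is an assumption about how the gateway counts.

```python
import json

BODY_LIMIT_BYTES = 10_000_000  # assumption: conservative reading of the 10 MB limit


def build_chat_body(messages, model=None, stream=False, max_tokens=None, **extra):
    """Serialize a Chat Completions body, leaving out fields that were not set."""
    if not messages:
        raise ValueError("messages is required")
    body = {"messages": messages}
    if model is not None:
        body["model"] = model  # compatibility only; does not change routing
    if stream:
        body["stream"] = True
    if max_tokens is not None:
        body["max_tokens"] = max_tokens  # if left out, the gateway injects 16384
    body.update(extra)  # e.g. temperature; forwarded upstream unchanged
    encoded = json.dumps(body).encode("utf-8")
    if len(encoded) > BODY_LIMIT_BYTES:
        raise ValueError("request body exceeds the 10 MB limit")
    return encoded
```

The returned bytes are meant to be sent with Content-Type: application/json.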

Response

Non-streaming returns the upstream completion JSON with the model field rewritten to the hep.gg slug and a usage object carrying prompt_tokens and completion_tokens.

200 OK (non-streaming)
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "cf-gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 12, "completion_tokens": 1 }
}
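A minimal sketch of reading that response in Python, using only the fields shown in the sample above. `summarize_completion` is an illustrative name.

```python
import json


def summarize_completion(raw):
    """Pull the reply text, model slug, and token counts out of a completion."""
    data = json.loads(raw)
    choice = data["choices"][0]
    usage = data.get("usage", {})
    return {
        "model": data["model"],  # the hep.gg slug, rewritten by the gateway
        "text": choice["message"].get("content") or "",
        "finish_reason": choice.get("finish_reason"),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
    }
```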

Streaming pipes SSE chunks through with the model field rewritten in each data: line; the final chunk carries usage. Every request is logged and your key's request_count, prompt_tokens, completion_tokens, and last_used_at counters are updated.
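For clients that read the stream by hand, the sketch below collects the text, model slug, and usage from SSE lines. It assumes the chunks follow the standard OpenAI streaming shape (`choices[].delta.content`) and that OpenAI's `[DONE]` sentinel is passed through. Neither detail is stated on this page, so treat both as assumptions.

```python
import json


def read_stream(lines):
    """Collect text, model, and usage from an iterable of SSE lines (str)."""
    text_parts, model, usage = [], None, None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        model = chunk.get("model", model)
        usage = chunk.get("usage") or usage  # expected on the final chunk
        for choice in chunk.get("choices", []):
            text_parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(text_parts), model, usage
```

Any iterable of decoded lines works as input, for example the lines of an HTTP response body decoded as UTF-8.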

Examples

curl
curl https://ai.hep.gg/v1/chat/completions \
  -H "Authorization: Bearer $HYD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cf-gpt-oss-20b",
    "messages": [
      { "role": "system", "content": "You are concise." },
      { "role": "user", "content": "Name three primary colors." }
    ],
    "max_tokens": 512
  }'

Streaming with curl (-N disables output buffering so chunks print as they arrive):

curl
curl https://ai.hep.gg/v1/chat/completions \
  -H "Authorization: Bearer $HYD_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "cf-gpt-oss-20b",
    "messages": [{ "role": "user", "content": "Count to five." }],
    "max_tokens": 512,
    "stream": true
  }'

GET https://ai.hep.gg/v1/models (auth required)
List the single model your API key is pinned to.

OpenAI-compatible model list. Because each key is pinned to one model, this returns only the model that the presented key is bound to. Same authentication as chat completions.

200 OK
{
  "object": "list",
  "data": [
    { "id": "cf-gpt-oss-20b", "object": "model", "owned_by": "team-hydra" }
  ]
}

curl
curl https://ai.hep.gg/v1/models \
  -H "Authorization: Bearer $HYD_API_KEY"
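Since the list holds a single entry, a client can use it to discover its key's slug at startup. The helper below is a hypothetical sketch that works on the JSON shape shown above.

```python
import json


def pinned_model(raw):
    """Return the one model slug listed for the presented key."""
    data = json.loads(raw)
    slugs = [m["id"] for m in data.get("data", []) if m.get("object") == "model"]
    if len(slugs) != 1:
        raise ValueError(f"expected exactly one model, got {len(slugs)}")
    return slugs[0]
```

The returned slug is what you would pass as the model field in chat completion requests.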

Usage limits

Requests count against your account's rolling 30-day allowance. Once you pass it, this endpoint returns 429 with type: "quota_exceeded" (unless you have enabled hep_tokens extra usage). The code field says why. See Usage and quotas for the allowances, the error codes, and how to keep working past the cap.

Errors

Errors use the OpenAI shape, { "error": { "message": "..." } }, with the upstream status code (or 400, 429, 500, 501 for local conditions). A 429 with type: "quota_exceeded" means you reached your usage allowance.
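A client-side sketch for sorting these errors. It accepts both shapes that appear on this page: the plain-string error shown for 401 and the OpenAI-style object. The exact position of the type field is not specified here, so the code assumes it sits inside the error object and falls back to the top level. The category names are illustrative.

```python
def classify_error(status, body):
    """Return (kind, message) for a non-2xx response with a parsed JSON body."""
    err = body.get("error") if isinstance(body, dict) else None
    if isinstance(err, dict):
        message = err.get("message", "")
        # Assumption: type sits inside the error object, else at the top level.
        err_type = err.get("type") or body.get("type")
    else:
        message = err or ""
        err_type = body.get("type") if isinstance(body, dict) else None

    if status == 401:
        kind = "invalid_key"
    elif status == 429 and err_type == "quota_exceeded":
        kind = "quota_exceeded"
    elif status >= 500:
        kind = "server_error"
    else:
        kind = "request_error"
    return kind, message
```

A caller might stop and surface the message on "quota_exceeded" or "invalid_key". Whether other categories are retried is a policy choice left to the client.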