OpenAI-compatible API
Chat completions and model listing at https://ai.hep.gg/v1, authenticated with an sk-hyd- API key. Drop-in for any OpenAI SDK.
https://ai.hep.gg/v1 speaks the OpenAI Chat Completions wire format. Point any OpenAI SDK at it by overriding the base URL and passing your sk-hyd- key as the API key. No custom client is needed.
Authentication
Send your API key as a Bearer token.
Authorization: Bearer sk-hyd-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
The key is SHA-256 hashed and matched against your active keys. A missing key, a value that does not start with sk-hyd-, or a disabled key all return 401:
{ "error": "Invalid API key" }
Mint keys with your master token via POST /keys, or from the dashboard. Each key is pinned to one model at mint time.
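The client side of these rules can be sketched as a small helper that rejects obviously malformed keys before a request is sent and builds the Bearer header. This is a minimal sketch, not part of any SDK: the names isHydKeyShaped and buildAuthHeaders are hypothetical, and the local check only mirrors the documented sk-hyd- prefix rule. Whether a key is active or disabled can only be decided by the server.

```typescript
// Hypothetical helper names; only the "sk-hyd-" prefix and the Bearer scheme
// come from the documentation above.
const KEY_PREFIX = "sk-hyd-";

// True when the value at least carries the documented prefix. This cannot
// tell whether the key is active or disabled; only the server knows that.
function isHydKeyShaped(key: string | undefined): key is string {
  return typeof key === "string" && key.startsWith(KEY_PREFIX);
}

// Builds the headers for an authenticated JSON request, failing fast on a
// missing or wrongly prefixed key (which the server would answer with 401).
function buildAuthHeaders(key: string | undefined): Record<string, string> {
  if (!isHydKeyShaped(key)) {
    throw new Error('API key is missing or does not start with "sk-hyd-"');
  }
  return {
    Authorization: `Bearer ${key}`,
    "Content-Type": "application/json",
  };
}
```

The returned object can be passed as the headers option of any HTTP client. Failing locally saves a round trip but does not replace handling a 401 from the server.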
Model selection
Each sk-hyd- key is bound to exactly one model (its mint-time slug). The endpoints route to that model regardless of the model field in the request body, so that field is effectively ignored for routing. Pass the slug anyway for SDK compatibility. The model field in the response is rewritten to the hep.gg slug (for example cf-gpt-oss-20b).
The only model currently available is cf-gpt-oss-20b (GPT-OSS 20B on Cloudflare). List the minting catalog at GET https://ai.hep.gg/models.
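The routing rule in this section can be illustrated with a tiny model of it. This is an illustration only, not the service's code: routeModel is a hypothetical function that returns the key's bound slug no matter what the request asked for, and reports whether the model value in the body was overridden.

```typescript
interface RoutingResult {
  model: string; // slug the request is actually served by
  overridden: boolean; // true when the body asked for a different model
}

// Models the documented behavior: the key's mint-time slug always wins.
function routeModel(requestedModel: string | undefined, boundSlug: string): RoutingResult {
  return {
    model: boundSlug,
    overridden: requestedModel !== undefined && requestedModel !== boundSlug,
  };
}
```

For example, routeModel("some-other-model", "cf-gpt-oss-20b") yields model "cf-gpt-oss-20b" with overridden set to true, which matches the slug a client would see in the response.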
POST https://ai.hep.gg/v1/chat/completions
Auth required
Accepts a standard OpenAI Chat Completions JSON body and returns an OpenAI-shaped completion. Content-Type: application/json, body limit 10 MB.
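To stay under the body limit, a client can measure the serialized JSON before sending it. The sketch below rests on assumptions: serializeChatBody and utf8ByteLength are hypothetical helpers, and because the text does not say whether 10 MB means 10,000,000 or 10,485,760 bytes, the smaller figure is used as a conservative bound.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

interface ChatBody {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
  stream?: boolean;
}

// Assumption: the smaller reading of "10 MB", so the check is never too lax.
const MAX_BODY_BYTES = 10 * 1000 * 1000;

// Counts the bytes a string occupies once encoded as UTF-8.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1;
    } else if (unit < 0x800) {
      bytes += 2;
    } else {
      const next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
      const isPair = unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff;
      if (isPair) {
        bytes += 4; // a surrogate pair encodes one 4-byte code point
        i += 1;
      } else {
        bytes += 3;
      }
    }
  }
  return bytes;
}

// Serializes the body and refuses to return it when it exceeds the bound.
function serializeChatBody(body: ChatBody): string {
  const json = JSON.stringify(body);
  const size = utf8ByteLength(json);
  if (size > MAX_BODY_BYTES) {
    throw new Error(`Request body is ${size} bytes, above the ${MAX_BODY_BYTES} byte bound`);
  }
  return json;
}
```

The byte count is taken on the serialized JSON rather than on the message text, since escaping and non-ASCII characters both change the size on the wire.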
Body fields:
messages: { role, content } objects, exactly as OpenAI expects.
model: cf-gpt-oss-20b.
stream: default false. When true, the response is streamed as Server-Sent Events (text/event-stream).
max_tokens: default 16384 upstream (see below). Any value you pass is honored as-is.
temperature
Response
Non-streaming returns the upstream completion JSON with the model field rewritten to the hep.gg slug and a usage object carrying prompt_tokens and completion_tokens.
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "cf-gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 12, "completion_tokens": 1 }
}
Streaming pipes SSE chunks through with the model field rewritten in each data: line; the final chunk carries usage. Every request is logged and your key's request_count, prompt_tokens, completion_tokens, and last_used_at counters are updated.
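For clients that do not use an SDK, the streamed body can be consumed by reading the data: lines and concatenating the content deltas. The following is a sketch under assumptions: it assumes the chunks follow OpenAI's chat.completion.chunk shape (choices[0].delta.content) and that the stream may end with a data: [DONE] sentinel, neither of which is spelled out above. collectStream is a hypothetical name.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

// Assumed chunk shape, modeled on OpenAI's streaming chunks.
interface StreamChunk {
  model?: string;
  choices?: { delta?: { content?: string | null } }[];
  usage?: Usage | null;
}

interface StreamResult {
  text: string;
  model?: string;
  usage?: Usage;
}

// Parses an already-buffered SSE body. A live reader would need to buffer
// network chunks until it holds complete lines before calling JSON.parse.
function collectStream(sseText: string): StreamResult {
  const result: StreamResult = { text: "" };
  for (const rawLine of sseText.split(/\r?\n/)) {
    if (!rawLine.startsWith("data:")) continue; // blank lines, comments, other fields
    const payload = rawLine.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;
    const chunk = JSON.parse(payload) as StreamChunk;
    if (chunk.model) result.model = chunk.model;
    result.text += chunk.choices?.[0]?.delta?.content ?? "";
    if (chunk.usage) result.usage = chunk.usage; // documented on the final chunk
  }
  return result;
}
```

The function tolerates a final chunk with an empty choices array, since a usage-only chunk would otherwise break the delta lookup.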
Examples
curl https://ai.hep.gg/v1/chat/completions \
  -H "Authorization: Bearer $HYD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cf-gpt-oss-20b",
    "messages": [
      { "role": "system", "content": "You are concise." },
      { "role": "user", "content": "Name three primary colors." }
    ],
    "max_tokens": 512
  }'

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://ai.hep.gg/v1",
  apiKey: process.env.HYD_API_KEY, // sk-hyd-...
});

const res = await client.chat.completions.create({
  model: "cf-gpt-oss-20b",
  messages: [
    { role: "system", content: "You are concise." },
    { role: "user", content: "Name three primary colors." },
  ],
  max_tokens: 512,
});
console.log(res.choices[0].message.content);
console.log(res.usage);

Streaming with the SDK:
curl https://ai.hep.gg/v1/chat/completions \
  -H "Authorization: Bearer $HYD_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "cf-gpt-oss-20b",
    "messages": [{ "role": "user", "content": "Count to five." }],
    "max_tokens": 512,
    "stream": true
  }'

const stream = await client.chat.completions.create({
  model: "cf-gpt-oss-20b",
  messages: [{ role: "user", content: "Count to five." }],
  max_tokens: 512,
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

GET https://ai.hep.gg/v1/models
Auth required
OpenAI-compatible model list. Because each key is pinned to one model, this returns only the model your presented key is bound to. Same authentication as chat completions.
{
  "object": "list",
  "data": [
    { "id": "cf-gpt-oss-20b", "object": "model", "owned_by": "team-hydra" }
  ]
}

curl https://ai.hep.gg/v1/models \
  -H "Authorization: Bearer $HYD_API_KEY"

const models = await client.models.list();
console.log(models.data);

Errors
Errors use the OpenAI shape, { "error": { "message": "..." } }, with the upstream status code (or 400, 500, 501 for local conditions).
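The 401 example under Authentication shows error as a plain string, while this section describes an object with a message, so a client may want to accept both forms. A minimal sketch, with the hypothetical helper name errorMessage:

```typescript
// Extracts a human-readable message from either documented error form:
//   { "error": "Invalid API key" }      (the 401 example)
//   { "error": { "message": "..." } }   (the OpenAI shape)
// Returns undefined when the body matches neither.
function errorMessage(body: unknown): string | undefined {
  if (typeof body !== "object" || body === null) return undefined;
  const error = (body as { error?: unknown }).error;
  if (typeof error === "string") return error;
  if (typeof error === "object" && error !== null) {
    const message = (error as { message?: unknown }).message;
    if (typeof message === "string") return message;
  }
  return undefined;
}
```

Taking unknown as the input type keeps the helper safe to call on any parsed response body, including ones that are not error objects at all.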