Usage and quotas
Which models you can use, your rolling 30-day usage allowance for self-service LLM keys, the 429 you get over the cap, and optional hep_tokens overage.
Self-service LLM keys are metered against a rolling dollar-value allowance. This page explains which models you can use, how much usage is included, and what happens when you run past it.
Models and plan access
Model access depends on your plan. GET https://ai.hep.gg/models lists the slugs you can mint a key for, such as qwen3-8b and cf-gpt-oss-20b. If you pick
a model that is not available on your plan, key creation returns 403.
Your monthly allowance
Usage is valued in dollars at each model's rate and charged against a rolling 30-day allowance for your whole account (every key you own counts against the same pool):
| Plan | Included usage (rolling 30 days) |
|---|---|
| Free | $10 |
| Premium | $50 |
| Admin | Unlimited |
"Rolling" means the window is always the last 30 days, so old usage ages off
continuously and your headroom refills gradually. It is not a hard reset on a
fixed day of the month. The qwen3-8b rate is $0.06 per 1M input tokens and
$0.24 per 1M output tokens, metered separately. Your current spend and remaining
allowance are shown on the LLM Keys page.
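To make the arithmetic concrete, here is a minimal Python sketch that values a request at the qwen3-8b rates quoted above and sums spend over a rolling 30-day window. The function names, the `(timestamp, cost)` event format, and the treatment of usage exactly 30 days old are assumptions made for illustration, not part of the service's API. The LLM Keys page remains the authoritative source for your actual spend.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Rates quoted on this page for qwen3-8b, in dollars per 1M tokens.
INPUT_RATE_PER_M = Decimal("0.06")
OUTPUT_RATE_PER_M = Decimal("0.24")
ONE_MILLION = Decimal(1_000_000)
WINDOW = timedelta(days=30)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Dollar value of one request; input and output tokens are metered separately."""
    total = INPUT_RATE_PER_M * input_tokens + OUTPUT_RATE_PER_M * output_tokens
    return total / ONE_MILLION


def rolling_spend(events, now: datetime) -> Decimal:
    """Sum the cost of events that fall inside the last 30 days.

    `events` is a hypothetical list of (timestamp, cost) pairs kept by the caller.
    Assumption: usage exactly 30 days old has already aged off.
    """
    cutoff = now - WINDOW
    return sum((cost for ts, cost in events if ts > cutoff), Decimal(0))


def remaining_allowance(allowance: Decimal, events, now: datetime) -> Decimal:
    """Headroom left in the rolling window, never below zero."""
    return max(allowance - rolling_spend(events, now), Decimal(0))
```

Under these rates, a request with 1M input tokens and 1M output tokens is valued at $0.30. Because only events inside the window are summed, usage older than 30 days stops counting without any monthly reset.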
When you hit the cap
Once your rolling usage reaches your allowance, POST /v1/chat/completions
returns 429 in the OpenAI error shape with type: "quota_exceeded":
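
    {
      "error": {
        "message": "Monthly usage limit reached ($10). It refills as your last-30-day usage ages off, or enable extra usage (hep_tokens) in your dashboard.",
        "type": "quota_exceeded",
        "code": "quota_exhausted"
      }
    }

The code tells you why:
- quota_exhausted: your rolling usage has reached your included allowance.
- overage_cap: extra usage is on, but you have hit the cap you set.
- no_hep_tokens: extra usage is on, but your hep_tokens balance has reached 0.
Extra usage (hep_tokens overage)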
Extra usage is off by default. Turn it on from the LLM Keys page to keep
working past your included allowance. Beyond the allowance, each request is billed
to your hep_tokens balance at the standard list rate of 240 hep_tokens per $1
of usage.
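As a rough illustration of the conversion, the Python sketch below turns a dollar amount of overage into hep_tokens at the 240-per-dollar rate. The function name is hypothetical, and rounding fractional tokens up is an assumption; this page does not say how partial tokens are handled.

```python
from decimal import Decimal, ROUND_CEILING

# List rate stated on this page: 240 hep_tokens per $1 of usage.
HEP_TOKENS_PER_DOLLAR = Decimal(240)


def overage_hep_tokens(usage_dollars: Decimal) -> int:
    """Convert dollars of usage beyond the allowance into hep_tokens.

    Assumption: fractional tokens are rounded up to the next whole token.
    """
    tokens = usage_dollars * HEP_TOKENS_PER_DOLLAR
    return int(tokens.to_integral_value(rounding=ROUND_CEILING))
```

For instance, $0.30 of usage past the allowance corresponds to 72 hep_tokens at this rate.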
- You can set an optional cap (the most hep_tokens to auto-spend per rolling 30 days). 0 means no cap, up to your balance.
- When your balance reaches 0 or you hit your cap, requests return 429 again (no_hep_tokens or overage_cap).
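A client can tell the three quota codes on this page apart by inspecting the 429 body. The following is a minimal sketch, assuming you already have the HTTP status and the raw response body as a string; the helper names and the advice strings are this example's own and do not come from the service.

```python
import json
from typing import Optional

# Codes documented on this page. The advice text is illustrative wording only.
QUOTA_CODE_ADVICE = {
    "quota_exhausted": "Included allowance used up; wait for usage to age off or enable extra usage.",
    "overage_cap": "Extra-usage cap reached; wait for usage to age off or adjust the cap.",
    "no_hep_tokens": "hep_tokens balance is 0; extra usage cannot be billed.",
}


def quota_error_code(status: int, body: str) -> Optional[str]:
    """Return the quota code from a 429 response, or None if it is not a quota error."""
    if status != 429:
        return None
    try:
        error = json.loads(body).get("error")
    except (ValueError, TypeError, AttributeError):
        return None  # body was not a JSON object
    if not isinstance(error, dict) or error.get("type") != "quota_exceeded":
        return None
    return error.get("code")


def advice_for(code: Optional[str]) -> str:
    """Map a quota code to a human-readable hint."""
    return QUOTA_CODE_ADVICE.get(code, "Unrecognized quota code; check the LLM Keys page.")
```

Since headroom refills gradually as old usage ages off, retrying immediately after one of these responses is unlikely to succeed, so surfacing the hint to the user is usually more useful than an automatic retry loop.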
Admins
Admin accounts have no quota, can use every model, and keep the existing admin key path. None of the limits above apply to them.