Moonborn — Developers

Rate limits

Per-tier API caps, headers, retry policy, and where to read the live remaining count.

Rate limits are enforced per organization at the gateway. Three levers apply: requests per minute, requests per day, and per-endpoint caps (for example, generation is more constrained than reads).

Per-tier defaults

Tier         Req/min   Req/day    Concurrent generations
Free         60        5,000      1
Pro          600       50,000     5
Team         3,000     250,000    25
Enterprise   custom    custom     custom

Generation endpoints (POST /v1/personas, POST /v1/personas/{id}/refine, POST /v1/personas/{id}/fork) consume the concurrent-generations budget, which is counted separately from the per-minute cap. Reads (GET *) do not consume it.
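A client can stay inside the concurrent-generations budget by capping its own in-flight generation calls. The sketch below is one way to do that; the class name GenerationSlots is hypothetical and not part of any Moonborn SDK, and the gateway remains the authority on the real limit.

```typescript
// Client-side cap on in-flight generation calls (hypothetical helper, not an SDK API).
class GenerationSlots {
  private inFlight = 0;
  private readonly waiters: Array<() => void> = [];
  private readonly max: number;

  constructor(max: number) {
    this.max = max; // e.g. 5 on Pro, per the tier table
  }

  // Runs `task` once a slot is free and releases the slot when it settles.
  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.inFlight >= this.max) {
      // All slots busy: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => {
        this.waiters.push(resolve);
      });
    } else {
      this.inFlight++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiters.shift();
      if (next) {
        next(); // the slot passes directly to the next waiter
      } else {
        this.inFlight--; // nobody waiting: free the slot
      }
    }
  }
}
```

Wrapping each generation request in slots.run(() => ...) queues the excess locally rather than sending requests that the gateway would likely reject.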

Headers

Every response carries:

X-RateLimit-Limit:     <per-minute cap>
X-RateLimit-Remaining: <remaining in current window>
X-RateLimit-Reset:     <Unix timestamp when window resets>

On 429, additionally:

Retry-After: 12

(seconds until the next request is permitted)
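The headers above can be read into a small structure after each response. This is a minimal sketch, not SDK code: HeaderSource, RateLimitSnapshot, and parseRateLimitHeaders are hypothetical names, and only the delay-in-seconds form of Retry-After shown above is handled.

```typescript
// Anything with a get(name) method works here, including fetch's Headers object.
interface HeaderSource {
  get(name: string): string | null;
}

interface RateLimitSnapshot {
  limit: number;                    // X-RateLimit-Limit (per-minute cap)
  remaining: number;                // X-RateLimit-Remaining
  resetAtUnix: number;              // X-RateLimit-Reset, kept as sent
  retryAfterSeconds: number | null; // Retry-After, expected only on 429
}

// Returns the header as a number, or null when it is missing or not numeric.
function readNumber(headers: HeaderSource, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  return isFinite(value) ? value : null;
}

// Returns null when any of the three X-RateLimit-* headers is absent or malformed.
function parseRateLimitHeaders(headers: HeaderSource): RateLimitSnapshot | null {
  const limit = readNumber(headers, "X-RateLimit-Limit");
  const remaining = readNumber(headers, "X-RateLimit-Remaining");
  const resetAtUnix = readNumber(headers, "X-RateLimit-Reset");
  if (limit === null || remaining === null || resetAtUnix === null) return null;
  return {
    limit,
    remaining,
    resetAtUnix,
    retryAfterSeconds: readNumber(headers, "Retry-After"),
  };
}
```

With fetch, calling parseRateLimitHeaders(response.headers) would be enough, since Headers.get has the same shape as HeaderSource.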

Endpoint-specific caps

Some endpoints have tighter individual budgets to protect upstream LLM providers:

Endpoint family                          Pro cap
POST /v1/personas (generation)           60/hour, 5 concurrent
POST /v1/personas/{id}/refine            120/hour
POST /v1/chat/sessions/{id}/messages     600/min (counts against the general cap)
POST /v1/personas/{id}/audit             300/hour

Other tiers scale the Pro caps by a multiplier: Free × 0.1, Pro × 1, Team × 5, Enterprise custom.
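As a worked example, the multipliers can be applied to the Pro hourly caps from the table. The names below are illustrative only, and the rounding step is an assumption (for these values every product is already a whole number). Concurrent-generation limits are left out because the per-tier defaults table states them directly.

```typescript
type Tier = "free" | "pro" | "team"; // Enterprise is custom, so it is omitted

const TIER_MULTIPLIER: Record<Tier, number> = { free: 0.1, pro: 1, team: 5 };

// Pro hourly caps copied from the endpoint table.
const PRO_HOURLY_CAPS = {
  "POST /v1/personas": 60,
  "POST /v1/personas/{id}/refine": 120,
  "POST /v1/personas/{id}/audit": 300,
};

type CappedEndpoint = keyof typeof PRO_HOURLY_CAPS;

// Hourly cap for an endpoint on a given tier: Pro cap × tier multiplier.
// Math.round only removes floating-point noise here (assumed rounding rule).
function hourlyCap(endpoint: CappedEndpoint, tier: Tier): number {
  return Math.round(PRO_HOURLY_CAPS[endpoint] * TIER_MULTIPLIER[tier]);
}
```

Under these assumptions, hourlyCap("POST /v1/personas", "free") gives 6 generations per hour, and the same endpoint on Team gives 300.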

Client-side backpressure

Don't wait for a 429. The SDKs read X-RateLimit-Remaining after every call; if you are below 10% of the cap with more than 30 seconds left until the window resets, slow down. The SDK exposes a callback for custom backpressure:

const client = new Moonborn({
  apiKey: process.env.MOONBORN_API_KEY,
  onRateLimitNearCap: ({ remaining, resetIn }) => {
    // setMyOwnBackpressure is your own application function, not part of the SDK.
    if (remaining < 50) setMyOwnBackpressure(resetIn);
  },
});
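Without the SDK, the same rule (below 10% of the cap with more than 30 seconds to reset) can be applied by hand. The sketch below is one possible policy, not the SDK's: suggestedDelayMs is a hypothetical name, the even-spacing strategy is a design choice made for this example, and X-RateLimit-Reset is assumed to be in Unix seconds.

```typescript
// Returns how long to pause before the next request, in milliseconds.
// 0 means "no backpressure needed yet".
function suggestedDelayMs(
  limit: number,              // X-RateLimit-Limit
  remaining: number,          // X-RateLimit-Remaining
  resetAtUnixSeconds: number, // X-RateLimit-Reset (assumed to be seconds)
  nowUnixSeconds: number,
): number {
  const secondsToReset = Math.max(0, resetAtUnixSeconds - nowUnixSeconds);

  // The rule from the text: under 10% of the cap AND more than 30 s to reset.
  const nearCap = remaining * 10 < limit;
  if (!nearCap || secondsToReset <= 30) return 0;

  // Spread the remaining requests evenly over the rest of the window.
  return Math.ceil((secondsToReset * 1000) / Math.max(remaining, 1));
}
```

For a Pro org with 50 of 600 requests left and 40 seconds until reset, this spaces requests 800 ms apart, which uses the remaining budget without hitting the cap early.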

Quota vs rate limit

These are different. A rate limit is a per-minute or per-day cap that protects throughput. A quota is a tier-bound monthly cap on certain operations (e.g. "100 generations/month on Pro"). Exceeding a quota returns quota_exceeded (429); exceeding a rate limit returns rate_limited (429). The status is the same; the code in the envelope differs.
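Because both cases share status 429, client code has to branch on the envelope code. The envelope's exact shape is not shown on this page, so the sketch below is an assumption: it looks for a string code either at the top level or under an error object, and the function names are hypothetical.

```typescript
type ThrottleKind = "rate_limited" | "quota_exceeded" | "unknown";

// Pulls a string `code` out of a parsed JSON body.
// ASSUMPTION: the code sits at body.code or body.error.code.
function extractCode(body: unknown): string | null {
  if (typeof body !== "object" || body === null) return null;
  const record = body as Record<string, unknown>;
  if (typeof record["code"] === "string") return record["code"];
  const nested = record["error"];
  if (typeof nested === "object" && nested !== null) {
    const inner = (nested as Record<string, unknown>)["code"];
    if (typeof inner === "string") return inner;
  }
  return null;
}

// Returns null for non-429 responses, otherwise which kind of 429 it was.
function classify429(status: number, body: unknown): ThrottleKind | null {
  if (status !== 429) return null;
  const code = extractCode(body);
  if (code === "rate_limited" || code === "quota_exceeded") return code;
  return "unknown";
}
```

The distinction matters for retries: a rate_limited response should clear once the window resets, while a quota_exceeded response reflects a monthly cap, so retrying it after a few seconds is unlikely to help.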

Enterprise

Enterprise orgs get custom per-org caps, set by contract. The caps are configurable through the api.rate_limit.* config items (Owner role, Enterprise only).

Honest scope

Rate limits protect the system; they don't shape your application's UX. Buffer, queue, and retry in your code; treat 429 as routine, not exceptional.
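Treating 429 as routine usually means a small retry wrapper around each call. The following is a sketch under stated assumptions rather than the SDK's retry policy: withRetry, retryDelayMs, and ApiResponse are hypothetical names, the request and sleep functions are injected by the caller, and the exponential fallback (1 s, 2 s, 4 s, capped at 30 s) is a choice made for this example for cases where Retry-After is missing.

```typescript
interface ApiResponse {
  status: number;
  retryAfterSeconds: number | null; // parsed from Retry-After, if present
}

// Delay before the next attempt: honor Retry-After when the server sent it,
// otherwise fall back to capped exponential backoff (an assumption).
function retryDelayMs(retryAfterSeconds: number | null, attempt: number): number {
  if (retryAfterSeconds !== null && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  return Math.min(30000, 1000 * Math.pow(2, attempt - 1));
}

// Sends the request, retrying on 429 up to maxAttempts total sends.
// Returns the last response, which may still be a 429 if attempts ran out.
async function withRetry<T extends ApiResponse>(
  send: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 5,
): Promise<T> {
  let response = await send();
  for (let attempt = 1; response.status === 429 && attempt < maxAttempts; attempt++) {
    await sleep(retryDelayMs(response.retryAfterSeconds, attempt));
    response = await send();
  }
  return response;
}
```

Injecting sleep keeps the wrapper testable; in an application it could be a one-line promise around setTimeout. Callers still need to handle a final 429, for example by queueing the work for later.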