Rate limits

Preview

The API is rate-limited on two levels: a token bucket per API key for day-to-day fairness, and a ceiling per team so no single team can monopolize the platform. Every response tells you where you stand, and a 429 tells you exactly how long to wait.

Two levels: per-key bucket and per-team ceiling

  • Per-key token bucket. Each API key gets its own bucket that refills continuously. This is the limit you’ll meet in normal use, and it keeps one noisy script from starving your other integrations.
  • Per-team ceiling. A separate aggregate limit applies across all keys in a team. It exists for fairness and abuse resistance — minting more keys doesn’t buy more total throughput, because every key in the team draws against the same team ceiling.
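
To make the interaction between the two levels concrete, here is a minimal simulation sketch. It is not the platform's implementation: the `TokenBucket` class, the `allow` helper, and all the numbers (two keys at 100/min under a 150/min team ceiling) are assumptions chosen for illustration.

```python
class TokenBucket:
    """A bucket holding up to `capacity` tokens, refilled continuously."""

    def __init__(self, capacity, per_minute, now=0.0):
        self.capacity = float(capacity)
        self.rate = per_minute / 60.0   # tokens added per second
        self.tokens = float(capacity)   # start full
        self.updated = now

    def has_token(self, now):
        # Credit the tokens earned since the last check, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        return self.tokens >= 1.0

    def take(self):
        self.tokens -= 1.0


def allow(key_bucket, team_bucket, now):
    """Admit a request only if the key's bucket AND the team ceiling both have a token."""
    if key_bucket.has_token(now) and team_bucket.has_token(now):
        key_bucket.take()
        team_bucket.take()
        return True
    return False


# Hypothetical numbers: two keys at 100/min each under a 150/min team ceiling.
team = TokenBucket(150, 150)
key_a, key_b = TokenBucket(100, 100), TokenBucket(100, 100)
allowed_a = sum(allow(key_a, team, now=0.0) for _ in range(120))
allowed_b = sum(allow(key_b, team, now=0.0) for _ in range(120))
print(allowed_a, allowed_b)  # prints "100 50"
```

In this run, key A is stopped by its own bucket after 100 requests, while key B is stopped after 50 by the shared team ceiling even though its own bucket still holds tokens. That is the sense in which a second key adds no total throughput.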

Buckets are sized by action class, because a read costs far less than provisioning an environment:

| Action class | Example limit | Covers |
| --- | --- | --- |
| Read | 1000/min | GET requests: lists, gets, observability reads. |
| Write | 100/min | PATCH/PUT/DELETE and most state-changing POSTs. |
| Create | 5/min | Heavy provisioning: new sites and environments, clones, restores. |
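
A client that wants to track its standing per bucket needs to guess which action class a request falls into. The sketch below follows the table's rules. The `action_class` function and the paths in `HYPOTHETICAL_CREATE_PATHS` are illustrative assumptions, not documented endpoints, and an exact path match would not cover actions such as clones or restores, so treat the result as a best guess.

```python
# Example per-minute limits from the table above.
EXAMPLE_LIMITS = {"read": 1000, "write": 100, "create": 5}

# Hypothetical: these paths are placeholders, not documented endpoints.
HYPOTHETICAL_CREATE_PATHS = frozenset({"/v1/sites", "/v1/environments"})


def action_class(method, path, create_paths=HYPOTHETICAL_CREATE_PATHS):
    """Guess which bucket a request draws from, based on method and path."""
    method = method.upper()
    if method == "GET":
        return "read"
    if method in ("PATCH", "PUT", "DELETE"):
        return "write"
    if method == "POST":
        # Heavy provisioning POSTs count as "create"; other POSTs as "write".
        return "create" if path.rstrip("/") in create_paths else "write"
    raise ValueError(f"no action class documented for {method}")
```

The `X-RateLimit-Limit` header on the actual response remains the authoritative answer for which bucket was matched.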

Every response — not just 429s — carries your current standing, so you can throttle proactively instead of waiting to be rejected:

| Header | Meaning |
| --- | --- |
| X-RateLimit-Limit | The bucket’s ceiling for the matched action class. |
| X-RateLimit-Remaining | Tokens currently left in the bucket. |
| X-RateLimit-Reset | Unix timestamp when the bucket next refills to full. |
| Retry-After | On a 429 only: seconds to wait before retrying. |

Headers on a normal 200 response
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1782192000
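
One way to act on these headers is to parse them after each response and pause once the bucket runs low. The sketch below is one possible pacing policy, not a documented one: `Standing`, `parse_standing`, `pacing_delay`, and the 20% `low_water` threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Standing:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp from X-RateLimit-Reset


def parse_standing(headers):
    """Read the three X-RateLimit-* headers; header names are case-insensitive."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return Standing(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
    )


def pacing_delay(standing, now, low_water=0.2):
    """Seconds to pause before the next call (assumed policy).

    Above the low-water mark, do not wait. Below it, spread the remaining
    tokens evenly over the time left until the bucket is full again.
    """
    if standing.remaining > low_water * standing.limit:
        return 0.0
    seconds_left = max(0.0, standing.reset_at - now)
    if standing.remaining <= 0:
        return seconds_left
    return seconds_left / standing.remaining
```

With the headers shown above (87 of 100 tokens left), this policy would not pause at all. It starts spacing calls out only once 20 or fewer tokens remain.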

Exceed a limit and you get 429 Too Many Requests with a Retry-After header and the standard error envelope:

A rate-limited response
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1782192012
Content-Type: application/json

{
  "error": {
    "type": "rate_limit",
    "code": "rate_limit.exceeded",
    "message": "rate limit exceeded for write actions; retry after 12s",
    "doc_url": "https://docs.managed.dev/reference/error-codes/#rate_limit.exceeded"
  },
  "request_id": "req_01J9…"
}
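
A client can turn this response into a structured value before deciding how long to wait. The sketch below assumes the envelope has exactly the shape shown above and that `Retry-After` is always given in seconds. `RateLimitError` and `parse_rate_limit_error` are hypothetical names, not part of any SDK.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RateLimitError:
    retry_after: Optional[float]  # seconds, from the Retry-After header
    code: Optional[str]
    message: Optional[str]
    request_id: Optional[str]


def parse_rate_limit_error(status, headers, body):
    """Return a RateLimitError for a 429 response, or None for any other status."""
    if status != 429:
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("retry-after")
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        payload = {}  # tolerate a missing or malformed body
    if not isinstance(payload, dict):
        payload = {}
    error = payload.get("error") or {}
    return RateLimitError(
        retry_after=float(raw) if raw is not None else None,
        code=error.get("code"),
        message=error.get("message"),
        request_id=payload.get("request_id"),
    )
```

Logging `request_id` alongside the error code makes it easier to correlate a rejected call when reporting a problem.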

To see your current standing without making (and possibly burning) a real call, read GET /v1/rate-limits. It returns each bucket’s limit, remaining tokens, and reset time:

GET /v1/rate-limits
curl https://api.managed.dev/v1/rate-limits \
  -H "Authorization: Bearer mfk_live_…" \
  -H "Forge-Version: 2026-06-23"
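
The same request can be made from Python using only the standard library. This is a sketch mirroring the curl call above. The helper names are hypothetical, and because the exact response schema is not shown here, the function simply returns the parsed JSON as-is.

```python
import json
import urllib.request

API_BASE = "https://api.managed.dev"


def build_rate_limits_request(api_key, version="2026-06-23"):
    """Build (but do not send) the GET /v1/rate-limits request."""
    return urllib.request.Request(
        f"{API_BASE}/v1/rate-limits",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Forge-Version": version,
        },
        method="GET",
    )


def fetch_rate_limits(api_key):
    """Send the request and return the decoded JSON body, whatever its shape."""
    request = build_rate_limits_request(api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the header logic testable without network access.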

When you hit a 429, retry — but politely:

  1. Honor Retry-After first. It’s the authoritative wait. Sleep at least that long before retrying.
  2. Use exponential backoff with jitter for repeated failures: wait base * 2^attempt plus a random fraction of that delay, so a fleet of clients doesn’t retry in lockstep and re-spike the limit (the thundering-herd problem).
  3. Cap your retries, then surface the error. Don’t loop forever.
  4. Throttle proactively. Watch X-RateLimit-Remaining and slow down as it approaches zero, rather than sprinting into a wall of 429s.
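
The first three steps can be sketched as a small retry loop. This is an illustration under stated assumptions, not the SDKs' actual implementation: `send` is a hypothetical callable returning `(status, headers, body)`, and the `base`, `cap`, and `max_attempts` defaults are arbitrary.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based).

    Never shorter than Retry-After; jitter is a random fraction of the
    exponential term, so clients that share a Retry-After still spread out.
    """
    exponential = min(cap, base * (2 ** attempt))
    floor = exponential if retry_after is None else max(retry_after, exponential)
    return floor + rng() * exponential


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # retries are capped: surface the error instead of looping
        lowered = {k.lower(): v for k, v in headers.items()}
        raw = lowered.get("retry-after")
        retry_after = float(raw) if raw is not None else None
        sleep(backoff_delay(attempt, retry_after))
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")
```

Passing `sleep` and `rng` as parameters is a design choice for testability: a test can record the delays instead of actually waiting.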

The official SDKs implement this backoff for you by default, so you get retry-with-jitter without writing the loop.