Rate limits

Preview

The API is rate-limited with token buckets, per key and per team. Every response carries the headers you need to pace yourself, and a 429 tells you exactly how long to wait. This page lists the concrete limits and the introspection endpoint.

the buckets

There are two layers, and a request must satisfy both. Per-key buckets give each credential predictable headroom; the per-team ceiling keeps one noisy team — or a team that mints many keys — from starving everyone else.

Bucket	Limit (example)	Applies to
Per-key — read (`GET`)	1000 / min	each API key
Per-key — write (`POST`/`PATCH`/`PUT`/`DELETE`)	100 / min	each API key
Per-key — create (job-spawning `POST`)	5 / min	each API key
Per-team ceiling	5000 / min aggregate	all keys on a team
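
To make the two-layer rule concrete, here is a minimal client-side sketch of a request being charged against both a per-key bucket and the team ceiling. It is only an illustration: the `TokenBucket` class and `allow` function are hypothetical names, and continuous per-second refill is an assumption, not a description of how the server is implemented.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical client-side model of one bucket; not the server's code."""
    capacity: float          # maximum tokens, e.g. 100 for the write class
    refill_per_sec: float    # assumed continuous refill, e.g. 100 / 60
    tokens: float = -1.0     # current tokens; a negative value means "start full"
    updated: float = 0.0     # timestamp of the last refill

    def __post_init__(self) -> None:
        if self.tokens < 0:
            self.tokens = self.capacity

    def refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now


def allow(key_bucket: TokenBucket, team_bucket: TokenBucket, now: float) -> bool:
    """Admit a request only if BOTH layers have a token, then charge both."""
    key_bucket.refill(now)
    team_bucket.refill(now)
    if key_bucket.tokens >= 1 and team_bucket.tokens >= 1:
        key_bucket.tokens -= 1
        team_bucket.tokens -= 1
        return True
    return False
```

In this model a rejected request charges neither bucket, so a key that is blocked by the team ceiling keeps its own headroom for later.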

response headers

Every response includes the state of the bucket the request drew from, so a well-behaved client never needs to hit a 429 to discover the limit:

Header	Meaning
`X-RateLimit-Limit`	The bucket’s ceiling for this request class.
`X-RateLimit-Remaining`	Requests left in the current window.
`X-RateLimit-Reset`	Unix epoch seconds when the window refills.
`Retry-After`	Seconds to wait — present only on a `429`.

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 994
X-RateLimit-Reset: 1782192000
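
One way a raw client might use these headers is to spread its remaining requests evenly over the time left before the reset. The sketch below assumes the headers arrive as a plain string mapping; `pacing_delay` is a hypothetical helper, and even spacing is just one conservative policy, not something the API requires.

```python
from typing import Mapping


def pacing_delay(headers: Mapping[str, str], now: float) -> float:
    """Suggest how many seconds to wait before the next request.

    `now` is the current time in Unix epoch seconds, the same unit as
    X-RateLimit-Reset.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lower = {name.lower(): value for name, value in headers.items()}
    remaining = int(lower["x-ratelimit-remaining"])
    reset_at = int(lower["x-ratelimit-reset"])

    seconds_left = max(0.0, reset_at - now)
    if remaining <= 0:
        # Nothing left in this window: wait until the reset.
        return seconds_left
    # Otherwise spread the remaining requests over the time that is left.
    return seconds_left / remaining
```

With the example response above and 60 seconds left until the reset, this suggests waiting roughly 0.06 seconds between read requests.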

the 429 body

When a bucket empties, you get a 429 with a Retry-After header and the standard error envelope. Wait at least the stated number of seconds before retrying; do not retry sooner.

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1782192012

{
  "error": {
    "type": "rate_limit",
    "code": "rate_limit.exceeded",
    "message": "rate limit exceeded; retry after 12s",
    "doc_url": "https://docs.managed.dev/reference/error-codes/#rate_limit.exceeded"
  },
  "request_id": "req_01J9F2KQ"
}

The SDKs honor Retry-After automatically with backoff, so you rarely handle a 429 by hand unless you’ve written a raw client.
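
If you do write a raw client, the retry loop might look like the sketch below. It is transport-agnostic: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and the 1-second fallback for a missing Retry-After is an assumption of this example, not documented behavior.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns something other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            # Success, a non-rate-limit error, or the final attempt: hand it back.
            return status, headers, body
        # Wait for the number of seconds the server asked for.
        # Assumption: fall back to 1 second if the header is missing.
        sleep(float(headers.get("Retry-After", "1")))

    raise AssertionError("unreachable: the loop always returns")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays; in production code the default `time.sleep` applies.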

introspecting your limits

Don’t hard-code the numbers — read them. GET /v1/rate-limits returns the live buckets and their current state for the calling key, so a client can self-throttle:

curl https://api.managed.dev/v1/rate-limits \
  -H "Authorization: Bearer mfk_live_…"

{
  "data": {
    "buckets": [
      { "class": "read",   "limit": 1000, "remaining": 994, "reset": 1782192000 },
      { "class": "write",  "limit": 100,  "remaining": 100, "reset": 1782192000 },
      { "class": "create", "limit": 5,    "remaining": 4,   "reset": 1782192000 }
    ],
    "team_ceiling": { "limit": 5000, "remaining": 4980, "reset": 1782192000 }
  },
  "request_id": "req_01J9F2KQ"
}
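
A client that has fetched this payload could combine the per-key buckets with the team ceiling to see how many requests it can actually make in each class. The sketch below works on the already-parsed JSON; `effective_remaining` is a hypothetical helper, and taking the minimum of the two layers is an assumption based on the rule that a request must satisfy both.

```python
from typing import Any, Dict, Mapping


def effective_remaining(payload: Mapping[str, Any]) -> Dict[str, int]:
    """Return the usable request count per class from a /v1/rate-limits payload."""
    data = payload["data"]
    team_left = data["team_ceiling"]["remaining"]
    # A request draws from both layers, so the tighter one is the real limit.
    return {
        bucket["class"]: min(bucket["remaining"], team_left)
        for bucket in data["buckets"]
    }


# The example response shown above, as a Python dictionary.
example = {
    "data": {
        "buckets": [
            {"class": "read", "limit": 1000, "remaining": 994, "reset": 1782192000},
            {"class": "write", "limit": 100, "remaining": 100, "reset": 1782192000},
            {"class": "create", "limit": 5, "remaining": 4, "reset": 1782192000},
        ],
        "team_ceiling": {"limit": 5000, "remaining": 4980, "reset": 1782192000},
    },
    "request_id": "req_01J9F2KQ",
}

print(effective_remaining(example))
```

When the team ceiling is nearly exhausted, it becomes the binding limit for every class, even for a key whose own buckets are still full.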

limits scale with your plan

Rate limits track your plan tier: higher tiers get larger per-key buckets and a higher team ceiling, matching the larger fleets they run. The exact per-tier numbers are still being finalized together with the plan limits, so the figures on this page should be read as examples until then.

next steps

Rate limits concept: How token buckets, backoff, and the per-key/per-team layering work.

Plans & limits: The plan tiers your rate limits and quotas scale with.

Error codes: rate_limit.exceeded and every other error you might hit.

SDKs: Clients that honor Retry-After and back off for you.