Rate limits

Preview

The API is rate-limited with token buckets, per key and per team. Every response carries the headers you need to pace yourself, and a 429 tells you exactly how long to wait. This page lists the concrete limits and the introspection endpoint.

There are two layers, and a request must satisfy both. Per-key buckets give each credential predictable headroom; the per-team ceiling keeps one noisy team — or a team that mints many keys — from starving everyone else.

| Bucket | Limit (example) | Applies to |
|---|---|---|
| Per-key: read (GET) | 1000 / min | each API key |
| Per-key: write (POST/PATCH/PUT/DELETE) | 100 / min | each API key |
| Per-key: create (job-spawning POST) | 5 / min | each API key |
| Per-team ceiling | 5000 / min aggregate | all keys on a team |
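The two-layer rule can be illustrated with a small token-bucket model. This is a minimal sketch, not the service's actual implementation: the `TokenBucket` class, the `allow` function, and the continuous per-second refill are all assumptions made for illustration, and the capacities are the example limits from the table.

```python
import time


class TokenBucket:
    """Hypothetical bucket of `capacity` tokens that refills evenly over 60 s."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.rate = capacity / 60.0    # tokens added per second (assumed refill model)
        self.tokens = float(capacity)  # start full
        self.clock = clock
        self.updated = clock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def available(self):
        self._refill()
        return self.tokens >= 1.0

    def take(self):
        self.tokens -= 1.0


def allow(key_bucket, team_bucket):
    """Admit a request only if BOTH layers have a token, then charge both."""
    if key_bucket.available() and team_bucket.available():
        key_bucket.take()
        team_bucket.take()
        return True
    return False
```

With a per-key create bucket of 5 and a team ceiling of 5000, the sixth back-to-back create request is refused by the key bucket even though the team ceiling still has plenty of headroom.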

Every response includes the state of the bucket the request drew from, so a well-behaved client never needs to hit a 429 to discover the limit:

| Header | Meaning |
|---|---|
| X-RateLimit-Limit | The bucket’s ceiling for this request class. |
| X-RateLimit-Remaining | Requests left in the current window. |
| X-RateLimit-Reset | Unix epoch seconds when the window refills. |
| Retry-After | Seconds to wait; present only on a 429. |

rate-limit headers on a normal response
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 994
X-RateLimit-Reset: 1782192000
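A client could use these headers to decide whether to pause before its next request. The following is a minimal sketch under stated assumptions: `seconds_until_allowed` is a hypothetical helper, the headers arrive as a plain dict with the exact capitalization shown above (many HTTP libraries expose a case-insensitive mapping instead), and the client's clock is close to the server's.

```python
import time


def seconds_until_allowed(headers, now=None):
    """Return how long to pause before the next request in this class."""
    now = time.time() if now is None else now
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix epoch seconds
    if remaining > 0:
        return 0.0  # budget left, no need to wait
    # Bucket is empty: wait until the stated reset time (never a negative delay).
    return float(max(0, reset_at - now))
```

Checking the result after every response lets the client stop sending once the remaining count reaches zero, rather than discovering the limit through a 429.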

When a bucket empties you get a 429 with Retry-After and the standard error envelope. Back off for at least the stated number of seconds; don’t retry sooner.

a rate-limited response
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1782192012

429 error body
{
  "error": {
    "type": "rate_limit",
    "code": "rate_limit.exceeded",
    "message": "rate limit exceeded; retry after 12s",
    "doc_url": "https://docs.managed.dev/reference/error-codes/#rate_limit.exceeded"
  },
  "request_id": "req_01J9F2KQ"
}

The SDKs honor Retry-After automatically with backoff, so you rarely handle a 429 by hand unless you’ve written a raw client.
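For a raw client, honoring Retry-After by hand might look like the sketch below. It does not show how the SDKs behave internally; `request_with_retry` and the `send` callable (which returns a `(status, headers, body)` tuple) are hypothetical names, and the code assumes Retry-After is given in seconds, as on this page.

```python
import time


def request_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning
    (status_code, headers_dict, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break  # no point sleeping after the final attempt
        # Wait the server-stated seconds; fall back to 1 s if the header is missing.
        sleep(float(headers.get("Retry-After", 1)))
    raise RuntimeError("still rate-limited after %d attempts" % max_attempts)
```

Passing `sleep` as a parameter keeps the helper easy to test, and capping the attempts stops a persistently throttled client from looping forever.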

Don’t hard-code the numbers — read them. GET /v1/rate-limits returns the live buckets and their current state for the calling key, so a client can self-throttle:

Check your current rate-limit state
curl https://api.managed.dev/v1/rate-limits \
  -H "Authorization: Bearer mfk_live_…"

response
{
  "data": {
    "buckets": [
      { "class": "read", "limit": 1000, "remaining": 994, "reset": 1782192000 },
      { "class": "write", "limit": 100, "remaining": 100, "reset": 1782192000 },
      { "class": "create", "limit": 5, "remaining": 4, "reset": 1782192000 }
    ],
    "team_ceiling": { "limit": 5000, "remaining": 4980, "reset": 1782192000 }
  },
  "request_id": "req_01J9F2KQ"
}
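One way a client might use this response to self-throttle is to compute the effective budget for a request class as the smaller of the per-key bucket and the team ceiling, since a request must satisfy both. This is a sketch only: `remaining_budget` is a hypothetical helper, and `payload` is assumed to be the already-parsed JSON body in the shape shown above.

```python
def remaining_budget(payload, request_class):
    """Return how many more requests of `request_class` both layers allow."""
    data = payload["data"]
    team_left = data["team_ceiling"]["remaining"]
    for bucket in data["buckets"]:
        if bucket["class"] == request_class:
            # The tighter of the two layers is the real limit.
            return min(bucket["remaining"], team_left)
    raise KeyError("unknown request class: %s" % request_class)


# Example payload copied from the response above.
payload = {
    "data": {
        "buckets": [
            {"class": "read", "limit": 1000, "remaining": 994, "reset": 1782192000},
            {"class": "write", "limit": 100, "remaining": 100, "reset": 1782192000},
            {"class": "create", "limit": 5, "remaining": 4, "reset": 1782192000},
        ],
        "team_ceiling": {"limit": 5000, "remaining": 4980, "reset": 1782192000},
    },
    "request_id": "req_01J9F2KQ",
}
```

For the sample payload, the per-key buckets are the binding limit in every class; the team ceiling would only become the limit if other keys on the team had consumed most of it.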

Rate limits track your plan tier: higher tiers get larger per-key buckets and a higher team ceiling, matching the larger fleets they run. The exact per-tier numbers are part of the same catalog reconciliation as the plan limits, so the limits shown on this page are examples until those numbers are finalized.