Moda enforces two kinds of limits on every API key:
- Request rate: how many HTTP calls per minute your key can make.
- Concurrent design tasks: how many agent-driven design tasks (`POST /v1/tasks`, `POST /v1/remix` with a prompt) your organization can have running at once.
Request rate
Every rate-limited response carries a `Retry-After` header. If you exceed your per-minute budget, you receive `429 Too Many Requests`. Back off and retry after the interval given in the header.
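The back-off described above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: `send` is a hypothetical callable that performs one request and returns `(status_code, headers, body)`, the headers are assumed to be a plain dict keyed by `Retry-After`, and the header value is assumed to be a number of seconds.

```python
import random
import time


def retry_delay(retry_after, default=1.0, max_jitter=1.0):
    """Seconds to wait before retrying, given a Retry-After header value.

    Assumes the header holds a number of seconds; anything else (missing
    header, unparsable value) falls back to `default`.
    """
    try:
        base = float(retry_after)
    except (TypeError, ValueError):
        base = default
    # Random jitter keeps many clients from retrying at the same instant.
    return max(base, 0.0) + random.uniform(0.0, max_jitter)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers.get("Retry-After")))
    # Still rate-limited after the last attempt: hand the 429 to the caller.
    return status, body
```

Passing `sleep` as a parameter is only a convenience for testing; in production code the default `time.sleep` is what you would normally use.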
Default request-rate limits
| Plan | Requests per minute |
|---|---|
| free | 120 |
| free_beta | 120 |
| paid | 120 |
| ultra | 120 |
Moda-Version header.
Concurrent design tasks
Every organization has a cap on how many design tasks can be queued or running at the same time. The cap applies across:
- `POST /v1/tasks` (REST API): counts against your org’s task cap.
- `POST /v1/remix` with a `prompt` (REST API): counts against your org’s task cap.
- Design-dispatch tools on the Moda MCP server (`start_design_task`, `remix_design`): also count against your org’s task cap.
- Webhook-triggered tasks: count against your org’s task cap.
- Interactive design sessions in the Moda web app (WebSocket).
- Slack integration traffic.
- Internal / background jobs we run on your behalf.
When the cap is reached, requests that start a new task return `429 Too Many Requests` with `Retry-After: 30`.
`GET /v1/tasks/{id}` is unaffected by the cap; only starting new tasks counts toward it.
Default concurrent-task limits
| Plan | Max concurrent tasks |
|---|---|
| free | 3 |
| free_beta | 3 |
| paid | 10 |
| ultra | 15 |
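One way to stay under these caps when generating a batch is to submit it in waves no larger than the cap and wait for each wave before starting the next. The sketch below assumes the default limits from the table; `submit` and `wait_all` are hypothetical callables standing in for your own task-creation and polling code, not functions provided by Moda.

```python
# Default caps copied from the table above; your organization's cap may differ.
DEFAULT_TASK_CAPS = {"free": 3, "free_beta": 3, "paid": 10, "ultra": 15}


def waves(items, cap):
    """Yield consecutive slices of `items`, each at most `cap` long."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    for start in range(0, len(items), cap):
        yield items[start:start + cap]


def run_in_waves(prompts, cap, submit, wait_all):
    """Submit prompts wave by wave, finishing each wave before the next.

    `submit(prompt)` is assumed to start one task and return its id;
    `wait_all(task_ids)` is assumed to block until those tasks finish
    and return their results. Both are hypothetical placeholders.
    """
    results = []
    for wave in waves(prompts, cap):
        task_ids = [submit(prompt) for prompt in wave]
        results.extend(wait_all(task_ids))
    return results
```

With the default `paid` cap of 10, a batch of 50 prompts is split into five waves of 10, so at most 10 tasks from this batch are in flight at any time.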
Recommended client behavior
- Use the `Retry-After` header. It’s populated on both rate-limit and concurrency-limit responses.
- Back off with jitter, not a tight loop. A simple `time.sleep(retry_after + random())` is enough.
- Pipeline, don’t parallel-spray. If you have a batch of 50 designs to generate, submit them in waves of N (where N = your cap) and wait for each wave to finish before the next. See task polling.
- Share one API key across your integration, not one per deployment. Concurrency is per-org, not per-key, so using multiple keys does not raise your effective cap.
- Branch on `error.type`, not on HTTP status. Both the rpm limit and the concurrency cap return 429; the `type` field (`rate_limited`) is the same for both, but future error codes may differentiate them via `error.code` (`rate_limit_exceeded` vs `concurrency_limit_exceeded`).
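The last recommendation can be sketched as a small classifier. The response shape used here, a JSON object with a nested `error` object carrying `type` and `code`, is an assumption inferred from the field names above rather than a documented schema, and `classify_429` is a hypothetical helper name. Because the distinguishing `error.code` values are described as something that may appear in the future, the sketch treats a missing or unrecognized code as "unknown" instead of guessing.

```python
def classify_429(status, body):
    """Say which limit a 429 response refers to, if that can be determined.

    Returns "request_rate", "concurrency", "unknown", or None when the
    response is not a rate-limit error. `body` is the parsed JSON body,
    assumed (not confirmed) to look like {"error": {"type": ..., "code": ...}}.
    """
    if status != 429 or not isinstance(body, dict):
        return None
    error = body.get("error")
    if not isinstance(error, dict) or error.get("type") != "rate_limited":
        return None
    code = error.get("code")
    if code == "concurrency_limit_exceeded":
        return "concurrency"
    if code == "rate_limit_exceeded":
        return "request_rate"
    # The type says rate-limited, but the code does not say which limit.
    return "unknown"
```

A caller might shrink its wave size on "concurrency", slow its request loop on "request_rate", and simply honor `Retry-After` on "unknown".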
Need a higher limit?
Email support@moda.app with:
- Your organization name or slug.
- Which limit you’re hitting (request rate, concurrent tasks).
- The peak concurrency or rpm your integration needs.
- A sentence on the use case.