Moda enforces two kinds of limits on every API key:
  • Request rate — how many HTTP calls per minute your key can make.
  • Concurrent design tasks — how many agent-driven design tasks (POST /v1/tasks, POST /v1/remix with a prompt) your organization can have running at once.
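The concurrent-task cap is scoped to your organization: multiple API keys under the same org share the same budget, and its default scales with your plan. The request-rate limit is currently a flat limit applied per API key, as described under Request rate.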

Request rate

Every rate-limited response carries a Retry-After header. If you exceed your per-minute budget, you receive:
{
  "error": {
    "type": "rate_limited",
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded.",
    "retry_after_ms": 30000
  }
}
The response has HTTP status 429 Too Many Requests. Back off and retry after the delay given by the Retry-After header or the retry_after_ms field (milliseconds).
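As a minimal sketch of reading the retry hint, the helper below prefers the Retry-After header (seconds) and falls back to retry_after_ms from the error body. The function name, the 1-second fallback, and the assumption that the header holds a plain number of seconds are illustrative, not part of the Moda API.

```python
from typing import Mapping, Optional

DEFAULT_DELAY_S = 1.0  # assumption: fallback used when the response carries no hint


def retry_delay_seconds(headers: Mapping[str, str], body: Optional[dict]) -> float:
    """Return how many seconds to wait before retrying a 429 response."""
    # Normalize header names, since a plain dict lookup is case-sensitive.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is not None:
        try:
            return max(0.0, float(raw))
        except ValueError:
            pass  # not a number (e.g. an HTTP-date); fall through to the body hint

    # Fall back to the millisecond hint in the JSON error body, if present.
    error = (body or {}).get("error") or {}
    ms = error.get("retry_after_ms")
    if isinstance(ms, (int, float)):
        return max(0.0, ms / 1000.0)

    return DEFAULT_DELAY_S
```

With the example response above, `retry_delay_seconds({}, {"error": {"retry_after_ms": 30000}})` evaluates to 30.0 seconds.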

Default request-rate limits

Plan         Requests per minute
free         120
free_beta    120
paid         120
ultra        120
Today the rate limit is a flat 120 rpm for all plans, applied per API key. We plan to move to per-plan and per-org limits in a future release; when we do, we’ll pre-announce via our changelog and the Moda-Version header.
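One way to stay under the 120 rpm budget is to pace requests on the client. The sketch below is a sliding-window counter with an injectable clock; the class name and the sliding-window behavior are assumptions for illustration, since this page does not specify how the server measures its minute window.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteBudget:
    """Client-side sliding-window request counter (a sketch, not Moda's server logic)."""

    def __init__(
        self,
        limit: int = 120,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds until the next request fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return self.window_s - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self.clock())
```

A caller would sleep for `wait_time()` seconds before each request and call `record()` right after sending it. This only reduces the chance of a 429; the server's response remains the source of truth.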

Concurrent design tasks

Every organization has a cap on how many design tasks can be queued or running at the same time. All of the following count against your org’s task cap:
  • POST /v1/tasks (REST API).
  • POST /v1/remix with a prompt (REST API).
  • Design-dispatch tools on the Moda MCP server (start_design_task, remix_design).
  • Webhook-triggered tasks.
The cap does not apply to:
  • Interactive design sessions in the Moda web app (WebSocket).
  • Slack integration traffic.
  • Internal / background jobs we run on your behalf.
When your org is at its cap, new requests return 429 Too Many Requests with Retry-After: 30:
{
  "error": {
    "type": "rate_limited",
    "code": "concurrency_limit_exceeded",
    "message": "Rate limit exceeded: 10/10 concurrent tasks active. Please wait for existing tasks to complete before submitting new ones."
  }
}
Polling your existing tasks via GET /v1/tasks/{id} is unaffected by the cap — only starting new tasks counts toward it.
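Because both limits answer with 429, a client can tell them apart from the error body. The following sketch classifies a response using the error.type and error.code values shown in the two examples on this page; the function name and its return labels are hypothetical.

```python
from typing import Optional


def classify_limit(status: int, body: Optional[dict]) -> Optional[str]:
    """Return which limit a response hit, or None if it is not a limit error."""
    if status != 429:
        return None
    error = (body or {}).get("error") or {}
    if error.get("type") != "rate_limited":
        return None
    code = error.get("code")
    if code == "rate_limit_exceeded":
        return "request_rate"
    if code == "concurrency_limit_exceeded":
        return "concurrent_tasks"
    # A rate_limited error with a code this sketch does not know about.
    return "unknown"
```

A request-rate hit usually calls for a short pause, while a concurrency hit means waiting for in-flight tasks to finish, so the two cases are worth handling separately.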

Default concurrent-task limits

Plan         Max concurrent tasks
free         3
free_beta    3
paid         10
ultra        15
These defaults are tuned for the typical integration. If you consistently hit your cap and your workflow legitimately requires more parallelism, contact support; we can raise your org’s limit without a plan change.

To work within both limits:
  • Use the Retry-After header. It’s populated on both rate-limit and concurrency-limit responses.
  • Back off with jitter, not a tight loop. A simple time.sleep(retry_after + random()) is enough.
  • Pipeline, don’t parallel-spray. If you have a batch of 50 designs to generate, submit them in waves of N (where N = your cap) and wait for each wave to finish before the next. See task polling.
  • Share one API key across your integration, not one per deployment. Concurrency is per-org, not per-key, so using multiple keys does not raise your effective cap.
  • Branch on error.type and error.code, not on HTTP status alone. Both the rpm limit and the concurrency cap return 429 with the same error.type (rate_limited); error.code tells them apart (rate_limit_exceeded vs concurrency_limit_exceeded).
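The jittered backoff and wave-based submission described above can be sketched as follows. The `submit` and `wait_all` callables are hypothetical stand-ins for your own code that calls POST /v1/tasks and polls GET /v1/tasks/{id}; they are not Moda SDK functions, and the 1-second jitter ceiling is an arbitrary choice.

```python
import random
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def jittered_delay(retry_after_s: float, max_jitter_s: float = 1.0) -> float:
    """Retry hint plus random jitter, so many clients do not retry in lockstep."""
    return retry_after_s + random.uniform(0.0, max_jitter_s)


def waves(items: Sequence[T], cap: int) -> Iterator[List[T]]:
    """Split items into consecutive batches of at most `cap` items."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    for start in range(0, len(items), cap):
        yield list(items[start:start + cap])


def run_in_waves(
    items: Sequence[T],
    cap: int,
    submit: Callable[[T], str],
    wait_all: Callable[[List[str]], List[R]],
) -> List[R]:
    """Submit one wave, wait for it to finish, then move on to the next."""
    results: List[R] = []
    for wave in waves(items, cap):
        task_ids = [submit(item) for item in wave]  # at most `cap` tasks in flight
        results.extend(wait_all(task_ids))          # block until the wave completes
    return results
```

With a cap of 10, a batch of 50 designs becomes five waves of 10. If a submission still returns 429, sleeping for `jittered_delay(retry_after)` before retrying matches the backoff advice above.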

Need a higher limit?

Email support@moda.app with:
  • Your organization name or slug.
  • Which limit you’re hitting (request rate or concurrent tasks).
  • The peak concurrency or rpm your integration needs.
  • A sentence on the use case.
We grant case-by-case per-org overrides for legitimate workloads — you don’t need to move plan tiers to get a temporary or permanent bump.