Operations
Reliability
Operational rules for idempotency, rate limits, retries, concurrency, and short-lived output URLs.
Idempotency
Replay-safe writes
Every write endpoint requires an `Idempotency-Key`. Reusing the same key with the same payload is safe: the API returns the original job instead of writing again. Reusing the key with a different payload is rejected.
| Behavior | Result |
|---|---|
| Same key, same payload | Returns the original job instead of creating a duplicate write. |
| Same key, different payload | Returns `409 idempotency_conflict`. |
| Retention window | Idempotency records are retained for 24 hours. |
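The three rows above can be read as a small state machine. The following is a toy in-memory model of that behavior, not the service's implementation. The class name, the SHA-256 payload fingerprint, the `"created"` / `"replayed"` / `"conflict"` labels, and the choice to treat an expired record as a fresh write are all assumptions made for illustration.

```python
import hashlib
import json
import time

RETENTION_SECONDS = 24 * 60 * 60  # 24-hour retention window from the table


class IdempotencyModel:
    """Illustrative model of the documented rules (hypothetical, not the real server)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._records = {}  # idempotency key -> stored record

    @staticmethod
    def _fingerprint(payload):
        # Canonical JSON so that key order does not change the fingerprint.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def submit(self, key, payload, create_job):
        now = self._clock()
        fingerprint = self._fingerprint(payload)
        record = self._records.get(key)
        if record is not None and now - record["stored_at"] < RETENTION_SECONDS:
            if record["fingerprint"] == fingerprint:
                # Same key, same payload: return the original job.
                return "replayed", record["job"]
            # Same key, different payload: 409 idempotency_conflict.
            return "conflict", None
        # Unknown key, or record older than the window (assumed to act as new).
        job = create_job(payload)
        self._records[key] = {"fingerprint": fingerprint, "job": job, "stored_at": now}
        return "created", job
```

On the client side, the practical consequence is to generate one key per logical write and store it alongside the payload, so that any replay sends exactly the same pair.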
Limits
Rate limits and concurrency
The API applies layered limits per key, per organization, and in some cases per IP. Limit checks happen before expensive work starts.
| Route | Limit |
|---|---|
| `POST /api/v1/source-assets` | 120 requests / minute |
| `POST /api/v1/generations` | 120 requests / minute |
| `POST /api/v1/generations/{job_id}/regenerate` | 120 requests / minute |
| `GET /api/v1/generations/{job_id}` | 1,200 requests / minute |
| `GET /api/v1/usage/summary` | 120 requests / minute |
| `GET /api/v1/billing/events` | 120 requests / minute |
| `source_url` fetch jobs | 60 requests / minute |
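A client can avoid most 429 responses by throttling itself below the published limits. The sketch below is one way to do that with a sliding 60-second window. How the service actually measures a minute (sliding or fixed window) is not stated here, so the window model, the class name, and the `limiter_for` helper are assumptions. It also tracks only one process, while the real limits are layered per key, per organization, and sometimes per IP.

```python
import time
from collections import deque

# Per-minute limits copied from the table above.
ROUTE_LIMITS_PER_MINUTE = {
    "POST /api/v1/source-assets": 120,
    "POST /api/v1/generations": 120,
    "POST /api/v1/generations/{job_id}/regenerate": 120,
    "GET /api/v1/generations/{job_id}": 1200,
    "GET /api/v1/usage/summary": 120,
    "GET /api/v1/billing/events": 120,
}


class SlidingWindowLimiter:
    """Client-side throttle (hypothetical helper, not part of the API)."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests inside the window

    def try_acquire(self):
        """Return True and record the request if it fits within the limit."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) >= self._limit:
            return False
        self._sent.append(now)
        return True


def limiter_for(route, clock=time.monotonic):
    """Build a limiter for one of the documented routes."""
    return SlidingWindowLimiter(ROUTE_LIMITS_PER_MINUTE[route], clock=clock)
```

Call `try_acquire()` before each request and delay the request when it returns `False`. A 429 from the API remains possible, for example when other clients share the same organization.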
Concurrent jobs
Each organization can have at most 50 generation jobs in flight at one time. The cap is organization-wide, so it is shared by every API key in the organization.
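A submitter can track its own in-flight jobs so that it stops creating work before it reaches the cap. The sketch below uses a bounded semaphore from the standard library. `InFlightGate` is a hypothetical helper, and it only sees jobs started by the current process, so it is a safeguard rather than a guarantee when several services submit for the same organization.

```python
import threading

ORG_IN_FLIGHT_CAP = 50  # organization-wide cap stated above


class InFlightGate:
    """Local counter of in-flight jobs (hypothetical helper, not part of the API)."""

    def __init__(self, cap=ORG_IN_FLIGHT_CAP):
        self._slots = threading.BoundedSemaphore(cap)

    def try_start(self):
        """Reserve a slot before submitting a job; False means the cap is reached."""
        return self._slots.acquire(blocking=False)

    def finish(self):
        """Release the slot once the job reaches a terminal state."""
        self._slots.release()
```

Reserve a slot with `try_start()` before creating a generation and call `finish()` when polling shows the job is done. Setting the local cap below 50 leaves headroom for other submitters.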
Retry
Retry strategy
Retry only when the failure mode is explicitly retryable. Keep the same idempotency key when repeating the same write.
| Response | Action |
|---|---|
| 429 | Retry. |
| 5xx | Retry. |
| Most other 4xx | Do not blindly retry. |
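The retry rules can be sketched as a small wrapper that retries only 429 and 5xx responses and reuses a single idempotency key across attempts. The `send` callable, the attempt count, and the exponential backoff with full jitter are assumptions for illustration. This page does not prescribe delays, so tune them to your workload.

```python
import random
import time
import uuid


def is_retryable(status):
    """429 and 5xx are retryable; other 4xx responses are not."""
    return status == 429 or 500 <= status <= 599


def send_with_retries(send, payload, max_attempts=5, base_delay=0.5,
                      max_delay=30.0, sleep=time.sleep):
    """Call send(payload, idempotency_key) until it succeeds or attempts run out.

    `send` is a hypothetical callable that performs one HTTP write and
    returns a (status, body) tuple.
    """
    # Generated once and reused, so every attempt is the same logical write.
    idempotency_key = str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        status, body = send(payload, idempotency_key)
        if not is_retryable(status) or attempt == max_attempts:
            return status, body
        # Exponential backoff with full jitter (an assumed policy).
        ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
        sleep(random.uniform(0, ceiling))
```

Because the key stays constant, a retry that arrives after the first attempt actually succeeded returns the original job rather than creating a second one.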
Freshness
Signed URLs are temporary
Returned output URLs are intentionally short-lived. API responses also send `Cache-Control: no-store` so clients always read fresh job and billing state.
Practical implication
Re-read the job if a signed output URL has expired. Do not assume the first URL returned for a job remains valid for the lifetime of your workflow.
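One way to follow this advice is to fetch the job immediately before every download and to re-read it when the download reports an expired URL. In the sketch below, `get_job`, `download`, the `output_url` field name, and the `UrlExpired` exception are all hypothetical stand-ins, since the response schema and the exact expiry error are not described on this page.

```python
class UrlExpired(Exception):
    """Raised by the hypothetical download callable when a signed URL has expired."""


def download_output(job_id, get_job, download, max_refreshes=2):
    """Download a job's output, re-reading the job whenever its URL has expired.

    get_job(job_id) is assumed to return a dict containing "output_url".
    download(url) is assumed to return bytes or raise UrlExpired.
    """
    for _ in range(max_refreshes + 1):
        job = get_job(job_id)  # always re-read; never reuse a stored URL
        try:
            return download(job["output_url"])
        except UrlExpired:
            continue  # the next iteration fetches a freshly signed URL
    raise UrlExpired(f"output URL for job {job_id} kept expiring")
```

Storing the job ID rather than the URL keeps long-running workflows correct, because the ID stays valid while each signed URL does not.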
