Rate Limits
Ryvion uses a pay-per-use model with spending velocity controls rather than hard request rate limits.
Spending velocity
New accounts are limited to $5 CAD of spending per hour. This prevents runaway costs from misconfigured clients or compromised API keys. The limit is enforced over a rolling one-hour window.
As your account history builds, the velocity limit increases automatically. Contact support to raise it immediately if needed.
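To make the rolling window concrete, here is a minimal client-side sketch that estimates spend over the last hour before sending a request. The `SpendWindow` class and its method names are hypothetical and not part of any Ryvion client; the per-request costs you record, and the exact moment a charge ages out of the window, are assumptions. The server-side limit remains the source of truth.

```python
import time
from collections import deque
from typing import Deque, Optional, Tuple


class SpendWindow:
    """Client-side estimate of spend over a rolling window (hypothetical helper)."""

    def __init__(self, limit_cad: float = 5.00, window_s: float = 3600.0) -> None:
        self.limit_cad = limit_cad
        self.window_s = window_s
        # (timestamp, cost) pairs, appended in chronological order
        self._events: Deque[Tuple[float, float]] = deque()

    def _evict(self, now: float) -> None:
        # Drop charges that are at least one full window old (assumed boundary).
        while self._events and now - self._events[0][0] >= self.window_s:
            self._events.popleft()

    def spent(self, now: Optional[float] = None) -> float:
        """Total cost recorded inside the current window."""
        now = time.time() if now is None else now
        self._evict(now)
        return sum(cost for _, cost in self._events)

    def can_spend(self, cost_cad: float, now: Optional[float] = None) -> bool:
        """True if one more charge of cost_cad would stay within the limit."""
        return self.spent(now) + cost_cad <= self.limit_cad

    def record(self, cost_cad: float, now: Optional[float] = None) -> None:
        """Record a charge that has been incurred."""
        now = time.time() if now is None else now
        self._events.append((now, cost_cad))
```

A client could call `can_spend()` with an estimated cost before each request and `record()` with the actual cost afterwards, pausing when the estimate approaches the limit.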
No hard rate limits
There are no hard rate limits on the number of requests per second. You pay for what you use. The network scales horizontally -- additional requests are distributed across available GPU nodes.
Request timeout
All requests have a 30-second timeout. If the hub does not receive a response from the executing node within 30 seconds, the request returns a 504 Gateway Timeout.
For long-running workloads (fine-tuning, batch processing), use the async job API instead of synchronous requests.
Retry behavior
The Ryvion API client automatically retries GET requests only, and only when the response is one of these transient errors:
| Status code | Meaning | Retried |
|---|---|---|
| 429 | Spending velocity exceeded | Yes (GET only) |
| 502 | Bad gateway | Yes (GET only) |
| 503 | Service unavailable | Yes (GET only) |
| 504 | Gateway timeout | Yes (GET only) |
POST, PUT, and DELETE requests are never retried. This prevents duplicate resource creation -- for example, creating two jobs or two API keys from a single user action.
Retries use exponential backoff: 1s, 2s, 4s (up to 3 attempts).
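The retry rules above can be sketched as a small policy for a custom client. This is an illustration, not the Ryvion client's actual implementation: `should_retry`, `backoff_delays`, and `get_with_retry` are hypothetical names, `send` stands in for whatever function performs one GET, and "up to 3 attempts" is read here as up to three retries with delays of 1s, 2s, and 4s.

```python
import time
from typing import Callable, List, Tuple

# Status codes listed as retryable in the table above.
RETRYABLE_STATUSES = frozenset({429, 502, 503, 504})
MAX_RETRIES = 3  # assumption: three retries, matching the 1s, 2s, 4s schedule


def should_retry(method: str, status: int) -> bool:
    """Only GET requests are retried, and only on transient statuses."""
    return method.upper() == "GET" and status in RETRYABLE_STATUSES


def backoff_delays(base_s: float = 1.0, retries: int = MAX_RETRIES) -> List[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base_s * 2 ** i for i in range(retries)]


def get_with_retry(
    send: Callable[[], Tuple[int, str]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Run one GET via send(), retrying transient failures with backoff.

    send is a hypothetical zero-argument callable returning (status, body).
    """
    status, body = send()
    for delay in backoff_delays():
        if not should_retry("GET", status):
            break
        sleep(delay)
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the helper testable without real delays. POST, PUT, and DELETE calls should bypass this helper entirely.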
Handling 429 responses
If you receive a 429 response, your account has hit the spending velocity limit:
{
  "error": {
    "message": "Spending velocity limit exceeded. Current limit: $5.00 CAD/hour.",
    "type": "rate_limit_error",
    "code": "spending_velocity_exceeded"
  }
}
Options:
- Wait -- the limit is measured over a rolling hourly window, so capacity frees up as earlier spending ages out
- Reduce request volume -- batch requests or reduce prompt sizes
- Request a limit increase -- contact support with your use case
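A client can tell a spending velocity 429 apart from other errors by checking the `code` field of the error body shown above. The sketch below assumes the body arrives as a JSON string in that shape; the function name `is_spending_velocity_error` is hypothetical.

```python
import json


def is_spending_velocity_error(status: int, body: str) -> bool:
    """True if the response is a 429 carrying the spending velocity error code."""
    if status != 429:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Body was not valid JSON, so the error code cannot be confirmed.
        return False
    error = payload.get("error") if isinstance(payload, dict) else None
    return isinstance(error, dict) and error.get("code") == "spending_velocity_exceeded"
```

When this returns true, pausing work or alerting an operator is usually more useful than retrying immediately, since the limit only eases as the rolling window moves forward.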
Best practices
- Implement exponential backoff for GET retries in your client
- Never retry POST, PUT, or DELETE requests automatically
- Monitor your spending in the billing dashboard
- Use streaming for chat completions to get partial results faster
- Set `max_tokens` to avoid unexpectedly long (and expensive) completions
- Cache responses when the same query will be repeated
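As an illustration of the caching advice, the sketch below keeps responses in memory, keyed by a hash of the request payload, so an identical query is only paid for once. `ResponseCache` and the `call` parameter are hypothetical; `call` stands in for whatever function actually sends the request. It assumes repeated queries should return the same answer, so it is not suitable for requests where fresh or varied output is wanted.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class ResponseCache:
    """In-memory cache keyed by a canonical hash of the request payload."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    @staticmethod
    def key(payload: Dict[str, Any]) -> str:
        # Sort keys so logically identical payloads hash to the same value.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_call(self, payload: Dict[str, Any], call: Callable[[Dict[str, Any]], Any]) -> Any:
        """Return the cached response, calling call(payload) only on a miss."""
        k = self.key(payload)
        if k not in self._store:
            self._store[k] = call(payload)
        return self._store[k]
```

Because `max_tokens` is part of the payload, changing it produces a different cache key, so a short completion is never returned for a request that asked for a longer one.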