Audio Transcription
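Transcribe audio files using OpenAI Whisper models running on distributed GPU nodes. The API uses the same request format as OpenAI's audio transcription endpoint, so existing OpenAI client code works with a changed base URL.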
Endpoint
POST /v1/audio/transcriptions
Authentication
Include your API key either as a Bearer token in the Authorization header or in an X-API-Key header. See Authentication.
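The two header styles can be sketched as follows. This is a minimal illustration rather than an official client: the helper name `auth_headers` is hypothetical, and it assumes the `X-API-Key` header carries the raw key with no prefix.

```python
def auth_headers(api_key: str, use_bearer: bool = True) -> dict[str, str]:
    """Return request headers carrying the API key.

    Hypothetical helper. Assumes X-API-Key takes the raw key value.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if use_bearer:
        # Bearer token form, as used in the curl example
        return {"Authorization": f"Bearer {api_key}"}
    # Alternative header form
    return {"X-API-Key": api_key}


print(auth_headers("YOUR_API_KEY"))
print(auth_headers("YOUR_API_KEY", use_bearer=False))
```

Send only one of the two headers per request.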
Request body (multipart/form-data)
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | Audio file to transcribe |
| model | string | Yes | Model ID: whisper-base |
| language | string | No | Language code (e.g., en). Auto-detected if omitted. |
| response_format | string | No | Output format: json, text, verbose_json |
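As a rough sketch of how the optional parameters combine, the helper below assembles the non-file form fields and rejects an unknown `response_format` before any request is sent. The function name `build_form_fields` is hypothetical, and the client-side validation is an assumption about what is convenient, not a description of server behavior.

```python
# Output formats listed in the parameter table
ALLOWED_RESPONSE_FORMATS = {"json", "text", "verbose_json"}


def build_form_fields(model, language=None, response_format=None):
    """Build the non-file multipart fields, omitting unset optional ones."""
    if response_format is not None and response_format not in ALLOWED_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    fields = {"model": model}
    if language is not None:
        # Leaving language out lets the service auto-detect it
        fields["language"] = language
    if response_format is not None:
        fields["response_format"] = response_format
    return fields


print(build_form_fields("whisper-base", language="en", response_format="verbose_json"))
```

The resulting dictionary can be passed as the form fields of a multipart request, alongside the `file` part.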
Supported audio formats
- mp3
- wav
- m4a
- flac
- webm
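A client can check the file extension against this list before uploading. The sketch below is only a convenience check under the assumption that the extension reflects the actual format; the helper name `is_supported_audio` is hypothetical, and the service may validate file contents differently.

```python
from pathlib import Path

# Extensions corresponding to the supported formats listed above
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a supported audio format."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ("recording.mp3", "interview.FLAC", "notes.txt"):
    print(name, is_supported_audio(name))
```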
Example
Python
from openai import OpenAI

# Point the OpenAI client at the Ryvion API
client = OpenAI(
    base_url="https://api.ryvion.ai/v1",
    api_key="YOUR_API_KEY",
)

# Open the audio file in binary mode and submit it for transcription
with open("recording.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-base",
        file=audio_file,
    )

print(transcript.text)
curl
curl -X POST https://api.ryvion.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@recording.mp3" \
  -F "model=whisper-base"
Response format
{
  "text": "Hello, this is a transcription of the audio file."
}
With response_format=verbose_json:
{
  "text": "Hello, this is a transcription of the audio file.",
  "language": "en",
  "duration": 12.5,
  "segments": [
    {
      "start": 0.0,
      "end": 3.2,
      "text": "Hello, this is a transcription"
    },
    {
      "start": 3.2,
      "end": 5.1,
      "text": "of the audio file."
    }
  ]
}
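To illustrate how the segment timestamps might be consumed, the sketch below turns a verbose_json response into one timestamped line per segment. The helper names `format_timestamp` and `segment_lines` are hypothetical, and the code assumes `start` and `end` are expressed in seconds, as the sample values suggest.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rem_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rem_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(response: dict) -> list[str]:
    """Return one '[start - end] text' line per segment in the response."""
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] {s['text'].strip()}"
        for s in response.get("segments", [])
    ]


# Sample payload copied from the verbose_json response shown above
sample = json.loads("""
{
  "text": "Hello, this is a transcription of the audio file.",
  "language": "en",
  "duration": 12.5,
  "segments": [
    {"start": 0.0, "end": 3.2, "text": "Hello, this is a transcription"},
    {"start": 3.2, "end": 5.1, "text": "of the audio file."}
  ]
}
""")

for line in segment_lines(sample):
    print(line)
```

Using `response.get("segments", [])` keeps the helper safe for plain json responses, which carry only the `text` field.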
Available models
| Model | Description |
|---|---|
| whisper-base | OpenAI Whisper base model |
Pricing
$0.006 CAD per minute of audio.
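For budgeting, the per-minute rate can be turned into a rough estimate. The sketch below assumes billing is proportional to audio duration with no rounding or minimum charge, which this page does not confirm; the helper name `estimate_cost_cad` is hypothetical.

```python
from decimal import Decimal

# Rate stated above: $0.006 CAD per minute of audio
PRICE_PER_MINUTE_CAD = Decimal("0.006")


def estimate_cost_cad(duration_seconds: float) -> Decimal:
    """Estimate the cost in CAD, assuming purely proportional billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Multiply before dividing so common durations stay exact in Decimal
    return Decimal(str(duration_seconds)) * PRICE_PER_MINUTE_CAD / Decimal(60)


# A 10-minute recording (600 seconds)
print(estimate_cost_cad(600))
```

`Decimal` is used instead of `float` to avoid binary rounding noise in currency amounts.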
Features
- Automatic language detection
- Timestamp-level segmentation (with verbose_json)
- Supports multiple audio formats
- Each transcription produces a cryptographic receipt