
Product

Verified inference on a distributed GPU network

OpenAI-compatible API backed by real GPU nodes. Every response includes a signed receipt — cryptographic proof of what model ran, on which node.

Capabilities

Everything you need to run inference

Drop-in API compatibility, streaming, verification, and jurisdiction routing.

OpenAI drop-in

Chat completions, embeddings, image generation, and audio transcription. Same endpoints, same format. Change one line.
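As a sketch of the drop-in format (the endpoint path follows the OpenAI chat completions spec; the base URL below is a placeholder, not the network's real endpoint):

```python
import json

# Placeholder base URL -- substitute the network's real endpoint.
BASE_URL = "https://hub.example.net/v1"

def chat_payload(model: str, prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-format chat completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

body = chat_payload("phi-4", "Explain signed receipts in one sentence.")
# POST this as JSON to f"{BASE_URL}/chat/completions" with your API key
# in the Authorization header. With the official openai SDK, the "one
# line" you change is base_url= when constructing the client.
print(json.dumps(body, indent=2))
```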

Streaming & non-streaming

Real-time SSE streaming with node assignment events, or synchronous responses. Your choice per request.
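Mechanically, an SSE stream delivers JSON chunks on `data:` lines. A minimal sketch of consuming them, using illustrative sample lines in the OpenAI streaming chunk format (the node assignment event mentioned above would arrive as a separate, provider-specific event):

```python
import json

# Illustrative SSE lines in the OpenAI streaming chunk format.
sse_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]

def collect_text(lines):
    """Concatenate delta content from OpenAI-format SSE chat chunks."""
    out = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comment lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        out.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(out)

print(collect_text(sse_lines))  # Hello!
```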

Verified inference

Every response includes a signed receipt — cryptographic proof of model, node, and result. No other inference API does this.

Distributed GPU network

Independent operators contribute GPU capacity. More operators = more capacity = lower prices.

Usage dashboard

See your spending, request history, and model usage in one workspace. API keys with scoped permissions.

Jurisdiction routing

Route inference to nodes in specific countries. Data sovereignty built into the request, not bolted on.
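A sketch of what "built into the request" could look like. The `region` field name and country-code values here are assumptions for illustration, not the API's documented parameter:

```python
def routed_payload(payload: dict, country: str) -> dict:
    """Attach a routing constraint to a request body.

    The "region" field name is a hypothetical illustration -- check the
    API docs for the real jurisdiction-routing parameter.
    """
    return {**payload, "region": country}

base = {"model": "phi-4", "messages": [{"role": "user", "content": "hi"}]}
pinned = routed_payload(base, "DE")  # constrain inference to nodes in Germany
print(pinned)
```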

How it works

From request to verified response

Your request enters the network, a GPU node runs inference, and you get a response with a signed receipt.

1

Send a request

Use the OpenAI SDK or curl. Chat, images, audio, embeddings — same OpenAI-compatible format.

2

Node picks up the job

The hub routes your request to a GPU node that has the model loaded and meets your requirements.

3

Inference runs on real hardware

LLMs via llama.cpp, images via Stable Diffusion, audio via Whisper — all running natively on GPU nodes.

4

Response with receipt

You get the model response plus a signed receipt: node ID, GPU model, timestamps, and Ed25519 signature.
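What checking such a receipt could look like, as a self-contained sketch: the receipt field names and the canonical-JSON signing convention below are assumptions, and the keypair is generated locally to stand in for a node's published key. Only the Ed25519 verify step itself is standard (here via the `cryptography` package):

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Stand-in keypair; in production the node publishes its public key.
node_key = Ed25519PrivateKey.generate()
node_pub = node_key.public_key()

# Hypothetical receipt fields -- the real schema comes from the API docs.
receipt = {
    "node_id": "node-123",
    "gpu": "RX 7900 XTX",
    "started_at": 1700000000,
    "finished_at": 1700000002,
}
# Assumed canonicalization: sorted-key, compact JSON.
payload = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode()
signature = node_key.sign(payload)

def receipt_is_valid(pub, payload: bytes, sig: bytes) -> bool:
    """Return True iff sig is a valid Ed25519 signature over payload."""
    try:
        pub.verify(sig, payload)
        return True
    except InvalidSignature:
        return False

print(receipt_is_valid(node_pub, payload, signature))         # genuine receipt
print(receipt_is_valid(node_pub, payload + b"x", signature))  # any tampering fails
```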

Example: streaming response

Request

Model: phi-4
Stream: true
Max tokens: 2048

Node assignment

GPU: Radeon RX 7900 XTX (24 GB)

Region: North America

Latency: ~2s first token

Receipt

Node signature — Ed25519 verified

Processing time — 2.3s

Roles

One network, two perspectives

Developers consume inference. Operators provide GPU capacity and earn from completions.

Developers

Use the OpenAI SDK with one line changed. Get verified inference with signed receipts, per-token pricing, and scoped API keys.

Node operators

Bring GPU hardware online, load models, and earn from every verified inference completion backed by signed receipts.