API Documentation

OpenAI-compatible API with optional Australian data residency

Authentication

All API requests require a Bearer token in the Authorization header:

Authorization: Bearer <your-api-key>

Generate API keys from your dashboard. Keys are shown only once at creation — store them securely.
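As a minimal sketch, the header can be attached with nothing but the standard library; YOUR_API_KEY is a placeholder for a key from your dashboard:

```python
# Sketch: attach the Bearer token to every request using only the
# standard library. YOUR_API_KEY is a placeholder value.
import urllib.request

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.au.yarn.prosodylabs.com.au"

def build_request(path: str) -> urllib.request.Request:
    """Build a request with the Authorization header set."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {API_KEY}"},
    )

req = build_request("/v1/models")
print(req.get_header("Authorization"))  # Bearer YOUR_API_KEY
```

The OpenAI SDKs shown under Code Examples handle this header for you via the api_key parameter.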

Base URL

https://api.au.yarn.prosodylabs.com.au

Endpoints

Chat Completions

POST /v1/chat/completions

Send a list of messages and receive a model-generated response. Supports streaming via "stream": true.

Request body
{
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "messages": [
    {"role": "user", "content": "What is data sovereignty?"}
  ],
  "stream": false
}

Response follows the standard OpenAI chat completion format.
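For reference, a non-streaming response looks like this (field values are illustrative):

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Data sovereignty is..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46}
}
```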

List Models

GET /v1/models

Returns a list of available models with sovereignty tier, capabilities, and context window for each.

Embeddings

POST /v1/embeddings

Generate vector embeddings for text input. Requests route to a sovereign vLLM backend or to OpenAI, depending on the model you specify.

Request body
{
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "input": "What is data sovereignty?"
}
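Embedding vectors are typically compared with cosine similarity. The network call below is commented out and assumes the client setup shown under Code Examples; the similarity helper itself is plain stdlib:

```python
# Sketch: compare two embedding vectors with cosine similarity.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# With a configured client (see Code Examples):
# resp = client.embeddings.create(
#     model="Qwen/Qwen2.5-3B-Instruct",
#     input=["data sovereignty", "data residency"],
# )
# vec_a, vec_b = resp.data[0].embedding, resp.data[1].embedding
# print(cosine_similarity(vec_a, vec_b))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```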

Function Calling

Pass tools and tool_choice in your chat completion request. The model can choose to call one of your functions and return structured arguments for it.

Example with tools
{
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "messages": [
    {"role": "user", "content": "What's the weather in Perth?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}

When the model invokes a function, the response message includes a tool_calls array and the choice's finish_reason is "tool_calls".
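The second half of the loop is on your side: read tool_calls, run the matching local function, and send the result back as a "tool" message. A sketch, with get_weather as a stand-in implementation:

```python
# Sketch: execute the model's tool_calls and build the follow-up
# "tool" role messages. get_weather is a placeholder implementation.
import json

def get_weather(location: str) -> str:
    return f"Sunny in {location}"  # placeholder; call a real weather API here

TOOLS = {"get_weather": get_weather}

def run_tool_calls(message: dict) -> list[dict]:
    """Turn an assistant tool_calls message into 'tool' messages to send back."""
    results = []
    for call in message.get("tool_calls", []):
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })
    return results

# Illustrative assistant message (finish_reason "tool_calls"):
assistant_msg = {
    "tool_calls": [{
        "id": "call_1",
        "function": {"name": "get_weather", "arguments": '{"location": "Perth"}'},
    }]
}
print(run_tool_calls(assistant_msg))
```

Append these "tool" messages (plus the assistant message that requested them) to the conversation and call the endpoint again to get the model's final answer.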

Models

Yarn provides two categories of models, each with different data handling:

Sovereign models

Open-weight models running on Australian hardware. Your prompts and responses are processed entirely within Australia and never leave the country. Examples: Qwen 2.5, Mistral 7B.

Global models

Frontier models accessed via overseas providers (OpenAI, Anthropic). Your prompts are sent to the provider's servers outside Australia. Examples: GPT-4o, Claude.

Each model response includes a sovereignty field indicating whether the model is sovereign or global.

Code Examples

curl

curl -X POST https://api.au.yarn.prosodylabs.com.au/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-3B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.au.yarn.prosodylabs.com.au/v1"
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-3B-Instruct",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.au.yarn.prosodylabs.com.au/v1',
});

const response = await client.chat.completions.create({
  model: 'Qwen/Qwen2.5-3B-Instruct',
  messages: [{ role: 'user', content: 'Hello' }],
});

Streaming (Python)

stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-3B-Instruct",
    messages=[{"role": "user", "content": "Explain data sovereignty"}],
    stream=True
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Streaming uses Server-Sent Events (SSE). Each chunk is prefixed with data: and the stream ends with data: [DONE].

Python SDK

The yarn-au package provides a CLI and client library for Yarn.

pip install yarn-au

Authenticate and interact from the command line:

# Authenticate (opens browser for login)
yarn auth login

# List available models
yarn models

# List available GPUs with pricing
yarn gpus

# Submit a training job
yarn job submit --gpu rtx-4090 --entrypoint train.py

The SDK is also compatible with the OpenAI Python client. See the Research Portal docs for full SDK documentation.

Rate Limits & Quotas

API requests are subject to per-key quota limits based on your plan. When your quota is exceeded, the API returns 429 Too Many Requests with the following headers:

  • X-RateLimit-Limit — your quota limit
  • X-RateLimit-Remaining — remaining quota
  • X-RateLimit-Reset — when the quota resets

Manage your quota and add funds from your spending dashboard. For higher limits, contact jordan@prosodylabs.com.au.
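A 429-aware retry loop is a common pattern for handling these limits. In this sketch, send() stands in for any HTTP call, and treating X-RateLimit-Reset as a sleepable delay is an assumption; check the actual header semantics before relying on it:

```python
# Sketch: retry on 429 Too Many Requests with backoff. send() is any
# callable returning (status, headers, body). Interpreting
# X-RateLimit-Reset as a delay in seconds is an assumption.
import time

def with_retries(send, max_attempts: int = 3):
    """Call send() until it succeeds or attempts run out, backing off on 429."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        # Prefer the server's reset hint; fall back to exponential backoff.
        delay = float(headers.get("X-RateLimit-Reset", 2 ** attempt))
        time.sleep(delay)
    raise RuntimeError("quota still exceeded after retries")

# Stub that fails once with 429, then succeeds:
calls = iter([
    (429, {"X-RateLimit-Reset": "0"}, ""),
    (200, {}, "ok"),
])
print(with_retries(lambda: next(calls)))  # (200, 'ok')
```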

Data Sovereignty

Yarn is built for Australian data residency. When you use a sovereign model, your data is processed on GPU hardware in Perth, Western Australia. It never leaves Australian infrastructure.

When you use a global model (GPT-4o, Claude, etc.), your prompt is proxied to the provider's API servers, which are located outside Australia. Responses are returned to you via Yarn but the prompt content transits through the provider's infrastructure.

Check the sovereignty field in the model response or the GET /v1/models endpoint before sending sensitive data. For full details, see our Privacy Policy.
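That check can be scripted against GET /v1/models. The payload shape below is an assumption based on the fields this page lists (a model id plus a sovereignty tier):

```python
# Sketch: filter the models list down to sovereign models before
# routing sensitive data. The payload shape is an assumption based on
# the fields described on this page.
def sovereign_ids(models_payload: dict) -> list[str]:
    """Return the ids of models whose sovereignty tier is 'sovereign'."""
    return [
        m["id"]
        for m in models_payload.get("data", [])
        if m.get("sovereignty") == "sovereign"
    ]

# Illustrative payload in the OpenAI list-models shape:
sample = {
    "object": "list",
    "data": [
        {"id": "Qwen/Qwen2.5-3B-Instruct", "sovereignty": "sovereign"},
        {"id": "gpt-4o", "sovereignty": "global"},
    ],
}
print(sovereign_ids(sample))  # ['Qwen/Qwen2.5-3B-Instruct']
```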