OpenAI-compatible API with optional Australian data residency
All API requests require a Bearer token in the Authorization header:
```
Authorization: Bearer <your-api-key>
```

Generate API keys from your dashboard. Keys are shown only once at creation — store them securely.
Base URL:

```
https://api.au.yarn.prosodylabs.com.au
```

POST /v1/chat/completions
Send a list of messages and receive a model-generated response. Supports streaming via "stream": true.
```json
{
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "messages": [
    {"role": "user", "content": "What is data sovereignty?"}
  ],
  "stream": false
}
```

The response follows the standard OpenAI chat completion format.
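For reference, an abridged response might look like the following. The field values are illustrative; the sovereignty field reflects the data-residency tiers described later in this document, and the remaining fields follow the standard OpenAI schema:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "sovereignty": "sovereign",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Data sovereignty is..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 42, "total_tokens": 54}
}
```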
GET /v1/models
Returns a list of available models with sovereignty tier, capabilities, and context window for each.
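As a sketch of consuming this endpoint with only the standard library, you could fetch the list and filter for sovereign models. The sovereignty field name follows the tier described in this document; treat the exact shape as an assumption and confirm it against your own responses:

```python
import json
import urllib.request

BASE_URL = "https://api.au.yarn.prosodylabs.com.au/v1"

def list_models(api_key):
    """Fetch the model list from GET /v1/models."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        # Standard OpenAI-style list responses carry entries under "data".
        return json.load(resp)["data"]

def sovereign_ids(models):
    """Return the IDs of models whose sovereignty tier is 'sovereign'."""
    return [m["id"] for m in models if m.get("sovereignty") == "sovereign"]

# sovereign_ids(list_models("YOUR_API_KEY"))
```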
POST /v1/embeddings
Generate vector embeddings for text input. Routes to vLLM (sovereign) or OpenAI depending on the model.
```json
{
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "input": "What is data sovereignty?"
}
```

Tool calling

Pass tools and tool_choice in your chat completion request. The model can invoke functions and return structured arguments.
```json
{
  "model": "Qwen/Qwen2.5-3B-Instruct",
  "messages": [
    {"role": "user", "content": "What's the weather in Perth?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
```

When the model invokes a function, the response includes a tool_calls array with finish_reason set to "tool_calls".
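A minimal sketch of handling that response, assuming the standard OpenAI tool_calls shape (the get_weather stub is hypothetical, standing in for your own function):

```python
import json

def get_weather(location):
    # Hypothetical stub; a real implementation would call a weather service.
    return {"location": location, "forecast": "sunny"}

TOOLS = {"get_weather": get_weather}

def run_tool_calls(message):
    """Execute each tool call in an assistant message and build the
    'tool' role messages to send back on the next request."""
    results = []
    for call in message.get("tool_calls", []):
        fn = TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results
```

Append these tool messages to the conversation (after the assistant message that requested them) and call the endpoint again to get the model's final answer.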
Yarn provides two categories of model, each with different data handling:
Sovereign models: Open-weight models running on Australian hardware. Your prompts and responses are processed entirely within Australia and never leave the country. Examples: Qwen 2.5, Mistral 7B.
Global models: Frontier models accessed via overseas providers (OpenAI, Anthropic). Your prompts are sent to the provider's servers outside Australia. Examples: GPT-4o, Claude.
Each model response includes a sovereignty field indicating whether the model is sovereign or global.
```bash
curl -X POST https://api.au.yarn.prosodylabs.com.au/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-3B-Instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.au.yarn.prosodylabs.com.au/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-3B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.au.yarn.prosodylabs.com.au/v1',
});

const response = await client.chat.completions.create({
  model: 'Qwen/Qwen2.5-3B-Instruct',
  messages: [{ role: 'user', content: 'Hello' }],
});
```

Streaming

```python
stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-3B-Instruct",
    messages=[{"role": "user", "content": "Explain data sovereignty"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```

Streaming uses Server-Sent Events (SSE). Each chunk is prefixed with data: and the stream ends with data: [DONE].
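If you consume the raw SSE stream without a client library, a parsing sketch (assuming the standard OpenAI streaming chunk shape) looks like this:

```python
import json

def iter_sse_content(lines):
    """Yield content deltas from the raw SSE lines of a streaming response.
    Each event line looks like 'data: {...}'; 'data: [DONE]' ends the stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```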
The yarn-au package provides a CLI and client library for Yarn.
```bash
pip install yarn-au
```

Authenticate and interact from the command line:

```bash
# Authenticate (opens browser for login)
yarn auth login

# List available models
yarn models

# List available GPUs with pricing
yarn gpus

# Submit a training job
yarn job submit --gpu rtx-4090 --entrypoint train.py
```

The SDK is also compatible with the OpenAI Python client. See the Research Portal docs for full SDK documentation.
API requests are subject to per-key quota limits based on your plan. When your quota is exceeded, the API returns 429 Too Many Requests with the following headers:
- X-RateLimit-Limit — your quota limit
- X-RateLimit-Remaining — remaining quota
- X-RateLimit-Reset — when the quota resets

Manage your quota and add funds from your spending dashboard. For higher limits, contact jordan@prosodylabs.com.au.
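A small sketch of backing off on a 429. The exact format of X-RateLimit-Reset is not specified here; this helper assumes a Unix timestamp, so verify against your own responses before relying on it:

```python
import time

def retry_after_seconds(headers, default=1.0):
    """Seconds to wait before retrying after a 429 response.

    Assumes X-RateLimit-Reset is a Unix timestamp (an assumption,
    not confirmed by this reference)."""
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return default
    return max(float(reset) - time.time(), default)

# Usage sketch:
# if resp.status_code == 429:
#     time.sleep(retry_after_seconds(resp.headers))
#     ...retry the request...
```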
Yarn is built for Australian data residency. When you use a sovereign model, your data is processed on GPU hardware in Perth, Western Australia. It never leaves Australian infrastructure.
When you use a global model (GPT-4o, Claude, etc.), your prompt is proxied to the provider's API servers, which are located outside Australia. Responses are returned to you via Yarn but the prompt content transits through the provider's infrastructure.
Check the sovereignty field in the model response or the GET /v1/models endpoint before sending sensitive data. For full details, see our Privacy Policy.
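One way to enforce that check in code is a guard that refuses non-sovereign models. The sovereignty field name follows this document's description of model entries; treat the exact shape as an assumption:

```python
def assert_sovereign(model_entry):
    """Raise unless the model keeps data within Australia."""
    if model_entry.get("sovereignty") != "sovereign":
        raise ValueError(
            f"{model_entry.get('id')} routes prompts offshore; "
            "choose a sovereign model for sensitive data."
        )
```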