Integration Guide
Krato sits between your app and LLM providers, enforcing per-user token budgets in real time.
Architecture
Your App → Krato SDK → Krato Server → LLM Provider
                ↕             ↕
          Budget Check   PostgreSQL + Redis

- Before each LLM call, the SDK checks the user's budget
- If the budget allows, the call proceeds to the LLM provider
- After the call, the SDK reports token usage
- The dashboard shows real-time usage and budget status
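The check-call-report loop above can be sketched as a minimal mock in Python. None of these names are real Krato APIs: `check_budget` and `report_usage` stand in for the internal check and report endpoints, and `fake_provider_call` stands in for the LLM provider.

```python
# A minimal mock of the SDK's request sequence: check, call, report.
# check_budget / report_usage / fake_provider_call are illustrative
# stand-ins, not the real SDK internals.

REPORTED = []  # usage reports the SDK would send after each call

def check_budget(user_id: str, model: str) -> str:
    # The server replies "normal", "warning", or "rejected".
    return "normal"

def report_usage(user_id: str, model: str,
                 input_tokens: int, output_tokens: int) -> None:
    REPORTED.append((user_id, model, input_tokens + output_tokens))

def fake_provider_call(messages):
    # Stand-in for the provider; pretend the call used 150 + 80 tokens.
    return {"content": "Hello!", "input_tokens": 150, "output_tokens": 80}

def chat(user_id: str, model: str, messages):
    status = check_budget(user_id, model)      # 1. budget check first
    if status == "rejected":
        raise RuntimeError("budget exceeded")  # blocked before the provider is called
    result = fake_provider_call(messages)      # 2. call proceeds
    report_usage(user_id, model,               # 3. usage reported afterwards
                 result["input_tokens"], result["output_tokens"])
    return result
```

The important property is the ordering: a rejected budget stops the request before any provider tokens are spent.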
Quick Start
1. Get your Project Key
Sign up at the Krato dashboard, go to Settings, and copy your Project Key.
2. Install the SDK
TypeScript
```shell
npm install krato
```

Python

```shell
pip install krato[openai]
```

Go

```shell
go get github.com/kratosdk/krato-go
```

3. Integrate (TypeScript)
```typescript
import { Krato } from "krato";

const krato = new Krato({
  projectKey: process.env.KRATO_PROJECT_KEY!,
  provider: "openai",
  apiKey: process.env.OPENAI_API_KEY!,
});

// Replace openai.chat.completions.create() with krato.chat()
const { result, budgetStatus, usage } = await krato.chat(
  "user_123",
  "gpt-4o",
  [{ role: "user", content: "Hello!" }],
);
```

4. Integrate (Python)
```python
import os

from krato import Krato

krato = Krato(
    project_key=os.environ["KRATO_PROJECT_KEY"],
    provider="openai",
    api_key=os.environ["OPENAI_API_KEY"],
)

messages = [{"role": "user", "content": "Hello!"}]
response = krato.chat("user_123", "gpt-4o", messages)
```

5. Set Budgets
In the dashboard → Users & Budgets, set Limit (soft) and Cap (elastic buffer). Or via the API:

```shell
curl -X PUT http://localhost:8080/api/v1/users/user_123/budget \
  -H "Authorization: Bearer krato_your_key" \
  -H "Content-Type: application/json" \
  -d '{"limit": 100000, "cap": 20000}'
```

API Reference
All endpoints require `Authorization: Bearer <project_key>`.
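As a sketch, any HTTP client can talk to these endpoints; the helper below builds the authenticated Set Budget request with Python's standard `urllib`. The base URL and `krato_your_key` are placeholders from the Quick Start, not fixed values, and `build_set_budget_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"   # placeholder from the Quick Start
PROJECT_KEY = "krato_your_key"       # placeholder project key

def build_set_budget_request(user_id: str, limit: int, cap: int) -> urllib.request.Request:
    """Build the authenticated PUT request; send it with urllib.request.urlopen()."""
    body = json.dumps({"limit": limit, "cap": cap}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/users/{user_id}/budget",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {PROJECT_KEY}",  # required on every endpoint
            "Content-Type": "application/json",
        },
    )

req = build_set_budget_request("user_123", 100_000, 20_000)
```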
Budget Check (SDK internal)
POST /api/v1/internal/check

Request

```json
{
  "user_id": "user_123",
  "provider": "openai",
  "model": "gpt-4o"
}
```

Response

```json
{
  "status": "normal",
  "used": 45000,
  "limit": 100000,
  "cap": 20000
}
```

Usage Report (SDK internal)
POST /api/v1/internal/report

```json
{
  "user_id": "user_123",
  "provider": "openai",
  "model": "gpt-4o",
  "input_tokens": 150,
  "output_tokens": 80
}
```

Set Budget

PUT /api/v1/users/{userID}/budget

```json
{"limit": 100000, "cap": 20000}
```

Get Budget
GET /api/v1/users/{userID}/budget

```json
{
  "user_id": "user_123",
  "limit_tokens": 100000,
  "cap_tokens": 20000,
  "period": "none",
  "used": 45000
}
```

Delete Budget

DELETE /api/v1/users/{userID}/budget
Get User Usage
GET /api/v1/users/{userID}/usage
Query params: from, to (ISO timestamps), group_by (model)
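For example, a usage query over a date range, grouped by model, could be composed like this (a sketch; the parameter names come from the list above, the timestamps are arbitrary):

```python
from urllib.parse import urlencode

# Query params from the Get User Usage endpoint above.
params = {
    "from": "2024-01-01T00:00:00Z",   # arbitrary example range
    "to": "2024-01-31T23:59:59Z",
    "group_by": "model",
}
url = f"http://localhost:8080/api/v1/users/user_123/usage?{urlencode(params)}"
```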
Reset User Usage
POST /api/v1/users/{userID}/usage/reset
Usage Summary
GET /api/v1/usage/summary

```json
{
  "total_tokens": 1250000,
  "input_tokens": 500000,
  "output_tokens": 750000,
  "requests": 3420,
  "top_users": [
    { "user_id": "user_123", "total_tokens": 89000, "requests": 210 }
  ]
}
```

Health Check

GET /health
Returns `{"status": "ok"}`; no auth required.
Budget Enforcement Logic
| Condition | Status | Action |
|---|---|---|
| used < limit | normal | Allow |
| limit ≤ used < limit + cap | warning | Allow with warning |
| used ≥ limit + cap | rejected | Block request |
When rejected, the SDK throws KratoBudgetExceededError (TS/Python) or returns BudgetExceededError (Go) before calling the LLM.
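The table translates directly into a small decision function. A sketch (the thresholds are from the table; the Quick Start numbers, limit 100000 and cap 20000, are used in the comments):

```python
def budget_status(used: int, limit: int, cap: int) -> str:
    """Map current usage to the status the budget check returns."""
    if used < limit:
        return "normal"      # under the soft limit: allow
    if used < limit + cap:
        return "warning"     # inside the elastic buffer: allow with warning
    return "rejected"        # buffer exhausted: block the request
```

With limit=100000 and cap=20000, usage of 45000 is normal, 110000 is a warning, and anything from 120000 up is rejected.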
Streaming
All SDKs support streaming. Budget is checked once before the stream starts — Krato never cuts off a stream mid-response.
TypeScript
```typescript
const stream = await krato.chatStream("user_123", "gpt-4o", messages);
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
console.log(stream.usage); // available after the stream ends
```

Python

```python
stream = krato.chat_stream("user_123", "gpt-4o", messages)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
print(stream.usage)  # available after the stream ends
```

Go

```go
stream, err := client.ChatStream("user_123", "gpt-4o", messages, nil)
if err != nil {
    return err
}
for {
    chunk, ok := stream.Next()
    if !ok {
        break
    }
    fmt.Print(chunk)
}
usage, _ := stream.Usage() // available after the stream ends
```

Error Handling
| Error | When | Behavior |
|---|---|---|
| Server unreachable | Network issue | Fail-open: LLM call proceeds |
| Budget exceeded | used ≥ limit + cap | Throws before calling LLM |
| Invalid project key | Wrong/missing key | 401 from server |
| Provider error | Bad API key, rate limit, etc. | Error propagated as-is |
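The fail-open row is the one worth internalizing: if the Krato server is unreachable, the SDKs let the LLM call through rather than take your app down. A sketch of that behavior (hypothetical names; `ConnectionError` stands in for any network failure during the budget check):

```python
def checked_call(check_budget, provider_call):
    """Fail-open: a network error during the budget check never blocks the LLM call."""
    try:
        status = check_budget()
    except ConnectionError:
        status = "unknown"   # server unreachable: proceed anyway
    if status == "rejected":
        raise RuntimeError("budget exceeded")  # only an explicit rejection blocks
    return provider_call()
```

Only an explicit `rejected` from the server blocks the request; every failure mode on Krato's side degrades to "call the provider as if Krato were not there".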