First-party SDKs for Python and TypeScript, a CLI that pipes into any workflow, a VS Code extension for inline testing, and full OpenAI compatibility so you can switch in two lines.
The Alveare Python SDK is designed for production use in backend services, data pipelines, and ML workflows. It provides typed responses, automatic retries with exponential backoff, connection pooling, and both synchronous and asynchronous interfaces. Install it with pip and start making inference requests in three lines of code.
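The automatic retry behavior described above can be pictured roughly as follows. This is an illustrative sketch of an exponential-backoff policy, not the SDK's actual internals; the `retry_with_backoff` helper and its parameters are hypothetical.

```python
import random
import time


def retry_with_backoff(call, max_attempts=4, base_delay=0.5, max_delay=8.0):
    """Retry a callable on transient errors with exponential backoff and jitter.

    Illustrative sketch of the kind of retry policy the SDK applies
    automatically; the real implementation details may differ.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the delay each attempt, capped, with random jitter
            # so concurrent clients do not retry in lockstep.
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay * random.uniform(0.5, 1.0))


attempts = []

def flaky():
    """Simulated request that fails twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient network failure")
    return "ok"

print(retry_with_backoff(flaky, base_delay=0.05))  # succeeds on the third attempt
```

The jitter term is a common refinement: without it, many clients that fail at the same moment would all retry at the same moment too.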
For high-throughput services running on asyncio, FastAPI, or similar frameworks, the async client
avoids blocking your event loop. It uses httpx under the hood and supports the same
interface as the synchronous client.
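The payoff of the async client is easiest to see with a stub. The `infer` coroutine below is a stand-in that simulates network latency; only the concurrency pattern is the point — ten awaited calls complete in roughly the time of one, because none of them blocks the event loop.

```python
import asyncio


async def infer(prompt: str) -> str:
    """Stand-in for an async SDK call; simulates network latency without blocking."""
    await asyncio.sleep(0.1)
    return f"result for {prompt!r}"


async def main() -> list:
    # Each call awaits instead of blocking, so the event loop interleaves
    # all ten requests concurrently.
    return await asyncio.gather(*(infer(f"prompt {i}") for i in range(10)))


results = asyncio.run(main())
print(len(results))  # 10
```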
Every response is a typed InferResult object with .text, .tokens_used, .latency_ms, .specialist, and .cached fields.
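As a mental model, the response shape described above is roughly the dataclass below. This is an illustration, not the SDK's actual class definition, and the field types are assumptions.

```python
from dataclasses import dataclass


@dataclass
class InferResult:
    """Illustrative model of the response fields; types are assumptions."""
    text: str          # the generated output
    tokens_used: int   # tokens consumed by the request
    latency_ms: float  # end-to-end latency in milliseconds
    specialist: str    # which specialist handled the request
    cached: bool       # whether the response was served from cache


result = InferResult(
    text="Hello!",
    tokens_used=12,
    latency_ms=48.3,
    specialist="support-triage",
    cached=False,
)
print(result.text, result.cached)
```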
Errors raise typed exceptions: AlveareAuthError, AlveareRateLimitError, and AlveareValidationError.
Rate limit errors include a .retry_after field in seconds.
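A sketch of how typed exceptions with a .retry_after field let callers handle rate limits cleanly. The class bodies and the `handle` helper here are illustrative, not the SDK's actual hierarchy.

```python
class AlveareError(Exception):
    """Illustrative base class; the SDK's actual hierarchy may differ."""


class AlveareRateLimitError(AlveareError):
    def __init__(self, message: str, retry_after: float):
        super().__init__(message)
        self.retry_after = retry_after  # seconds to wait before retrying


def handle(call):
    try:
        return call()
    except AlveareRateLimitError as err:
        # A caller can sleep for err.retry_after seconds and retry,
        # instead of parsing error strings.
        return f"rate limited, retry in {err.retry_after}s"


def limited():
    raise AlveareRateLimitError("too many requests", retry_after=2.5)


print(handle(limited))
```

Because the exception is typed, the except clause narrows precisely to rate-limit failures; auth and validation errors propagate unchanged.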
Streaming is supported via infer_stream(), which returns an iterator of chunks.

The TypeScript SDK provides full type safety, native Promise support, and works across Node.js 18+, Deno, and Bun. It ships as both ESM and CommonJS, is tree-shakeable, and adds less than 15 KB to your bundle. Every method returns strongly typed responses so your IDE catches errors before runtime.
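The Python client's infer_stream() — an iterator of chunks — can be pictured as a plain generator. The chunking below is a simulation for illustration, not the SDK's wire format.

```python
from typing import Iterator


def infer_stream(prompt: str) -> Iterator[str]:
    """Simulated streaming interface: yields response text chunk by chunk."""
    response = f"Echoing: {prompt}"
    for i in range(0, len(response), 8):
        yield response[i:i + 8]


# Consume chunks as they arrive instead of waiting for the full response.
chunks = list(infer_stream("hello stream"))
print("".join(chunks))
```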
Every request and response is fully typed. The SDK exports interfaces for InferRequest, InferResult, Specialist, UsageStats, and WebhookEvent.
Discriminated union types for error responses mean your catch
blocks can narrow by error type without casting.
The Alveare CLI is a single binary that lets you manage specialists, run inference, monitor usage, and automate batch jobs from your terminal. It is built for engineers who live in the terminal and want to pipe any file through Alveare without writing code.
The Alveare VS Code extension brings inference directly into your editor. Select text, right-click, and send it to any specialist. The response appears in a side panel with latency, token count, and cached status. No context switching. No terminal required.
Select any block of text in your editor, right-click, and choose "Alveare: Summarise". The summary appears in a side panel in under 300ms. Works with any file type.
Highlight a support ticket, log entry, or any text. Right-click and classify it instantly. The label appears as an inline decoration next to your selection.
View all your configured specialists in a sidebar tree view. Edit system prompts, temperature, and max tokens directly from VS Code. Changes deploy to your hive immediately.
A dedicated panel for testing prompts against different specialists and comparing outputs. Adjust parameters in real time and see how they affect quality and latency.
A persistent status bar item shows your current request count and plan allocation. Click it to see per-specialist breakdowns, average latency, and cache hit rates.
Compare outputs from different specialists or different parameter configurations side by side. Essential for prompt engineering and output quality validation.
Install from the VS Code Marketplace: search for "Alveare" or run
ext install alveare.alveare-vscode
from the command palette. Requires an Alveare API key.
If you already use the OpenAI API, you do not need to learn a new SDK. Alveare's inference endpoint is wire-compatible with the OpenAI chat completions API. Change two lines of code (the base URL and the API key) and your existing application works without modification.
This is not a subset. We support system messages, multi-turn conversations, streaming responses (SSE), temperature, top_p, max_tokens, stop sequences, JSON mode, and function calling. The Alveare endpoint also accepts OpenAI model names and maps them to the appropriate specialist automatically, so you do not even need to change your model parameter if you prefer.
The OpenAI compatibility layer means you can evaluate Alveare without modifying a single line of application code beyond the configuration. Run your existing test suite against Alveare, compare latency and quality, and make a decision based on production-grade evidence.
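A minimal sketch of the wire format involved: the request body is standard OpenAI chat-completions JSON, and only the base URL and key change. The base URL below is a placeholder, not a documented Alveare endpoint, and the model name is one example of an OpenAI name the mapping layer would accept.

```python
import json

# The only two values that change when switching from OpenAI:
BASE_URL = "https://api.alveare.example/v1"  # placeholder, not a documented endpoint
API_KEY = "alv-..."                          # your Alveare API key

# The request body itself is unchanged OpenAI chat-completions JSON.
payload = {
    "model": "gpt-4o-mini",  # OpenAI model names are accepted and mapped
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarise this ticket."},
    ],
    "temperature": 0.2,
    "stream": False,
}

body = json.dumps(payload)
print(f"POST {BASE_URL}/chat/completions")
print(body[:60] + "...")
```

Because the body is byte-for-byte what an OpenAI client would send, any existing OpenAI-compatible HTTP client or SDK can produce it; only the destination and credentials differ.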
Install the SDK, get an API key, make your first request. Full documentation for every tool.
Get Started Free