Migration Guide

Switch from OpenAI in 15 minutes. Keep your code.

Alveare's API is wire-compatible with OpenAI. Change the base URL and API key. Your request format, response format, SDKs, and test suite stay the same. Start saving 80-95% on inference costs today.

Three Steps. Two Lines of Code.

1. Change the base URL
Replace https://api.openai.com/v1 with https://api.alveare.ai/v1. (1 line of code)

2. Change the API key
Replace sk-... with alv_live_.... Sign up at alveare.ai/features/pricing.html and your key is emailed within 60 seconds. (1 line of code)

3. Map model names (optional)
If you use gpt-3.5-turbo as your model name, it still works. Alveare accepts OpenAI model names and maps them to the appropriate specialist. Or specify a specialist name directly for more control: classify, summarise, extract. (0 lines if you keep OpenAI model names)

Side-by-Side Code Comparison

Python

Python: OpenAI to Alveare
import openai

# OpenAI (before)
client = openai.OpenAI(
    api_key="sk-proj-abc123...",          # ← change this
    base_url="https://api.openai.com/v1"  # ← change this
)

# Alveare (after)
client = openai.OpenAI(
    api_key="alv_live_abc123...",         # ← changed
    base_url="https://api.alveare.ai/v1"  # ← changed
)

# Everything below is IDENTICAL for both
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # works as-is, or use "classify"
    messages=[
        {"role": "system", "content": "Classify support tickets into categories."},
        {"role": "user", "content": "My payment failed and I was charged twice."}
    ],
    temperature=0.1,
    max_tokens=50
)
print(response.choices[0].message.content)  # "billing"

TypeScript

TypeScript: OpenAI to Alveare
import OpenAI from 'openai';

// Change 2 lines. Everything else is identical.
const client = new OpenAI({
  apiKey: 'alv_live_abc123...',          // was: 'sk-...'
  baseURL: 'https://api.alveare.ai/v1',  // was: 'https://api.openai.com/v1'
});

const response = await client.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [
    { role: 'user', content: 'Summarize this quarterly report...' }
  ],
  max_tokens: 256,
});

console.log(response.choices[0].message.content);

curl

curl: OpenAI to Alveare
# Before: OpenAI
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-proj-abc123..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-3.5-turbo","messages":[{"role":"user","content":"Hello"}]}'

# After: Alveare (change URL and key, that's it)
curl https://api.alveare.ai/v1/chat/completions \
  -H "Authorization: Bearer alv_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-3.5-turbo","messages":[{"role":"user","content":"Hello"}]}'

What Changes vs What Stays the Same

Feature | Changes? | Details
Base URL | Yes (1 line) | api.openai.com/v1 → api.alveare.ai/v1
API key | Yes (1 line) | sk-... → alv_live_...
Request format | No change | Same JSON structure, same fields, same types
Response format | No change | Same response object with choices, usage, etc.
OpenAI SDK | No change | Works with the official openai Python and Node packages
Streaming (SSE) | No change | Same SSE protocol, same chunk format
Function calling | No change | Same tools/functions interface (7B+ models)
JSON mode | No change | Same response_format parameter
Model names | Optional | gpt-3.5-turbo works, or use specialist names
Error codes | No change | Same HTTP status codes and error object format
Rate limit headers | No change | Same X-RateLimit-* headers
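The compatibility in the table means you keep sending the same payloads. As an illustrative sketch, here is a JSON-mode request built as a plain dict; the invoice scenario and message contents are made up for this example, while response_format is the standard OpenAI parameter:

```python
# A JSON-mode request payload -- the shape is the standard OpenAI
# chat-completions format and is accepted unchanged by Alveare.
payload = {
    "model": "gpt-3.5-turbo",  # or a specialist name such as "extract"
    "messages": [
        {"role": "system",
         "content": "Extract the invoice number and amount as JSON."},
        {"role": "user",
         "content": "Invoice INV-2041 for $129.00 is overdue."},
    ],
    "response_format": {"type": "json_object"},  # same parameter on both APIs
    "max_tokens": 100,
}

def request_json(client, payload):
    """Send the payload with any OpenAI-compatible client (needs a valid key)."""
    return client.chat.completions.create(**payload)
```

Point the client at https://api.alveare.ai/v1 as in the examples above and call request_json(client, payload); the same payload works against OpenAI with no edits.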

Cost Savings Calculator

See how much you would save by moving your OpenAI workload to Alveare. These estimates assume an average OpenAI cost of roughly $0.032 per request at GPT-3.5 Turbo pricing. Actual savings depend on your prompt lengths, model selection, and request volume.

Solo Developer (10K requests/month)
OpenAI: $320/mo → Alveare: $49/mo
Save $271/mo (85%), ~$3.3K saved per year

Low Volume (100K requests/month)
OpenAI: $3,200/mo → Alveare: $499/mo
Save $2,701/mo (84%), ~$32K saved per year

Medium Volume (500K requests/month)
OpenAI: $16,000/mo → Alveare: $1,499/mo
Save $14,501/mo (91%), ~$174K saved per year

High Volume (2M requests/month)
OpenAI: $64,000/mo → Alveare: $2,999/mo
Save $61,001/mo (95%), ~$732K saved per year

OpenAI costs estimated using GPT-3.5 Turbo pricing (~$0.002/1K input + $0.002/1K output tokens), averaging roughly $0.032 per request. Alveare costs are flat monthly subscriptions with no per-token charges.
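The tier figures above reduce to simple arithmetic: monthly savings are the OpenAI bill minus Alveare's flat subscription. This snippet reproduces the numbers from the table's monthly costs:

```python
# Monthly (OpenAI cost, Alveare flat fee) per tier, taken from the table above.
TIERS = {
    "Solo Developer": (320, 49),
    "Low Volume": (3_200, 499),
    "Medium Volume": (16_000, 1_499),
    "High Volume": (64_000, 2_999),
}

def savings(openai_monthly, alveare_monthly):
    """Return (monthly savings, percent saved, yearly savings)."""
    monthly = openai_monthly - alveare_monthly
    percent = round(100 * monthly / openai_monthly)
    return monthly, percent, monthly * 12

for name, (openai_cost, alveare_cost) in TIERS.items():
    monthly, pct, yearly = savings(openai_cost, alveare_cost)
    print(f"{name}: save ${monthly:,}/mo ({pct}%), ${yearly:,}/yr")
```

Plug in your own monthly OpenAI bill to estimate savings for volumes between the listed tiers.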


Migration FAQ

Will my existing code break?
No. Alveare's API is wire-compatible with the OpenAI chat completions endpoint. If your code works with OpenAI, it works with Alveare. The request format, response format, error codes, and streaming protocol are identical. We test against the official OpenAI Python and Node SDKs to ensure compatibility.
What about streaming?
Fully supported. Alveare uses the same Server-Sent Events (SSE) protocol as OpenAI. The chunk format is identical. If you use stream=True in the OpenAI SDK, it works without modification. Time to first token is typically 50-80ms for a 7B model.
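On the client side, consuming the stream looks the same against either endpoint. As a small sketch, this helper joins streamed deltas into a full reply; it assumes only the standard OpenAI chunk shape (choices[0].delta.content, which is None on the final chunk):

```python
def accumulate(chunks):
    """Join the delta fragments of an OpenAI-style SSE stream into one string.
    Works identically for OpenAI and Alveare: the chunk format is the same."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk carries delta.content = None
            parts.append(delta)
    return "".join(parts)

# Usage against either API (requires a valid key):
#   stream = client.chat.completions.create(
#       model="gpt-3.5-turbo", messages=[...], stream=True)
#   print(accumulate(stream))
```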
What about function calling?
Supported on 7B+ models. The tools/functions parameter works the same way as OpenAI. The model generates a function call in the expected JSON format, and your code handles the response identically. For complex function calling with many tools, 13B models provide better accuracy.
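As a sketch of what that looks like in practice, the tools definition below uses the standard OpenAI format; the lookup_order function and its parameters are hypothetical stand-ins for your own application's functions:

```python
# A tools definition in the standard OpenAI format -- Alveare accepts the
# same structure on 7B+ models. "lookup_order" is a hypothetical example.
tools = [{
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Fetch an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "e.g. ORD-1234"},
            },
            "required": ["order_id"],
        },
    },
}]

def tool_request(client, user_message):
    """Send a chat completion that may return a tool call (7B+ model)."""
    return client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": user_message}],
        tools=tools,
    )
```

When the model decides to call the function, the response carries the call in the usual tool_calls field, so your existing dispatch code runs unchanged.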
What if I need GPT-4 quality?
For routine tasks (classification, extraction, summarization, template generation), a well-configured 7B model delivers equivalent quality to GPT-3.5 and handles 80% of typical SaaS workloads. For the remaining 20% that truly requires frontier-model reasoning, keep using OpenAI for those specific calls. Many customers use Alveare for the 80% and OpenAI for the 20%, saving 70-80% of their total inference costs.
Can I run Alveare and OpenAI in parallel?
Absolutely. A common migration pattern is to route a percentage of traffic to Alveare first (10%, then 50%, then 100%) while comparing quality and latency. The API compatibility means you can use the same client code with a simple URL switch, making A/B testing straightforward.
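A minimal version of that ramp-up is a probabilistic router. This sketch assumes two OpenAI-compatible clients constructed as in the examples above; only the routing fraction changes as you ramp from 10% to 50% to 100%:

```python
import random

def pick_client(alveare_fraction, alveare_client, openai_client):
    """Route a request to Alveare with the given probability (0.0-1.0),
    otherwise fall back to OpenAI. Both clients expose the same interface,
    so the calling code is identical either way."""
    return alveare_client if random.random() < alveare_fraction else openai_client

# Ramp plan from the FAQ: 10% -> 50% -> 100%
#   client = pick_client(0.10, alveare, openai)
#   response = client.chat.completions.create(...)
```

For reproducible A/B comparisons you may prefer routing on a stable hash of a user or request ID rather than random sampling, so the same user always hits the same backend.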
What if I use the OpenAI Assistants API or GPTs?
Alveare is compatible with the chat completions API, not the Assistants API or GPTs. If you use those features, you would need to refactor those specific calls to use the chat completions format. For most production applications that use the chat completions endpoint directly, migration is a 2-line change.
How long does migration actually take?
For a typical integration that uses the OpenAI chat completions API: 15 minutes. That includes signing up (2 min), changing the URL and key (2 min), and running your test suite (10 min). If you use the Assistants API, embeddings, or fine-tuned models, allow additional time for those specific integrations.
What happens to my OpenAI fine-tuned models?
OpenAI fine-tuned models cannot be exported. However, if you have the training data, Alveare can fine-tune an open-weight model (Mistral, Llama) on the same data. The result is a model you own that runs on your dedicated infrastructure. Fine-tuning is available on Scale and Enterprise plans.

Ready to switch?

Sign up, get your API key, change two lines, and start saving. 7-day free trial, no credit card required.

Start Free Trial