Side-by-Side Code Comparison
Python
import openai

# Before: OpenAI
client = openai.OpenAI(
    api_key="sk-proj-abc123...",
    base_url="https://api.openai.com/v1"
)

# After: Alveare (only the key and base URL change)
client = openai.OpenAI(
    api_key="alv_live_abc123...",
    base_url="https://api.alveare.ai/v1"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Classify support tickets into categories."},
        {"role": "user", "content": "My payment failed and I was charged twice."}
    ],
    temperature=0.1,
    max_tokens=50
)
print(response.choices[0].message.content)
TypeScript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'alv_live_abc123...',
  baseURL: 'https://api.alveare.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'gpt-3.5-turbo',
  messages: [
    { role: 'user', content: 'Summarize this quarterly report...' }
  ],
  max_tokens: 256,
});
console.log(response.choices[0].message.content);
curl
# Before: OpenAI
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-proj-abc123..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-3.5-turbo","messages":[{"role":"user","content":"Hello"}]}'

# After: Alveare
curl https://api.alveare.ai/v1/chat/completions \
  -H "Authorization: Bearer alv_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-3.5-turbo","messages":[{"role":"user","content":"Hello"}]}'
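All three snippets hardcode the key and base URL; in practice you would likely resolve both from the environment so the same code can point at either provider. A minimal sketch, assuming hypothetical `LLM_BASE_URL` / `LLM_API_KEY` variable names:

```python
import os

def client_config() -> dict:
    """Resolve provider settings from the environment.

    LLM_BASE_URL / LLM_API_KEY are hypothetical names; use whatever
    your deployment already defines.
    """
    return {
        "base_url": os.environ.get("LLM_BASE_URL", "https://api.alveare.ai/v1"),
        "api_key": os.environ.get("LLM_API_KEY", ""),
    }

# The resolved values pass straight to the SDK constructor:
# client = openai.OpenAI(**client_config())
```

Switching providers then becomes a deployment-config change with no code deploy at all.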
Cost Savings Calculator
See how much you would save by moving your OpenAI workload to Alveare. These estimates use
GPT-3.5 Turbo pricing at an average of 500 tokens per request. Actual savings depend on your
prompt lengths, model selection, and request volume.
Solo Developer (10K requests/month)
OpenAI: $320/mo vs. Alveare: $49/mo
Save $271/mo (85%); ~$3.3K saved per year

Low Volume (100K requests/month)
OpenAI: $3,200/mo vs. Alveare: $499/mo
Save $2,701/mo (84%); ~$32K saved per year

Medium Volume (500K requests/month)
OpenAI: $16,000/mo vs. Alveare: $1,499/mo
Save $14,501/mo (91%); ~$174K saved per year

High Volume (2M requests/month)
OpenAI: $64,000/mo vs. Alveare: $2,999/mo
Save $61,001/mo (95%); ~$732K saved per year
OpenAI costs estimated using GPT-3.5 Turbo at ~$0.002/1K input + $0.002/1K output tokens,
averaging 500 tokens per request. Alveare costs are flat monthly subscriptions with no per-token charges.
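The per-tier savings figures follow from simple arithmetic on the two monthly totals; a sketch of the calculation behind the table:

```python
def savings(openai_monthly: float, alveare_monthly: float) -> tuple:
    """Return (monthly savings, percent saved, yearly savings)."""
    monthly = openai_monthly - alveare_monthly
    pct = round(100 * monthly / openai_monthly)
    return monthly, pct, monthly * 12

# e.g. the Solo Developer tier:
# savings(320, 49) -> (271, 85, 3252), i.e. save $271/mo (85%), ~$3.3K/year
```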
Migration FAQ
Will my existing code break?
No. Alveare's API is wire-compatible with the OpenAI chat completions endpoint. If your code works with OpenAI, it works with Alveare. The request format, response format, error codes, and streaming protocol are identical. We test against the official OpenAI Python and Node SDKs to ensure compatibility.
What about streaming?
Fully supported. Alveare uses the same Server-Sent Events (SSE) protocol as OpenAI. The chunk format is identical. If you use stream=True in the OpenAI SDK, it works without modification. Time to first token is typically 50-80ms for a 7B model.
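The chunk-handling loop is the only code streaming adds on the client side. A sketch of the delta-assembly step, with chunks written as plain dicts in the shape the endpoint emits (the SDK exposes the same fields as attributes):

```python
def assemble(chunks) -> str:
    """Concatenate the content deltas from a chat-completion stream."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:  # the role-only first chunk and the finish chunk carry no content
            parts.append(delta)
    return "".join(parts)

# Simulated stream in the standard chunk shape:
stream = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": " world"}}]},
    {"choices": [{"delta": {}}]},  # finish chunk
]
print(assemble(stream))  # prints "Hello world"
```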
What about function calling?
Supported on 7B+ models. The tools/functions parameter works the same way as OpenAI. The model generates a function call in the expected JSON format, and your code handles the response identically. For complex function calling with many tools, 13B models provide better accuracy.
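As a concrete sketch, here is a tools definition in the OpenAI format plus the dispatch step your code runs on the model's reply. The `refund_payment` tool and its schema are hypothetical examples, not part of either API:

```python
import json

tools = [{
    "type": "function",
    "function": {
        "name": "refund_payment",  # hypothetical example tool
        "description": "Issue a refund for a duplicate charge.",
        "parameters": {
            "type": "object",
            "properties": {
                "payment_id": {"type": "string"},
                "amount_cents": {"type": "integer"},
            },
            "required": ["payment_id"],
        },
    },
}]

def dispatch(tool_call: dict) -> dict:
    """Decode the JSON arguments the model produced for a tool call."""
    assert tool_call["function"]["name"] == "refund_payment"
    return json.loads(tool_call["function"]["arguments"])

# The model returns the call with JSON-encoded arguments:
call = {"function": {"name": "refund_payment",
                     "arguments": '{"payment_id": "pay_123", "amount_cents": 500}'}}
print(dispatch(call))  # {'payment_id': 'pay_123', 'amount_cents': 500}
```

Because the request and response shapes match, this handling code is identical whichever provider served the call.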
What if I need GPT-4 quality?
For routine tasks (classification, extraction, summarization, template generation), a well-configured 7B model delivers equivalent quality to GPT-3.5 and handles 80% of typical SaaS workloads. For the remaining 20% that truly requires frontier-model reasoning, keep using OpenAI for those specific calls. Many customers use Alveare for the 80% and OpenAI for the 20%, saving 70-80% of their total inference costs.
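One way to implement that split is to route by task type; a minimal sketch, where the task names and the choice of URL per task are assumptions to adapt to your workload:

```python
# Hypothetical task taxonomy: routine work goes to the flat-rate provider.
ROUTINE_TASKS = {"classification", "extraction", "summarization", "template"}

def base_url_for(task: str) -> str:
    """Route routine tasks to Alveare, frontier-model work to OpenAI."""
    if task in ROUTINE_TASKS:
        return "https://api.alveare.ai/v1"
    return "https://api.openai.com/v1"

print(base_url_for("classification"))        # https://api.alveare.ai/v1
print(base_url_for("multi-step-reasoning"))  # https://api.openai.com/v1
```

Since both endpoints speak the same protocol, the router only has to pick a base URL and key; everything downstream is shared.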
Can I run Alveare and OpenAI in parallel?
Absolutely. A common migration pattern is to route a percentage of traffic to Alveare first (10%, then 50%, then 100%) while comparing quality and latency. The API compatibility means you can use the same client code with a simple URL switch, making A/B testing straightforward.
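The staged rollout can be a deterministic hash bucket, so each user consistently hits the same provider while you compare quality; a sketch under that assumption:

```python
import hashlib

def provider_for(user_id: str, rollout_pct: int) -> str:
    """Bucket users 0-99 by hash; the first rollout_pct buckets go to Alveare."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "alveare" if bucket < rollout_pct else "openai"

# Raising rollout_pct from 10 to 50 to 100 moves traffic over gradually,
# and a given user never flip-flops between providers mid-experiment.
```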
What if I use the OpenAI Assistants API or GPTs?
Alveare is compatible with the chat completions API, not the Assistants API or GPTs. If you use those features, you would need to refactor those specific calls to use the chat completions format. For most production applications that use the chat completions endpoint directly, migration is a 2-line change.
How long does migration actually take?
For a typical integration that uses the OpenAI chat completions API: 15 minutes. That includes signing up (2 min), changing the URL and key (2 min), and running your test suite (10 min). If you use the Assistants API, embeddings, or fine-tuned models, allow additional time for those specific integrations.
What happens to my OpenAI fine-tuned models?
OpenAI fine-tuned models cannot be exported. However, if you have the training data, Alveare can fine-tune an open-weight model (Mistral, Llama) on the same data. The result is a model you own that runs on your dedicated infrastructure. Fine-tuning is available on Scale and Enterprise plans.