Security & Architecture

Your data boundary, explained.

You're right to ask: if you're calling our API, how is your data protected? Here's exactly what happens -- and what doesn't.


The question everyone asks

"If I send my data to api.alveare.ai, aren't I sending it to a third party -- just like OpenAI?"

This is a legitimate concern, and we are not going to dismiss it with marketing language. Let us explain what actually happens.

Yes, your request travels over the internet to our API endpoint. Your HTTP request containing your prompt text is sent from your infrastructure to ours. In that narrow sense, your data does leave your network and arrive at a system managed by Alveare. This is true for any hosted API service -- AWS Lambda, Google Cloud Functions, Stripe's payment API, or Twilio's messaging API.

But what happens after your request arrives is fundamentally different from what happens at OpenAI, Anthropic, or any other shared-infrastructure LLM provider. The difference is not in the network layer. It is in the compute layer, the storage layer, and the data handling policy.


OpenAI vs Alveare: what actually happens

Here is exactly what happens to your data at each step, compared side-by-side.

[Diagram: request path at OpenAI vs Alveare]

OpenAI: your app sends your data to the OpenAI API, where it runs on a shared GPU pool serving thousands of customers. Prompts are logged for 30 days, may be used to train models, and share infrastructure with other tenants. Crowded. Exposed. Uncertain.

Alveare: your app sends your data over TLS 1.3 to the Alveare API, which routes it to YOUR hive -- a dedicated, isolated GPU. Zero prompt logging, never used for training, isolated compute. Clean. Private. Controlled.

The critical architectural difference is compute isolation. At OpenAI, your prompt is processed on a GPU that is simultaneously serving hundreds or thousands of other customers. At Alveare, your prompt is processed on a GPU dedicated to your workload. No other customer's data, model weights, or inference requests exist on that machine.

What happens to your data

1. Request sent over TLS 1.3.
2. API gateway: authenticated, rate-limited, routed to your hive.
3. YOUR HIVE: processed in isolated GPU memory.
4. Response returned to you.
5. Data discarded from memory -- VRAM released. Nothing persists.
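From the client side, this flow is an ordinary HTTPS request. A minimal sketch in Python's standard library -- the endpoint path, auth scheme, and payload shape are hypothetical illustrations, not confirmed API surface:

```python
import json
import urllib.request

# Hypothetical endpoint, header, and payload shape -- illustrative only.
body = json.dumps({"prompt": "Summarize this clause.", "specialist": "legal"}).encode()
req = urllib.request.Request(
    url="https://api.alveare.ai/v1/inference",
    data=body,
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib negotiates TLS automatically for https:// URLs, so the
# request body leaves your network encrypted in transit.
print(req.full_url)      # https://api.alveare.ai/v1/inference
print(req.get_method())  # POST
```

Sending the request (`urllib.request.urlopen(req)`) would then follow steps 2-5 above on the server side.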

This is not a theoretical distinction. It has practical implications for security audits, compliance certifications, and incident response. If there is a vulnerability in the inference stack, the blast radius at OpenAI includes every customer on that GPU cluster. At Alveare, the blast radius is limited to your hive.


What is the data boundary?

The "data boundary" is a set of architectural guarantees about how your data is handled during inference. It is not a brand name or a marketing concept. It is a specific set of technical controls that you can audit, verify, and hold us accountable to.

1. Isolation

Your hive runs on a dedicated GPU instance. No other customer's code, model weights, or data is on that machine. The instance is provisioned exclusively for your account, with its own memory space, its own model checkpoint, and its own network namespace. This is equivalent to a dedicated EC2 instance -- not a shared Lambda function.

2. No Prompt Logging

We log metadata for billing and monitoring: timestamp, token count, latency, specialist used, request ID, and HTTP status code. We do not log your prompt text, the model's response, or any content from your request body. This is enforced at the infrastructure level: the logging pipeline receives only the metadata struct, never the request body, so it cannot capture prompt content even in the event of a misconfiguration.
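The separation described above can be illustrated with a metadata-only record. This is a conceptual sketch -- the field names come from the list above, but the struct and logger are hypothetical, not Alveare's actual code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RequestMetadata:
    # The only fields the logging pipeline ever receives --
    # note there is no field for prompt or response text.
    timestamp: str
    token_count: int
    latency_ms: float
    specialist: str
    request_id: str
    http_status: int

def log_request(meta: RequestMetadata) -> str:
    # The logger's signature accepts only the metadata struct,
    # so prompt content cannot reach it even by mistake.
    return (f"{meta.timestamp} id={meta.request_id} "
            f"specialist={meta.specialist} tokens={meta.token_count} "
            f"latency={meta.latency_ms}ms status={meta.http_status}")

line = log_request(RequestMetadata(
    "2025-01-01T00:00:00Z", 412, 183.0, "legal", "req_123", 200))
print(line)
```

Because the type system only admits metadata, "no prompt logging" becomes a structural property rather than a policy promise.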

3. No Training

Your data is never used to train, fine-tune, or improve any model -- ours or anyone else's. This is not an opt-out setting you need to remember to toggle. It is a structural guarantee. Our inference infrastructure has no training pipeline connected to it. There is no mechanism by which inference data could flow into a training job, because the systems are architecturally separate.

4. No Retention

After we return the response to you, the prompt and response are discarded from GPU memory. Nothing is written to disk. The VRAM used by your request is overwritten by the next inference operation. There is no buffer, no queue, no temporary file, and no database entry containing your prompt or response text. Once the HTTP response is sent, the data exists only in your systems.
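The process-and-discard lifecycle can be sketched conceptually. This analogy assumes nothing about the real GPU implementation -- it only mirrors the shape of the guarantee: the payload exists for the duration of the operation, then is overwritten:

```python
from contextlib import contextmanager

@contextmanager
def ephemeral_buffer(payload: bytes):
    # Conceptual analogue of the VRAM lifecycle described above:
    # the payload exists only while the operation runs.
    buf = bytearray(payload)
    try:
        yield buf
    finally:
        # Overwrite before release, mirroring "the VRAM used by your
        # request is overwritten by the next inference operation".
        for i in range(len(buf)):
            buf[i] = 0

with ephemeral_buffer(b"confidential prompt") as buf:
    result = bytes(buf).upper()  # stand-in for inference

print(result)      # b'CONFIDENTIAL PROMPT'
print(bytes(buf))  # the buffer has been zeroed after the block
```

Once the block exits, only the response (`result`) survives -- the input buffer holds nothing.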

5. Encryption

All data in transit is protected by TLS 1.3: every connection to api.alveare.ai uses TLS 1.3 with modern cipher suites. Any stored data (model checkpoints, usage records, account information) uses KMS-managed encryption at rest. Your prompts are never stored, so encryption at rest is not applicable to inference data -- but every other piece of stored data is encrypted with AES-256 via AWS KMS with automatic key rotation.
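Clients can enforce the same minimum from their side, so a connection either negotiates TLS 1.3 or fails outright. A sketch using Python's standard ssl module:

```python
import ssl

# Refuse to negotiate anything older than TLS 1.3; any client
# library that accepts an SSLContext can use this.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3

print(ctx.minimum_version == ssl.TLSVersion.TLSv1_3)  # True
```

Passing this context to, e.g., `urllib.request.urlopen(req, context=ctx)` pins the floor at TLS 1.3 for that connection.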

6. Region Control

You choose where your hive runs. US-East is available today. EU-West (Frankfurt) is planned and will enable full GDPR data residency compliance. When you select a region, your data stays in that region -- the API gateway, the GPU instance, and the metadata logs are all co-located. No cross-region data transfer occurs during inference.
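Region pinning can also be made explicit in client code. In this sketch the region names come from the text above, but the hostnames and mapping are hypothetical -- only api.alveare.ai is confirmed:

```python
# Hypothetical regional endpoints -- the EU hostname is illustrative;
# only the regions themselves come from the documentation above.
REGION_ENDPOINTS = {
    "us-east": "https://api.alveare.ai",     # available today
    "eu-west": "https://eu.api.alveare.ai",  # planned (Frankfurt)
}

def endpoint_for(region: str) -> str:
    # Pin every request to a single region so no cross-region
    # transfer can be introduced accidentally on the client side.
    try:
        return REGION_ENDPOINTS[region]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None

print(endpoint_for("us-east"))  # https://api.alveare.ai
```

Failing loudly on an unknown region keeps a typo from silently routing traffic somewhere unexpected.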


"But you still see my data momentarily?"

The honest answer: yes, briefly.

For the few hundred milliseconds that your request is being processed, your prompt text exists in GPU VRAM on your dedicated instance. This is unavoidable -- the model needs to read your text to process it. You cannot run inference on data you cannot see.

Here is what makes this different from OpenAI:

This is the same security model as using AWS Lambda, Google Cloud Functions, or Azure Functions. Your code runs on their hardware momentarily, but it is isolated, ephemeral, and not accessible to the provider's employees. The difference between Alveare and OpenAI is that we treat inference like a stateless compute operation -- process and discard. OpenAI treats it like a data pipeline -- process, log, retain, and optionally train.


"What about the Shared Hive on Solo?"

This is an important distinction, and we want to be transparent about it.

Solo plan customers ($49/month) share a hive with other Solo users, which means multiple Solo customers' requests may be processed on the same GPU. The guarantees around logging, retention, and training still apply in full; what is shared is the hardware, not your data, and each request is isolated from the others.

If infrastructure-level isolation is critical for your compliance requirements, the Starter plan ($499/month) and above provide a fully dedicated hive. Your inference runs on a GPU instance that serves only your account. For HIPAA, SOC 2, or any regulatory framework that requires dedicated compute, the Starter plan is the minimum we recommend.

To summarize the difference plainly: Solo gives you request-level isolation (same security guarantees around logging and retention, but shared hardware). Starter and above give you hardware-level isolation (a dedicated GPU that only processes your workload).


How do we verify this?

Trust but verify. We do not expect you to take our word for any of this. Here is how you can independently validate our data boundary claims.

SOC 2 Type II Audit

Independent third-party audit of our security controls, data handling, and operational practices. Currently in progress. Report will be available to customers and prospects under NDA upon completion.

Architecture Documentation

Detailed technical documentation of our infrastructure architecture is available for security review. We provide this to your security team during vendor evaluation. Request it at security@alveare.ai.

BAA for HIPAA

Business Associate Agreement available for HIPAA-covered entities and their business associates. The BAA covers all data processed through your dedicated hive on Starter plan and above.

DPA for GDPR

Data Processing Agreement available for organizations subject to GDPR. Covers data processing terms, subprocessor lists, data transfer mechanisms, and breach notification procedures.

Penetration Testing

We welcome responsible security research. If you want to conduct penetration testing against your own Alveare deployment, contact us to coordinate. We support responsible disclosure and have a published security policy.

Vendor Security Review

Send us your vendor security questionnaire. We complete SIG, CAIQ, and custom questionnaires regularly. Contact security@alveare.ai with your review requirements.


The bottom line

Here is the comparison that matters. Three options for running AI inference, evaluated honestly across the dimensions that your security and compliance team cares about.

OpenAI vs Self-Hosted vs Alveare

Dimension                | OpenAI                  | Self-Hosted (vLLM)                         | Alveare
Data leaves your network | Yes                     | No                                         | Yes (encrypted, ephemeral)
Shared infrastructure    | Yes (multi-tenant GPUs) | No                                         | No (Starter+)
Prompt logging           | Yes (30 days minimum)   | Your choice                                | No (metadata only)
Used for training        | Opt-out required        | No                                         | Never
Data retention           | 30 days                 | Your choice                                | Zero
Compliance ready         | Limited (SOC 2 only)    | Full control                               | HIPAA / SOC 2 / GDPR
Ops burden               | Zero                    | High (GPU mgmt, model serving, monitoring) | Zero
Monthly cost (100K req)  | $3,000-5,000            | $2,000+ plus eng time                      | $499
Time to production       | Hours                   | Weeks to months                            | Minutes
Model version control    | Vendor-controlled       | Full control                               | Customer-controlled

The trade-off is clear. Self-hosting gives you maximum control but demands significant engineering investment and ongoing operational burden. OpenAI gives you zero ops burden but sends your data to shared infrastructure with logging, retention, and potential training use. Alveare gives you the operational simplicity of a managed API with data isolation guarantees approaching those of self-hosting -- at a fraction of the cost of either alternative.

If your compliance requirements allow third-party API usage with proper controls (which most do -- your company already uses AWS, Stripe, and Twilio), Alveare provides the strongest data boundary available in a managed inference service. If your requirements prohibit any third-party data processing (rare, but some government and defense applications require this), self-hosting is your only option.

For everyone else -- which is the vast majority of companies we talk to -- the question is not whether to use a managed service. It is whether to use one that logs your data and trains on it, or one that does not.


Start your free trial

7 days, no credit card. Test the data boundary with your own workload. Review our architecture documentation. Send us your vendor security questionnaire. We will earn your trust.

Start Free Trial | Contact Security Team