You're right to ask: if you're calling our API, how is your data protected? Here's exactly what happens -- and what doesn't.
The question everyone asks
"If I send my data to api.alveare.ai, aren't I sending it to a third party -- just like OpenAI?"
This is a legitimate concern, and we're not going to dismiss it with marketing language. So let's walk through what actually happens.
Yes, your request travels over the internet to our API endpoint. Your HTTP request containing your prompt text is sent from your infrastructure to ours. In that narrow sense, your data does leave your network and arrive at a system managed by Alveare. This is true for any hosted API service -- AWS Lambda, Google Cloud Functions, Stripe's payment API, or Twilio's messaging API.
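To make that narrow sense concrete, here is roughly what such a request looks like from the client side. This is an illustrative sketch: the endpoint path and JSON field names are assumptions for the example, not documented Alveare API details.

```python
import json

# Hypothetical request shape -- the "/v1/inference" path and the field
# names are illustrative assumptions, not Alveare's documented API.
API_URL = "https://api.alveare.ai/v1/inference"

def build_request(prompt: str, api_key: str) -> dict:
    """Assemble the HTTPS request that travels to the API endpoint."""
    return {
        "url": API_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        # The prompt rides in the request body: this is the data that
        # briefly leaves your network, encrypted in transit by TLS.
        "body": json.dumps({"prompt": prompt}),
    }
```

Everything interesting in this post is about what happens to that body after it arrives.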
But what happens after your request arrives is fundamentally different from what happens at OpenAI, Anthropic, or any other shared-infrastructure LLM provider. The difference is not in the network layer. It is in the compute layer, the storage layer, and the data handling policy.
OpenAI vs Alveare: what actually happens
Here is exactly what happens to your data at each step, compared side-by-side.
The critical architectural difference is compute isolation. At OpenAI, your prompt is processed on a GPU that is simultaneously serving hundreds or thousands of other customers. At Alveare, your prompt is processed on a GPU dedicated to your workload. No other customer's data, model weights, or inference requests exist on that machine.
This is not a theoretical distinction. It has practical implications for security audits, compliance certifications, and incident response. If there is a vulnerability in the inference stack, the blast radius at OpenAI includes every customer on that GPU cluster. At Alveare, the blast radius is limited to your hive.
What is the data boundary?
The "data boundary" is a set of architectural guarantees about how your data is handled during inference. It is not a brand name or a marketing concept. It is a specific set of technical controls that you can audit, verify, and hold us accountable to.
1. Isolation
Your hive runs on a dedicated GPU instance. No other customer's code, model weights, or data is on that machine. The instance is provisioned exclusively for your account, with its own memory space, its own model checkpoint, and its own network namespace. This is equivalent to a dedicated EC2 instance -- not a shared Lambda function.
2. No Prompt Logging
We log metadata for billing and monitoring: timestamp, token count, latency, specialist used, request ID, and HTTP status code. We do not log your prompt text, the model's response, or any content from your request body. This is enforced at the infrastructure level -- the logging pipeline has no access to the inference payload. Our logging system physically cannot capture prompt content because it receives only the metadata struct, not the request body.
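A minimal sketch of what "receives only the metadata struct" means in practice. All names here (RequestMetadata, emit_log, run_model) are hypothetical illustrations of the pattern, not our actual internals.

```python
import time
import uuid
from dataclasses import dataclass, asdict

LOG: list[dict] = []  # stand-in for the logging pipeline

def emit_log(record: dict) -> None:
    LOG.append(record)

def run_model(prompt: str) -> str:
    return "generated response"  # stand-in for real inference

# The record type has no field that could hold prompt or response text,
# so the pipeline structurally cannot capture content.
@dataclass(frozen=True)
class RequestMetadata:
    request_id: str
    timestamp: float
    token_count: int
    latency_ms: float
    specialist: str
    status_code: int

def handle_inference(prompt: str) -> str:
    start = time.monotonic()
    response = run_model(prompt)
    meta = RequestMetadata(
        request_id=str(uuid.uuid4()),
        timestamp=time.time(),
        token_count=len(prompt.split()),  # stand-in for real tokenisation
        latency_ms=(time.monotonic() - start) * 1000,
        specialist="contracts-v1",
        status_code=200,
    )
    emit_log(asdict(meta))  # only the metadata dict reaches the log
    return response
```

The point of the pattern is that the separation is a type-level boundary, not a filtering rule someone could misconfigure.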
3. No Training
Your data is never used to train, fine-tune, or improve any model -- ours or anyone else's. This is not an opt-out setting you need to remember to toggle. It is a structural guarantee. Our inference infrastructure has no training pipeline connected to it. There is no mechanism by which inference data could flow into a training job, because the systems are architecturally separate.
4. No Retention
After we return the response to you, the prompt and response are discarded from GPU memory. Nothing is written to disk. The VRAM used by your request is overwritten by the next inference operation. There is no buffer, no queue, no temporary file, and no database entry containing your prompt or response text. Once the HTTP response is sent, the data exists only in your systems.
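The CPU-side analogue of that VRAM behaviour can be sketched in a few lines. This is conceptual only: on the GPU, the real mechanism is simply the next inference operation overwriting the same memory.

```python
def process_ephemeral(prompt: str) -> str:
    """Conceptual 'process and discard': the prompt lives only in a
    working buffer that is overwritten before the function returns."""
    buf = bytearray(prompt, "utf-8")        # mutable working copy
    result = f"processed {len(buf)} bytes"  # stand-in for inference
    for i in range(len(buf)):               # overwrite before release
        buf[i] = 0
    return result
```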
5. Encryption
TLS 1.3 for all data in transit. Every connection to api.alveare.ai uses TLS 1.3 with modern cipher suites. KMS-managed encryption at rest for any stored data (model checkpoints, usage records, account information). Your prompts are never stored, so encryption at rest is not applicable to inference data -- but every other piece of stored data is encrypted with AES-256 via AWS KMS with automatic key rotation.
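You can check the transit-encryption claim yourself from any machine with network egress. The sketch below uses Python's standard `ssl` module and refuses to negotiate anything older than TLS 1.3; it opens a real connection, so treat it as a manual check rather than a unit test.

```python
import socket
import ssl

def negotiated_tls_version(host: str, port: int = 443) -> str:
    """Connect to host and return the TLS version actually negotiated."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # refuse anything older
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()

# Manual check (requires network access):
# print(negotiated_tls_version("api.alveare.ai"))
```

If the server only offered an older protocol, the handshake would fail instead of silently downgrading.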
6. Region Control
You choose where your hive runs. US-East is available today. EU-West (Frankfurt) is planned and will enable full GDPR data residency compliance. When you select a region, your data stays in that region -- the API gateway, the GPU instance, and the metadata logs are all co-located. No cross-region data transfer occurs during inference.
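One plausible client-side shape for region pinning is shown below. The hostnames are illustrative assumptions, not published Alveare URLs; the point is that the client targets exactly one region and all processing stays there.

```python
# Hypothetical region-pinned endpoints -- hostnames are illustrative
# assumptions, not published Alveare URLs.
REGION_ENDPOINTS = {
    "us-east": "https://api.us-east.alveare.ai",
    # "eu-west" (Frankfurt) is planned but not yet available
}

def endpoint_for(region: str) -> str:
    try:
        return REGION_ENDPOINTS[region]
    except KeyError as exc:
        raise ValueError(f"unknown or unavailable region: {region}") from exc
```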
"But you still see my data momentarily?"
The honest answer: yes, briefly.
For the few hundred milliseconds that your request is being processed, your prompt text exists in GPU VRAM on your dedicated instance. This is unavoidable -- the model needs to read your text to process it. You cannot run inference on data you cannot see.
Here is what makes this different from OpenAI:
It is your dedicated instance. No other customer's data is on that machine. The GPU, the CPU, the RAM, and the network interface are allocated to your account. This is single-tenant hardware, not a multi-tenant cluster.
It is in GPU VRAM only. Your prompt is loaded into GPU memory for inference. It is not written to disk. It is not stored in a database. It is not buffered in a message queue. It exists in volatile memory only, and only during the inference operation.
It is discarded after inference completes. When the model finishes generating the response, the VRAM used by your prompt is released and overwritten by the next operation. There is no retention period -- the data is gone as soon as the response is sent.
It is never logged, stored, or transmitted elsewhere. No copy of your prompt is sent to any logging system, analytics pipeline, or monitoring tool. The inference process is a dead end for your data -- it goes in, the result comes out, and nothing persists.
No human at Alveare ever sees it. Our operations team has access to infrastructure metrics (CPU usage, GPU utilisation, memory pressure, latency percentiles) but no access to inference payloads. Even if an engineer SSH'd into your instance (which would trigger an alert and require documented justification), the prompt data no longer exists in memory after inference completes.
This is the same security model as using AWS Lambda, Google Cloud Functions, or Azure Functions. Your code runs on their hardware momentarily, but it is isolated, ephemeral, and not accessible to the provider's employees. The difference between Alveare and OpenAI is that we treat inference like a stateless compute operation -- process and discard. OpenAI treats it like a data pipeline -- process, log, retain, and optionally train.
"What about the Shared Hive on Solo?"
This is an important distinction, and we want to be transparent about it.
Solo plan customers ($49/month) share a hive with other Solo users. This means multiple Solo customers' requests may be processed on the same GPU. Here is exactly what that means and what it does not mean:
Requests are processed sequentially, not concurrently. Your prompt is loaded into GPU memory, processed, and the result is returned before the next customer's request is loaded. At no point are two customers' prompts in GPU memory simultaneously. The inference engine processes one request at a time per GPU.
No prompt logging applies equally. Whether you are on the Solo plan or the Scale plan, we do not log prompt content. The no-logging guarantee is the same across all plans. It is an infrastructure-level control, not a plan-level feature.
No customer can see another customer's data. There is no shared state between requests. Each inference call starts with a clean context. The model has no memory of previous requests -- every call is independent. The specialist system prompt is injected fresh for each request.
The security model is equivalent to a shared web server. When you use any SaaS product, your HTTP requests are processed on servers that also process other customers' requests. The isolation is at the request level, not the hardware level. This is standard for shared infrastructure.
If infrastructure-level isolation is critical for your compliance requirements, the Starter plan ($499/month) and above provide a fully dedicated hive. Your inference runs on a GPU instance that serves only your account. For HIPAA, SOC 2, or any regulatory framework that requires dedicated compute, the Starter plan is the minimum we recommend.
To summarise the difference plainly: Solo gives you request-level isolation (same security guarantees around logging and retention, but shared hardware). Starter and above give you hardware-level isolation (a dedicated GPU that only processes your workload).
How do we verify this?
Trust but verify. We do not expect you to take our word for any of this. Here is how you can independently validate our data boundary claims.
SOC 2 Type II Audit
Independent third-party audit of our security controls, data handling, and operational practices. The audit is currently in progress, and the report will be available to customers and prospects under NDA upon completion.
Architecture Documentation
Detailed technical documentation of our infrastructure architecture is available for security review. We provide this to your security team during vendor evaluation. Request it at security@alveare.ai.
BAA for HIPAA
Business Associate Agreement available for HIPAA-covered entities and their business associates. The BAA covers all data processed through your dedicated hive on Starter plan and above.
DPA for GDPR
Data Processing Agreement available for organizations subject to GDPR. Covers data processing terms, subprocessor lists, data transfer mechanisms, and breach notification procedures.
Penetration Testing
We welcome responsible security research. If you want to conduct penetration testing against your own Alveare deployment, contact us to coordinate. We support responsible disclosure and have a published security policy.
Vendor Security Review
Send us your vendor security questionnaire. We complete SIG, CAIQ, and custom questionnaires regularly. Contact security@alveare.ai with your review requirements.
The bottom line
Here is the comparison that matters. Three options for running AI inference, evaluated honestly across the dimensions that your security and compliance team cares about.
OpenAI vs Self-Hosted vs Alveare
| Dimension | OpenAI | Self-Hosted (vLLM) | Alveare |
| --- | --- | --- | --- |
| Data leaves your network | Yes | No | Yes (encrypted, ephemeral) |
| Shared infrastructure | Yes (multi-tenant GPUs) | No | No (Starter+) |
| Prompt logging | Yes (30 days minimum) | Your choice | No (metadata only) |
| Used for training | Opt-out required | No | Never |
| Data retention | 30 days | Your choice | Zero |
| Compliance ready | Limited (SOC 2 only) | Full control | HIPAA / SOC 2 / GDPR |
| Ops burden | Zero | High (GPU mgmt, model serving, monitoring) | Zero |
| Monthly cost (100K req) | $3,000-5,000 | $2,000+ plus eng time | $499 |
| Time to production | Hours | Weeks to months | Minutes |
| Model version control | Vendor-controlled | Full control | Customer-controlled |
The trade-off is clear. Self-hosting gives you maximum control but demands significant engineering investment and ongoing operational burden. OpenAI gives you zero ops burden but sends your data to shared infrastructure with logging, retention, and potential training use. Alveare gives you the operational simplicity of a managed API with data isolation guarantees that approach those of self-hosting -- at a fraction of the cost of either alternative.
If your compliance requirements allow third-party API usage with proper controls (which most do -- your company already uses AWS, Stripe, and Twilio), Alveare provides the strongest data boundary available in a managed inference service. If your requirements prohibit any third-party data processing (rare, but some government and defense applications require this), self-hosting is your only option.
For everyone else -- which is the vast majority of companies we talk to -- the question is not whether to use a managed service. It is whether to use one that logs your data and trains on it, or one that does not.
Start your free trial
7 days, no credit card. Test the data boundary with your own workload. Review our architecture documentation.
Send us your vendor security questionnaire. We will earn your trust.