As seen in our previous blog post, we take enterprise readiness very seriously at Codeium, and are constantly discussing internally on all of the ways that we could make Codeium’s enterprise experience better. Today, we are officially launching our Hybrid deployment to complement our Enterprise SaaS deployment, which is a first-of-a-kind deployment method in the AI code assistant space, ideal for companies that want to get the most value and personalization out of their tool of choice without requiring their proprietary code to be persisted on third-party servers.

SaaS Deployment Method

In our SaaS deployment, everything runs in Codeium’s hosted servers. Companies don’t incur any GPU costs or management overhead, get access to a massive pool of GPU compute, and get the earliest access to new features. However, because of data security requirements, we need to be able to guarantee zero-data retention (e.g. Codeium’s servers are simply pass-through for inference and do not serialize, store, or persist any code IP or derived data). This means we cannot, by default, enable features that would require us to persist code or derived data structures, which means we cannot offer remote indexing and multi-repo context awareness, attribution logs, audit logs or a number of other features. This inability to offer powerful features because of zero-data retention is the main reason why many enterprises we talk to see capped performance from other SaaS-only tools, such as GitHub Copilot.

Hybrid Architecture

To understand Hybrid, we need to separate Codeium’s stack into the data layer and compute layer: the data layer has any persisted code and derived information such as remote indices for context awareness, while the compute layer are the GPUs that are required for inference. Hybrid keeps the data layer in a customer’s tenant, requiring a cheaper CPU-only machine, while has the compute layer in Codeium’s hosted servers, which completely guarantees zero-data retention. This ends up negating the drawbacks of SaaS:

No GPU costs and management: The costs are a lot cheaper with CPUs, which most companies are very familiar with managing. Customers also get access to run our most powerful models on even more powerful GPUs, which most customers currently do not even have access to. Customers also get access to Codeium’s massive cluster of GPUs for spiky, high load tasks, which future capabilities may require.
Zero-data retention for security: This remains because Codeium does not actually store any of the customer’s IP, which may contain code snippets or code-derived information.
Access to all existing features: Since we can persist code IP and derived information somewhere, customers get access to all of the powerful capabilities like remote indexing, multirepo context awareness, and remote repo context pinning.
Early access to features and capabilities: As new features are developed, they will be first available to SaaS and Hybrid customers, since we can roll back changes if needed. Because the compute layer is the same as SaaS, we can release features built on top of the compute layer to Hybrid customers without needing to make compute-level changes.

For the data layer, since we do not require a GPU-containing instance, Hybrid dramatically reduces deployment complexity and hardware costs on the customer’s end. Some options on various cloud providers include:

AWS: m6a.8xlarge (32 vcpu, 128GB, approx $1.38/hr)
Azure: Standard_D32as_v5 (32 vcpu, 128GB, approx $1.38/hr)
GCP: n2d-standard-32 (32 vcpu, 128GB, approx $1.52/hr)
+ 1TB of persistent disk (AWS gp3, Azure standard ssd, GCP balanced persistent disk)

Of course, similar to SaaS, all data is encrypted in transit, and for network connectivity between the customer’s CPU instance and Codeium, we only require outbound connections from the customer’s instance to Codeium. This is intended to simplify configuration and to minimize the risk of unintentionally exposing any resources on the customer’s side.

Deployment Comparisons

Below, we show the high level architectures for Enterprise SaaS and Hybrid. As a key for the various colors:

For outlines (components), Teal refers to a Codeium component while Grey refers to a User / Enterprise component
For arrows (data transfer), Blue refers to authentication flow, Purple refers to inference execution flow, and Orange refers to precomputation flow

Who Should Use Hybrid?

Hybrid is the best solution for you if your enterprise satisfies the following:

You self-host your code but use other SaaS services that process your code (as long as there are zero-data retention guarantees!)
Have over 100k lines of code or code distributed across multiple repositories
Care about minimizing hallucinations and increasing quality of results from AI

It is really that simple. We believe the vast majority of our customers long-term will be using the Hybrid deployment. SaaS is ideal for companies who store their code in cloud source code management platforms and have an openness to storing code externally, while Hybrid is the perfect solution for everyone else.

If you believe your company fits these factors, reach out: