Why the Compliance Graph Lives Inside Your Tenant

In early 2024, we had a working prototype. Multi-tenant SaaS. Standard architecture. Customer data in shared Cosmos DB collections with tenant isolation at the application layer. It worked fine. It was cheaper to operate. It was easier to deploy updates.

We scrapped it.

The reason was a single slide in a customer presentation. We were showing the compliance graph, the connected data model that links controls to evidence to risks to policies to frameworks to vendors. A CISO in the room said: "So this is the most complete map of our security posture that exists anywhere. And you want to store it on your infrastructure."

He wasn't wrong. And once you see the problem clearly, you can't unsee it.

What the compliance graph actually contains

The compliance graph isn't just a database. It's a connected model of your entire governance posture. The nodes and edges represent:

Controls. Every security and compliance control your organization operates. Their implementation status. Their maturity level. Their mapping to frameworks. Their evidence state.

Evidence. Every artifact that proves a control is operating. Configuration exports, access review logs, policy documents, training records, vulnerability scan results. Each linked to the control it supports, timestamped, hashed.

Risks. Your risk register entries. Likelihood, impact, treatment plans, residual risk calculations. Linked to the controls that mitigate them and the evidence that proves those controls are working.

Policies. Your policy documents, version-controlled. Linked to the controls they mandate and the frameworks they satisfy.

Frameworks. ISO 27001, SOC 2, NIST CSF, EU AI Act, DORA, CMMC, PCI DSS, and the 70+ others you might be mapped to. Each framework's requirements linked to the controls that satisfy them via the STRM Engine.

Vendors. Third-party risk assessments, questionnaire responses, SLA compliance, incident history. Linked to the controls affected by vendor dependencies.

Taken together, this graph is a precise, queryable, relationship-rich model of where your organization is strong, where it's weak, what's proven, and what's assumed. It's more detailed than your board reporting. More granular than what you share with auditors. More sensitive than most of the data your controls protect.
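To make "nodes and edges" concrete, here is a minimal sketch of how such a graph could be modeled and queried. It is an illustration only, not Kyudo's actual schema: the class names, the "supports" relation, and the sample IDs are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    CONTROL = "control"
    EVIDENCE = "evidence"
    RISK = "risk"
    POLICY = "policy"
    FRAMEWORK = "framework"
    VENDOR = "vendor"


@dataclass(frozen=True)
class Node:
    id: str
    kind: NodeKind


@dataclass
class ComplianceGraph:
    """Hypothetical in-memory graph; a real deployment would use a graph database."""
    nodes: dict[str, Node] = field(default_factory=dict)
    # edges[source_id] holds (relation, target_id) pairs
    edges: dict[str, set[tuple[str, str]]] = field(default_factory=dict)

    def add_node(self, node_id: str, kind: NodeKind) -> None:
        self.nodes[node_id] = Node(node_id, kind)
        self.edges.setdefault(node_id, set())

    def add_edge(self, source: str, relation: str, target: str) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking")
        self.edges[source].add((relation, target))

    def controls_without_evidence(self) -> list[str]:
        """Controls that no evidence node points to via a 'supports' edge."""
        supported = {
            target
            for source, links in self.edges.items()
            if self.nodes[source].kind is NodeKind.EVIDENCE
            for relation, target in links
            if relation == "supports"
        }
        return sorted(
            node.id
            for node in self.nodes.values()
            if node.kind is NodeKind.CONTROL and node.id not in supported
        )


graph = ComplianceGraph()
graph.add_node("ctl-firewall", NodeKind.CONTROL)
graph.add_node("ctl-access-review", NodeKind.CONTROL)
graph.add_node("ev-review-log", NodeKind.EVIDENCE)
graph.add_edge("ev-review-log", "supports", "ctl-access-review")

unproven = graph.controls_without_evidence()
```

Even in this toy version, a single query separates what is proven from what is merely assumed. That is exactly the property that makes the real graph both useful and sensitive.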

The question of where this data lives isn't a preference. It's a security decision.

The trust gap in vendor-hosted GRC

Every vendor-hosted GRC platform creates the same structural contradiction. As we explored in "The Sovereignty Question No GRC Vendor Wants You to Ask," the system designed to prove you're managing risk becomes a risk you haven't fully assessed.

But the compliance graph makes this worse. A traditional GRC platform stores controls, evidence, and risk registers as flat records. The compliance graph stores the relationships between them. That means:

Attack surface intelligence. An attacker who accesses your compliance graph doesn't just know you have a firewall. They know which firewall rules lack recent evidence, which controls are at Level 1 maturity (documented but not operating), which risk treatments are overdue. The graph reveals not just what you have, but where the gaps are.

Regulatory posture. The graph shows exactly which framework obligations you haven't satisfied yet, which ones you're partially compliant with, and which controls are your weakest links. For a sophisticated threat actor or competitor, this is a complete map of where pressure will produce results.

Operational relationships. The graph connects vendors to the controls they affect. An attacker who reads this knows which vendor compromise would cascade into the most control failures in your environment.

This isn't paranoia. It's threat modeling applied to the data model itself. If you wouldn't host your penetration test results on a third party's infrastructure, you shouldn't host your compliance graph there either.
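The vendor cascade point is easy to demonstrate. The sketch below assumes the graph's vendor-to-control edges have been flattened into simple pairs. The vendor and control names are hypothetical, and "blast radius" here just means the number of distinct controls a vendor touches.

```python
from collections import defaultdict

# Hypothetical edges: (vendor, control that depends on that vendor)
vendor_affects_control = [
    ("vendor-idp", "ctl-sso"),
    ("vendor-idp", "ctl-mfa"),
    ("vendor-idp", "ctl-access-review"),
    ("vendor-backup", "ctl-restore-test"),
]


def rank_vendors_by_blast_radius(edges):
    """Return (vendor, affected control count) pairs, largest count first."""
    affected = defaultdict(set)
    for vendor, control in edges:
        affected[vendor].add(control)
    return sorted(
        ((vendor, len(controls)) for vendor, controls in affected.items()),
        key=lambda pair: (-pair[1], pair[0]),  # ties broken alphabetically
    )


ranking = rank_vendors_by_blast_radius(vendor_affects_control)
```

A defender runs this kind of query to prioritize vendor reviews. An attacker with read access to the graph can run the same query to pick a target.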

The architectural decision

In Q2 2024, we committed to a constraint: every customer's compliance graph deploys inside their own Azure tenant. Not in a shared environment with logical separation. Inside their subscription, their resource group, their network boundary.

This shaped everything else.

Azure Kubernetes Service (AKS) runs the Kyudo application plane. Customer-managed cluster in their subscription. Their node pools, their networking, their resource policies. We provide Helm charts. They control the infrastructure.

Entra ID handles all authentication and authorization. No Kyudo-managed identity store. Users authenticate against the customer's directory. RBAC maps to Entra groups. Conditional Access policies apply natively. If the customer requires phishing-resistant MFA for all access, they enforce it through their existing Entra ID policies without asking us to implement it.

Azure Private Link ensures the compliance graph is never exposed to the public internet. Data flows between Azure services within the customer's virtual network. No public endpoints for the graph database, the evidence store, or the application APIs. The customer's existing NSG rules and Azure Firewall policies apply automatically.

Azure Policy enforces deployment compliance. Customers can apply their own Azure Policy definitions to the Kyudo resource group. Require encryption standards, enforce tagging, restrict regions, mandate diagnostic settings. The same governance they apply to their own workloads applies to ours.

Customer-managed keys. Encryption at rest uses keys from the customer's Key Vault. We never hold, manage, or have access to encryption keys. Revoking the key renders the compliance graph unreadable to anyone, including us.
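The constraints above can be summarized as a small set of checks. The sketch below is only an illustration of that checklist: the dictionary keys and values are invented for this example and are not Helm values, Azure Policy definitions, or any real Kyudo configuration. In practice, this kind of enforcement belongs in the customer's own Azure Policy assignments.

```python
def find_posture_violations(deployment: dict) -> list[str]:
    """Return human-readable violations; an empty list means every check passed.

    Missing keys count as violations, so the check fails closed.
    """
    violations = []
    if deployment.get("public_network_access", True):
        violations.append("public network access must be disabled")
    if not deployment.get("private_link_enabled", False):
        violations.append("Private Link must be enabled")
    if deployment.get("key_source") != "customer_key_vault":
        violations.append("encryption keys must come from the customer's Key Vault")
    if deployment.get("identity_provider") != "entra_id":
        violations.append("authentication must use the customer's Entra ID")
    return violations


# Hypothetical descriptions of two deployments
tenant_native = {
    "public_network_access": False,
    "private_link_enabled": True,
    "key_source": "customer_key_vault",
    "identity_provider": "entra_id",
}
shared_saas = {"public_network_access": True, "key_source": "vendor_managed"}

tenant_native_violations = find_posture_violations(tenant_native)
shared_saas_violations = find_posture_violations(shared_saas)
```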

What "zero vendor access" means technically

This phrase gets thrown around loosely. Here's what it means in our architecture:

We have no standing access to customer environments. No service accounts. No shared credentials. No SSH keys to customer clusters. No read access to the compliance graph database.

For support scenarios that require access (diagnosing a deployment issue, for example), access is provisioned through the customer's Privileged Identity Management (PIM). Time-bounded. Approval-required. Fully logged in the customer's audit trail. When the window expires, access reverts automatically.
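The key property of this model is that access is a window, not a state someone has to remember to revoke. The sketch below models that idea in plain Python. It does not call Privileged Identity Management or any Azure API, and the `AccessGrant` type and its fields are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical record of one approved, time-bounded support session."""
    engineer: str
    approved_by: str  # a customer-side approver, never the vendor
    starts_at: datetime
    duration: timedelta

    @property
    def expires_at(self) -> datetime:
        return self.starts_at + self.duration

    def is_active(self, now: datetime) -> bool:
        # Access exists only inside the approved window.
        return self.starts_at <= now < self.expires_at


start = datetime(2024, 6, 3, 9, 0, tzinfo=timezone.utc)
grant = AccessGrant("support-engineer", "customer-admin", start, timedelta(hours=4))

during_window = grant.is_active(start + timedelta(hours=1))
after_window = grant.is_active(start + timedelta(hours=5))
```

Because expiry is derived from the grant itself, nobody has to take an action to end the session. The default outcome is no access.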

For updates, we publish Helm chart versions. Customers pull and deploy on their schedule. We don't push changes into their environments. We don't have the access path to do so.

For telemetry, we receive aggregated operational metrics (pod health, error rates, performance data) only if the customer explicitly configures diagnostic export. No compliance data, no evidence content, no graph queries leave the tenant unless the customer decides otherwise.
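One way to picture the telemetry boundary is as an allowlist with an off switch. The following is a sketch under assumptions: the metric names and the `filter_telemetry` function are hypothetical, chosen to mirror the categories named above (pod health, error rates, performance data).

```python
# Hypothetical allowlist of operational metric names
ALLOWED_METRICS = frozenset({"pod_health", "error_rate", "latency_ms"})


def filter_telemetry(record: dict, export_enabled: bool) -> dict:
    """Return only allowlisted operational metrics, or nothing if export is off."""
    if not export_enabled:
        return {}  # default: nothing leaves the tenant
    return {name: value for name, value in record.items() if name in ALLOWED_METRICS}


sample = {
    "pod_health": "ok",
    "error_rate": 0.01,
    "graph_query": "controls lacking evidence",  # compliance content, never exported
}

exported = filter_telemetry(sample, export_enabled=True)
not_exported = filter_telemetry(sample, export_enabled=False)
```

An allowlist fails safe: a field added later stays inside the tenant until someone deliberately adds it to the list.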

This is different from "we promise not to look at your data." It's "we architected the system so that looking at your data requires your active cooperation."

Access Scenario	Traditional SaaS GRC	Kyudo (Customer-Hosted)
Day-to-day operations	Vendor has standing access	Zero vendor access; customer-operated
Support escalation	Vendor support team accesses data	Customer grants time-bounded PIM access with approval
Software updates	Vendor pushes updates to shared infra	Customer pulls Helm chart updates on their schedule
Encryption keys	Vendor manages keys	Customer Key Vault, customer-managed keys
Network exposure	Public endpoints with IP restrictions	Private Link, no public endpoints
Audit trail	Vendor's logs, shared on request via SOC 2	Customer's Azure Monitor, full visibility
Data at contract end	Export process, schema lock-in risk	Data stays in customer's tenant, it never left

The tradeoffs we accepted

This architecture is harder to operate. We won't pretend otherwise.

Helm complexity. Every customer deployment is a separate Helm release with environment-specific values. We can't run one control plane that manages all customer instances. Each is independent. Our deployment documentation is extensive because it has to be.

Per-tenant operations. When we find a bug, we publish a fix. But we can't deploy it across all customers simultaneously. Each customer pulls the update, tests in their staging environment, and promotes to production. This means some customers run newer versions than others. At any time, we support the current release and the two before it (n-2).
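A support window like this can be checked mechanically. The sketch below assumes that n-2 means the latest release plus the two releases before it, and that versions are plain dotted numbers. The version strings are made up for illustration.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions sort numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def is_supported(deployed: str, released: list[str], window: int = 2) -> bool:
    """True if `deployed` is the latest release or within `window` releases of it."""
    ordered = sorted(released, key=parse_version)
    supported = ordered[-(window + 1):]
    return deployed in supported


# Hypothetical release history
releases = ["1.4.0", "1.5.0", "1.6.0", "1.7.0"]

two_behind = is_supported("1.5.0", releases)
three_behind = is_supported("1.4.0", releases)
```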

Onboarding time. A multi-tenant SaaS can provision a customer in minutes. Our deployment takes days, not minutes. There's AKS cluster provisioning, Entra ID configuration, Private Link setup, Helm deployment, and validation. We've automated most of it with Terraform modules, but it's not instant.

Cost per customer. Dedicated infrastructure per customer costs more than shared infrastructure amortized across hundreds of tenants. Customers pay for their own Azure resources. For regulated enterprises already running workloads on Azure, this is marginal. For a 10-person startup, it's expensive per seat.

We accepted these tradeoffs because the alternative, asking customers to trust us with the most sensitive description of their security posture, is an architectural contradiction we couldn't rationalize away.

The counter-argument: "SOC 2 and encryption solve this"

The strongest objection goes like this: "Responsible vendors have SOC 2 Type II. They encrypt at rest and in transit. They have incident response. They undergo regular pen tests. These controls adequately protect customer data without requiring per-tenant deployment."

This is true for most data types. Customer records in a CRM don't need per-tenant isolation. Transactional data in an ERP can live in multi-tenant infrastructure with adequate controls.

But compliance data is different in one specific way: it's the data that describes your defensive posture. A breach of your CRM exposes customer contact information. A breach of your compliance graph exposes where all your other defenses are weakest. The attacker learns not just that you have data, but exactly where to attack next.

SOC 2 reduces the probability of a breach at your vendor. It doesn't change the consequence. And for data where the consequence is "attacker gets a detailed map of our control gaps," the probability would need to be zero for the risk to be acceptable.

Zero probability doesn't exist in cybersecurity. But zero exposure does. If the data never leaves your tenant, a vendor breach doesn't expose it. That's not defense in depth. It's elimination of an attack surface.

What this means for your evaluation

If you're evaluating GRC platforms and data residency matters to you, ask four questions:

Where does the data physically reside? "In your region" isn't the same as "in your tenant." Shared infrastructure in your region still means shared infrastructure.
Who holds the encryption keys? Vendor-managed keys with customer-controlled access are not the same as customer-managed keys in your own Key Vault.
What happens at contract termination? If you need to "export" your data, it left your control at some point. If it never left, there's nothing to export.
What access does the vendor retain post-deployment? Standing access, even read-only, means your compliance data is accessible to personnel outside your organization.

The compliance graph contains the highest-fidelity map of your security posture that exists. Where it lives is a security architecture decision, not a procurement preference.

Want to see the deployment architecture? Book a demo and we'll walk through the tenant-native deployment model, Private Link configuration, and the sovereignty architecture in detail.