[EAAPL-SEC003] Model Isolation
Category: Security / Compute Isolation
Sub-category: Blast Radius Limitation
Version: 1.2
Maturity: Proven
Tags: isolation sandboxing network-segmentation process-isolation resource-quotas egress-control zero-trust
Regulatory Relevance: APRA CPS234 §21, EU AI Act Art. 9, ISO 27001 A.13.1, NIST AI RMF MANAGE 2.2
1. Executive Summary
Model Isolation defines the architectural pattern for constraining the execution environment of AI models — whether hosted internally or accessed via API — to limit the blast radius of a model compromise, data exfiltration attempt, or misconfigured model workload. It treats every model execution environment as a potential attack surface and applies defence-in-depth isolation controls at the network, process, storage, and identity layers.
For executives, the business case is straightforward: AI models represent a new class of compute workload with unique risk characteristics. Unlike a web server that executes deterministic code, an LLM or ML model can be manipulated by adversarial inputs to produce unexpected outputs, access data it should not access, or generate outputs that cause downstream harm. If a model execution environment is not properly isolated, a compromised model can become a pivot point for lateral movement across the enterprise network, access secrets or databases it should never reach, or exfiltrate data through its outputs.
This pattern is especially critical for organisations running on-premises model inference, fine-tuning workloads, or AI agents with tool access. It is equally relevant as a design requirement when evaluating model hosting vendors: the vendor's isolation architecture should be reviewed against this pattern before procurement decisions.
2. Problem Statement
Business Problem
Enterprises running AI model workloads face a novel risk: a model serving network requests is a stateful, long-running process that processes inputs from potentially adversarial sources. Unlike a stateless API endpoint, a model's behaviour can be influenced by its inputs in complex ways. If the model process runs with broad network access, file system access, or cloud IAM permissions, a successful adversarial input or model misconfiguration can lead to data exfiltration, lateral movement, or privilege escalation.
Technical Problem
AI model serving processes — whether Python (PyTorch, Transformers, vLLM), Go-based inference servers, or containerised model endpoints — typically run with more permissions than required:
- Network access to all subnets (allowing lateral movement if compromised).
- File system access to model weights, configuration, and sometimes application data.
- Cloud IAM roles with broad permissions (inherited from the host's instance profile).
- Outbound internet access (a model can exfiltrate data through HTTP calls in a tool-enabled agentic context).
- No resource quotas (a single runaway inference job can starve other workloads).
Symptoms
- Model serving processes running as root or with excessive IAM permissions.
- No network segmentation between model servers and sensitive data stores.
- Model weights stored on writable file systems (enabling weight poisoning).
- No egress controls on model serving infrastructure.
- Unrestricted resource usage by individual inference jobs.
Cost of Inaction
| Dimension | Impact |
|---|---|
| Security | Compromised model server enables lateral movement to databases, secrets stores, or cloud control plane |
| Data | Adversarial inputs causing model to exfiltrate data from its context through tool calls or logging |
| Regulatory | APRA CPS234 requires controls commensurate with information security risks — ungoverned model execution is a cited gap |
| Operational | Runaway inference jobs cause resource exhaustion and service degradation for other workloads |
| Financial | Unrestricted egress from model servers can lead to data exfiltration costs and regulatory fines |
3. Context
When to Apply
- Any on-premises or cloud-hosted AI model inference workload.
- AI agents with tool access (the consequences of a compromised agent are significantly higher than a passive model).
- Fine-tuning workloads that process proprietary or sensitive training data.
- Multi-tenant AI platforms where multiple teams or customers share model infrastructure.
- RAG systems where the model has access to a document retrieval layer.
When NOT to Apply
- External API-only model usage (Azure OpenAI, Anthropic Claude via API) — isolation is the provider's responsibility; however, this pattern informs the contractual and audit questions to ask the provider.
- Single-developer local experimentation environments.
Prerequisites
| Prerequisite | Detail |
|---|---|
| Container/VM infrastructure | Kubernetes, ECS, or VM-based deployment required for isolation controls |
| Network segmentation capability | VPC/VNet subnetting; security groups or network policies |
| Secrets management | Vault or cloud-native secrets manager for model credentials |
| IAM maturity | Ability to create fine-grained service accounts/roles for model workloads |
| Monitoring stack | Process-level and network-level monitoring for anomaly detection |
Industry Applicability
| Industry | Applicability | Key Driver |
|---|---|---|
| Financial Services | Critical | Data sovereignty; lateral movement risk to core banking systems |
| Healthcare | Critical | Patient data protection; PHI access controls |
| Government / Defence | Critical | Classified data segregation; adversarial threat model |
| Technology / SaaS | High | Multi-tenant isolation; intellectual property protection |
| Manufacturing / Industrial | High | OT/IT boundary protection; model access to operational data |
| Retail | Medium | PII protection; model access to customer data stores |
4. Architecture Overview
Model isolation is implemented as a set of concentric isolation boundaries — each layer reduces the blast radius of a compromise at the layer above. The architecture philosophy is: assume the model is compromised; design the environment so that a compromised model cannot reach anything of value.
Network Isolation
Model serving workloads are deployed in a dedicated, isolated network segment (VPC subnet, Kubernetes namespace with NetworkPolicy, or dedicated VLAN). This segment has no direct connectivity to:
- Core databases (customer data, financial records).
- Secrets stores (vault, secrets manager).
- Internal corporate networks.
- Internet (unless explicitly permitted by egress policy).
Inbound traffic reaches the model only from the AI Gateway (EAAPL-SEC001) via a specific port. All other inbound traffic is denied. Outbound traffic is restricted to: the model registry (to fetch weights), the telemetry endpoint (to ship metrics/logs), and explicitly allowlisted tool endpoints (for agentic use cases). A DNS sinkhole or DNS firewall prevents the model from resolving arbitrary internet hostnames.
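The inbound and outbound rules above could be expressed as a Kubernetes NetworkPolicy. The following is a minimal sketch that builds such a manifest as a Python dictionary and renders it as JSON. The `app: model-serving` label, the policy name, the port-443 egress rule, and the example CIDRs are illustrative assumptions, not values defined by this pattern. Standard NetworkPolicy matches IP blocks rather than hostnames, so hostname-based allowlisting would need a CNI-specific policy or an egress proxy.

```python
import json


def build_model_network_policy(namespace, gateway_namespace, serving_port, egress_cidrs):
    """Return a NetworkPolicy manifest (as a dict) for model serving pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "model-serving-isolation", "namespace": namespace},
        "spec": {
            # Hypothetical label used to select the model serving pods.
            "podSelector": {"matchLabels": {"app": "model-serving"}},
            # Declaring both types denies any traffic not explicitly allowed below.
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [{
                # Only pods in the AI Gateway namespace may connect, on one port.
                "from": [{"namespaceSelector": {"matchLabels": {
                    "kubernetes.io/metadata.name": gateway_namespace}}}],
                "ports": [{"protocol": "TCP", "port": serving_port}],
            }],
            # One rule per allowlisted destination (registry, telemetry, tools).
            "egress": [
                {"to": [{"ipBlock": {"cidr": cidr}}],
                 "ports": [{"protocol": "TCP", "port": 443}]}
                for cidr in egress_cidrs
            ],
        },
    }


def to_json(policy):
    """Render the manifest as JSON, which kubectl accepts alongside YAML."""
    return json.dumps(policy, indent=2)
```

DNS egress to the cluster resolver is deliberately omitted from this sketch and would need its own rule in a real deployment.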
Process Isolation
Model serving processes run with the minimum operating system permissions required:
- Non-root user (UID 1000+).
- Read-only root filesystem (model weights and configuration are mounted read-only).
- No `CAP_SYS_ADMIN` or other privileged Linux capabilities.
- Seccomp profile restricting available system calls to those required for inference.
- AppArmor or SELinux policy enforcing the process's access to file system paths.
- No access to the host network namespace (container network only).
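In Kubernetes, these settings are declared in each container's securityContext (runAsNonRoot: true, readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, and a custom seccomp profile) and enforced at admission by Pod Security Admission or a policy engine such as OPA Gatekeeper or Kyverno. PodSecurityPolicy, the older mechanism, was removed in Kubernetes 1.25.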
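As an illustration of how these settings might be checked before deployment, the sketch below audits one container entry from a pod manifest (loaded as a Python dictionary) and returns a list of findings. The function name and finding messages are hypothetical. The field names are those of the Kubernetes container `securityContext`. Requiring `capabilities.drop` to include `ALL` is an assumption that goes slightly beyond the list above.

```python
def audit_container_security(container):
    """Return a list of findings for one container spec taken from a pod manifest."""
    ctx = container.get("securityContext", {})
    findings = []
    if ctx.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot must be true")
    if ctx.get("readOnlyRootFilesystem") is not True:
        findings.append("readOnlyRootFilesystem must be true")
    # Must be explicitly false; a missing field is treated as a finding.
    if ctx.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation must be false")
    # Assumption: drop every capability rather than only CAP_SYS_ADMIN.
    if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
        findings.append("capabilities.drop should include ALL")
    # Localhost refers to a custom profile file on the node; Unconfined is rejected.
    if ctx.get("seccompProfile", {}).get("type") not in ("RuntimeDefault", "Localhost"):
        findings.append("seccompProfile must be RuntimeDefault or Localhost")
    return findings
```

A check like this could run in CI against rendered manifests, complementing rather than replacing admission-time enforcement.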
Resource Quotas
Runaway inference jobs cause denial of service. Resource quotas are enforced at:
- Container level: CPU requests/limits, memory requests/limits, GPU memory limits.
- Kubernetes namespace level: ResourceQuota objects limiting total CPU, memory, and GPU across all pods.
- Per-request level: token limits enforced by the serving layer (vLLM max_tokens, TGI max_new_tokens).
Resource quotas protect not only the model infrastructure but also adjacent workloads sharing the cluster.
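To make the container-level and namespace-level checks concrete, here is a small sketch that verifies every container declares CPU and memory limits and that the combined limits fit within a namespace quota. The function names are hypothetical, and the quantity parser handles only the `m` CPU suffix and the `Ki`/`Mi`/`Gi`/`Ti` memory suffixes, which is a simplification of the full Kubernetes quantity syntax.

```python
_MEM_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity):
    """Convert a CPU quantity such as '500m' or '4' to cores."""
    q = str(quantity)
    return float(q[:-1]) / 1000 if q.endswith("m") else float(q)


def parse_memory(quantity):
    """Convert a memory quantity such as '16Gi' or '256Mi' to bytes."""
    q = str(quantity)
    for suffix, factor in _MEM_UNITS.items():
        if q.endswith(suffix):
            return int(q[:-2]) * factor
    return int(q)  # plain byte count


def check_against_quota(containers, quota_cpu, quota_memory):
    """Return a list of problems; an empty list means the containers fit the quota."""
    problems = []
    total_cpu, total_mem = 0.0, 0
    for c in containers:
        limits = c.get("resources", {}).get("limits", {})
        if "cpu" not in limits or "memory" not in limits:
            problems.append(f"{c['name']}: missing cpu or memory limit")
            continue
        total_cpu += parse_cpu(limits["cpu"])
        total_mem += parse_memory(limits["memory"])
    if total_cpu > parse_cpu(quota_cpu):
        problems.append("total CPU limits exceed namespace quota")
    if total_mem > parse_memory(quota_memory):
        problems.append("total memory limits exceed namespace quota")
    return problems
```

In a cluster, the ResourceQuota and LimitRange objects perform this enforcement; a pre-deployment check of this kind only surfaces violations earlier.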
Secret Access Minimisation
The model serving process requires no secrets beyond what is needed to authenticate to the model registry and emit telemetry. It does not hold database credentials, user API keys, or service-to-service tokens. Any secrets required are:
- Injected at startup via a sidecar (Vault Agent, AWS Secrets Manager CSI driver) and expire after use.
- Never stored in environment variables (accessible to any process in the container).
- Never stored in the model's context window.
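One way the inference process might consume sidecar-injected secrets is to read them from a mounted file rather than from the environment. The sketch below assumes the sidecar writes each secret as a file under a directory such as `/vault/secrets`. That path, the function name, and the file-per-secret layout are assumptions for illustration.

```python
from pathlib import Path


def read_injected_secret(name, secrets_dir="/vault/secrets"):
    """Read a secret that a sidecar has written to a mounted (ideally tmpfs) directory.

    There is intentionally no fallback to environment variables: if the file
    is missing, the call raises FileNotFoundError and the caller fails closed.
    """
    base = Path(secrets_dir).resolve()
    target = (base / name).resolve()
    # Reject names that would escape the secrets directory (e.g. '../other').
    if base not in target.parents:
        raise ValueError(f"secret name escapes secrets directory: {name!r}")
    return target.read_text(encoding="utf-8").strip()
```

Reading the file on each use, instead of caching the value, lets the process pick up credentials that the sidecar has rotated.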
Read-Only Model Weights
Model weight files are mounted read-only. This prevents weight poisoning attacks (where an attacker with write access to the model's file system can modify weights to alter model behaviour). Weights are loaded from a signed, immutable artefact store (container registry with image signing, S3 with Object Lock) and verified at startup using a cryptographic hash.
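The startup hash check could look roughly like the following sketch, which streams the weight file through SHA-256 and compares the result with an expected digest. It assumes the expected digest comes from a trusted source such as a signed manifest. Signature verification itself (for example with cosign) is out of scope here, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file without loading it fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path, expected_hex):
    """Return True only if the file's digest matches the expected digest."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(actual, expected_hex.lower())
```

An init container could call `verify_weights` and exit non-zero on a mismatch, so that the pod never reaches the serving state with unverified weights.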
Egress Controls for Agentic Systems
For AI agents with tool access, egress is the highest-risk attack surface. A compromised agent can use legitimate tool calls to exfiltrate data. Egress controls implement:
- An explicit tool endpoint allowlist enforced at the network layer (not just application layer).
- Rate limits on tool call frequency to limit exfiltration bandwidth.
- Deep packet inspection on HTTP tool calls (payload inspection for data exfiltration patterns).
- Tool call audit logging to detect anomalous patterns.
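The allowlist and rate-limit controls above can be mirrored at the application layer as a second line of defence behind the network-layer enforcement. The sketch below shows a host allowlist check and a token-bucket rate limiter for tool calls. The class and function names, the HTTPS-only rule, and the injectable clock are illustrative assumptions.

```python
import time
from urllib.parse import urlsplit


def is_allowed_tool_url(url, allowed_hosts):
    """Allow only HTTPS URLs whose hostname is exactly on the allowlist."""
    parts = urlsplit(url)
    # parts.hostname ignores any userinfo, so 'https://good@evil.example' resolves to evil.example.
    return parts.scheme == "https" and parts.hostname in allowed_hosts


class TokenBucket:
    """Token-bucket limiter: refills at rate_per_sec up to capacity; one token per call."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Return True and consume a token if the call is within the rate limit."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A tool dispatcher would call both checks before each outbound request and write the decision to the audit log in either case.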
5. Architecture Diagram
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| Network Policy | Network Control | Restricts inbound/outbound traffic for model serving pods to allowlisted endpoints | Kubernetes NetworkPolicy, AWS Security Groups, Calico, Cilium | Critical |
| Pod Security Controls | Process Isolation | Enforces non-root execution, read-only filesystem, capability restrictions, seccomp profile | Kubernetes PodSecurity Admission, OPA Gatekeeper, Kyverno | Critical |
| Secret Sidecar | Secrets Management | Injects required secrets at startup; rotates and expires credentials; never persists secrets to disk | Vault Agent sidecar, AWS Secrets Manager CSI driver, Azure Key Vault CSI driver | Critical |
| Resource Quota | Resource Control | Limits CPU, memory, GPU consumption per pod and per namespace | Kubernetes ResourceQuota + LimitRange, Slurm (HPC), AWS Fargate resource limits | High |
| Model Registry | Artefact Store | Stores model weights in immutable, signed artefacts; enforces content-addressed retrieval | Docker Registry + Notary/cosign, S3 with Object Lock + SHA-256 manifest, MLflow Registry | High |
| Weight Integrity Verifier | Integrity Check | Verifies cryptographic hash of model weights at container startup before serving begins | Cosign, custom hash verification script in init container | High |
| Egress Controller | Network Control | Enforces outbound connection allowlist; optionally performs deep packet inspection on tool calls | Envoy egress proxy, Squid with allowlist, AWS VPC Endpoints, Cilium egress gateway | High |
| Log Sidecar | Observability | Collects process logs, system call traces, and network connection logs; forwards to SIEM | Fluentd, Fluent Bit, Datadog Agent, AWS FireLens | High |
| Seccomp Profile | OS Hardening | Restricts Linux system calls available to the inference process | Custom seccomp JSON profile, Docker default seccomp, Bottlerocket cgroups v2 | Medium |
| AppArmor / SELinux Policy | OS Hardening | Mandatory access control enforcing file system and capability boundaries | AppArmor (Ubuntu/Debian), SELinux (RHEL/Amazon Linux) | Medium |
7. Data Flow
Primary Flow
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | DevOps / MLOps | Publishes model weights to model registry with cosign signature | Signed, immutable model artefact with SHA-256 digest |
| 2 | Container Orchestrator | Schedules model serving pod in isolated namespace; applies NetworkPolicy and PodSecurity constraints | Pod scheduled on dedicated node pool with isolation labels |
| 3 | Init Container | Fetches model weights from registry; verifies cosign signature and SHA-256 hash | Verified weights mounted at read-only path |
| 4 | Secret Sidecar | Authenticates to Vault using Kubernetes Service Account token; retrieves telemetry credentials; injects into shared memory | Short-lived credentials available to inference process |
| 5 | Inference Process | Starts serving; accepts inbound requests only from AI Gateway over mTLS | Model ready to serve |
| 6 | AI Gateway | Forwards validated, sanitised request to model | Request received by inference process |
| 7 | Inference Process | Runs inference; generates response; for agentic workloads, makes tool calls only to allowlisted endpoints | Response or tool call output |
| 8 | Log Sidecar | Collects process logs, resource metrics, and network connection events; forwards to telemetry endpoint | Observability data available in SIEM/monitoring stack |
| 9 | Resource Quota Controller | Enforces CPU/memory/GPU limits; throttles or terminates if limits exceeded | Normal operation or throttle/OOMKill event |
Error Flow
| Error Condition | Behaviour | Alert |
|---|---|---|
| Weight integrity check fails | Pod fails to start; alert MLOps team | Critical: model weight integrity violation |
| Secret sidecar cannot authenticate to Vault | Pod fails to start; no credentials available | Critical: secret injection failure |
| Network policy violation attempt | Connection rejected by Kubernetes NetworkPolicy; logged by Cilium/Calico | Security: model attempting disallowed egress |
| Resource quota exceeded | Pod throttled (CPU) or OOMKilled (memory); pod restarted | Warning: resource exhaustion |
| Seccomp violation (blocked syscall) | Process terminated with SIGSYS; pod restarted | Security: unexpected syscall from model process |
8. Security Considerations
Authentication & Authorisation
- Model serving process has no inbound authentication to manage (auth handled by AI Gateway before request reaches model).
- Outbound authentication for tool calls uses short-lived tokens injected by the secret sidecar — never long-lived credentials embedded in configuration.
- Kubernetes Service Account tokens used for Vault authentication are bound to the specific pod's namespace and expire within 1 hour.
Secrets Management
- No secrets in environment variables (visible in container inspect, logs, crash dumps).
- No secrets in model weights or configuration files.
- Secret sidecar injects credentials into in-memory tmpfs only.
- All credential access logged by Vault for audit.
Data Classification
- Model execution environment is classified at the sensitivity level of the highest-classification data it will process. A model serving requests containing CONFIDENTIAL data must be isolated in a CONFIDENTIAL-tier network segment.
- Cross-classification boundary serving is prohibited — a model serving CONFIDENTIAL requests must not also serve PUBLIC requests (context window contamination risk).
Encryption
- Model weights encrypted at rest in registry (AES-256, provider-managed key) and in transit (TLS 1.3 from registry to pod).
- Network traffic between pods is encrypted using service-mesh mTLS (Istio/Linkerd) or WireGuard (Cilium).
- Scratch space (for intermediate computation) uses encrypted ephemeral volumes.
Auditability
- All egress connection attempts (successful and blocked) logged with source pod, destination IP/hostname, and timestamp.
- All secret access events logged by Vault.
- All resource quota violations logged for security review (may indicate attempted resource exhaustion attack).
OWASP LLM Top 10 Coverage
| OWASP LLM Risk | Model Isolation Mitigation | Coverage |
|---|---|---|
| LLM01: Prompt Injection | Isolation limits blast radius if injection succeeds; does not prevent injection itself | Low |
| LLM02: Insecure Output Handling | Egress controls limit exfiltration of data through tool calls in agentic contexts | High |
| LLM03: Training Data Poisoning | Read-only model weights + weight integrity verification prevent weight-level poisoning post-deployment | High |
| LLM04: Model Denial of Service | Resource quotas prevent runaway inference from affecting other workloads | High |
| LLM05: Supply Chain Vulnerabilities | Signed model artefacts and integrity verification at startup prevent supply chain compromise of model weights | High |
| LLM06: Sensitive Information Disclosure | Network isolation prevents direct access to data stores; context window data cannot reach external endpoints | High |
| LLM07: Insecure Plugin Design | Egress allowlist enforces tool endpoint restrictions at network layer | High |
| LLM08: Excessive Agency | Egress controls and tool allowlist limit the actions an agent can take | High |
| LLM09: Overreliance | Not applicable | None |
| LLM10: Model Theft | Read-only filesystem; encrypted weights at rest; no external weight exfiltration path | High |
9. Governance Considerations
Responsible AI
- Model isolation ensures that AI model behaviour is bounded — a model cannot access data beyond its authorised scope, which is a prerequisite for responsible deployment.
- Isolation boundaries must be documented in the AI system's risk register and reviewed as part of the AI impact assessment process.
Model Risk Management
- Isolation controls form a critical part of the model risk management framework: they limit the operational risk from a model behaving unexpectedly.
- Weight integrity verification is a model risk control — it ensures the deployed model is the validated, approved model.
Human Approval
- Changes to network policy (e.g., adding a new egress allowlist entry) require approval from Security Architecture and are subject to change management.
- Changes to seccomp profiles or AppArmor policies require security team review.
Governance Artefacts
| Artefact | Owner | Frequency | Purpose |
|---|---|---|---|
| Model Isolation Design Document | Security Architecture | With each new model deployment | Documents isolation controls for each model environment |
| Network Policy Audit Report | Security Operations | Quarterly | Verifies network policies are correctly applied and not bypassed |
| Weight Integrity Verification Log | MLOps | Continuous | Evidence that deployed models match approved artefacts |
| Egress Connection Log | Security Operations | Continuous review | Detects anomalous outbound connections from model serving |
| Resource Quota Review | Platform Engineering | Quarterly | Ensures quotas are appropriate for workload without over-provisioning risk |
10. Operational Considerations
Monitoring
- Process-level: CPU, memory, GPU utilisation per inference process; seccomp violation events.
- Network-level: egress connection attempts (blocked and permitted); inbound connection sources.
- Storage-level: write attempts to read-only filesystem (AppArmor/seccomp violation).
- Resource-level: quota utilisation trends; OOMKill events.
SLOs
| SLO | Target | Measurement |
|---|---|---|
| Weight integrity verification time | <30s at pod startup | Init container span |
| Secret injection latency | <5s at pod startup | Secret sidecar span |
| Network policy enforcement latency | <1ms per connection | Cilium/Calico metrics |
| Egress block alert latency | <60s from connection attempt to alert | Alert pipeline latency |
| Seccomp/AppArmor violation alert | <30s from violation to SIEM | SIEM ingestion latency |
Logging
- Structured JSON from all sidecars. Mandatory fields: `pod_name`, `namespace`, `event_type` (egress_attempt, seccomp_violation, oomkill, weight_integrity_check), `outcome` (allowed/blocked/failed), `timestamp_utc`.
- Network connection logs include `src_pod`, `dst_ip`, `dst_hostname`, `dst_port`, `protocol`, `bytes_transferred`, `outcome`.
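A log record carrying the mandatory fields could be produced as in the sketch below. The function name and the validation behaviour are assumptions. The field names and permitted values are those listed above.

```python
import json
from datetime import datetime, timezone

EVENT_TYPES = {"egress_attempt", "seccomp_violation", "oomkill", "weight_integrity_check"}
OUTCOMES = {"allowed", "blocked", "failed"}


def build_event(pod_name, namespace, event_type, outcome, **extra):
    """Return one structured JSON log line with the mandatory fields populated."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event_type: {event_type}")
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome}")
    # Start from optional fields (e.g. dst_ip, dst_port) so mandatory ones cannot be overwritten.
    record = dict(extra)
    record.update({
        "pod_name": pod_name,
        "namespace": namespace,
        "event_type": event_type,
        "outcome": outcome,
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    })
    return json.dumps(record, sort_keys=True)
```

For example, `build_event("vllm-0", "ai-models", "egress_attempt", "blocked", dst_hostname="example.invalid", dst_port=443)` yields a single JSON line suitable for forwarding to the SIEM.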
Incident Management
- Egress connection attempt to non-allowlisted destination → P1 security incident; immediate pod isolation; security operations investigation.
- Seccomp violation → P2; pod quarantined; security review of syscall.
- Weight integrity failure → P1; pod does not start; MLOps escalation; artefact store integrity investigation.
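The routing rules above could be encoded as a simple lookup keyed on the log record's event type and outcome. This is only a sketch: the mapping of each incident to a specific `outcome` value is an assumption, and the names are hypothetical.

```python
# (event_type, outcome) -> (priority, response), following the rules listed above.
INCIDENT_RULES = {
    ("egress_attempt", "blocked"): (
        "P1", "isolate pod immediately; security operations investigation"),
    ("seccomp_violation", "blocked"): (
        "P2", "quarantine pod; security review of syscall"),
    ("weight_integrity_check", "failed"): (
        "P1", "pod does not start; MLOps escalation; artefact store integrity investigation"),
}


def triage(event):
    """Return (priority, response) for a log record, or None if no incident rule applies."""
    key = (event.get("event_type"), event.get("outcome"))
    return INCIDENT_RULES.get(key)
```

Events with no matching rule, such as a permitted egress connection, return `None` and remain in the log for routine review.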
DR
| Scenario | RTO | Recovery |
|---|---|---|
| Pod OOMKilled | 30s | Kubernetes restarts pod; alert to platform team |
| Model registry unavailable | 5min (new pods cannot start; existing pods continue) | Cached weights in running pods; restore registry |
| Vault unavailable | 2min (pods can't start or rotate secrets) | Vault HA cluster; emergency credential cache in CSI driver |
| Network policy misconfiguration | 5min | Rollback network policy to last known-good version via GitOps |
11. Cost Considerations
Cost Drivers
| Cost Driver | Description | Relative Impact |
|---|---|---|
| Dedicated node pool | Model workloads often require GPU nodes; isolation to dedicated pools prevents bin packing with other workloads | High |
| Egress proxy | Envoy or Squid egress proxy adds compute cost | Low |
| Secret sidecar | Vault Agent or CSI driver adds memory overhead per pod | Low |
| Security scanning | Image scanning, seccomp profile generation, AppArmor policy authoring engineering time | Medium |
| GPU underutilisation | Isolation prevents sharing GPU nodes with non-model workloads | Medium–High |
Optimisations
- Use node affinity and taints to co-locate multiple isolated model workloads on the same GPU node while maintaining pod-level isolation — share the node's GPU, not the network or filesystem.
- Implement GPU partitioning with MIG (available on NVIDIA A100 and H100) so that multiple isolated pods can share a single GPU with hardware-level memory isolation. Plain time-slicing does not isolate GPU memory between pods.
Indicative Cost Range
| Scale | Monthly AWS Additional Cost (USD) | Notes |
|---|---|---|
| Small (1–2 model endpoints) | $200–$600 | Dedicated EKS node group, NAT Gateway for egress control |
| Medium (5–20 model endpoints) | $1,000–$4,000 | Dedicated node pools; Cilium enterprise for egress; additional monitoring |
| Large (50+ model endpoints) | $8,000–$25,000 | Multi-tenant GPU cluster with fine-grained isolation; dedicated security tooling |
12. Trade-Off Analysis
Option Comparison
| Option | Description | Pros | Cons | Best For |
|---|---|---|---|---|
| A: Namespace-only isolation | Separate Kubernetes namespace with NetworkPolicy; no process-level hardening | Low operational overhead; fast to implement | Process escapes still possible; shared kernel; no egress DPI | Dev/staging environments; low-sensitivity workloads |
| B: Full pod hardening (this pattern) | Namespace + process isolation (seccomp, AppArmor, non-root, read-only FS) + egress control | Comprehensive isolation; industry-standard | Requires seccomp profile authoring; AppArmor policy management; operational overhead | Production AI workloads; regulated environments |
| C: VM-level isolation | Each model in a dedicated VM (or Kata Containers for VM-level isolation in Kubernetes) | Kernel isolation; strongest blast radius containment | High cost; poor bin packing; slow start time | Highest-risk workloads; multi-tenant with hostile tenants |
| D: Managed service isolation | Use cloud-managed model serving (SageMaker, Azure ML, Vertex AI) and accept provider isolation | Low operational burden; provider SLAs | Vendor lock-in; less control; data residency constraints; can't customise seccomp | Organisations without Kubernetes expertise |
Architectural Tensions
| Tension | Trade-Off |
|---|---|
| Isolation vs Operability | Strict seccomp profiles and read-only filesystems can break inference libraries that write temp files. Resolution: profile the inference process's system call requirements before writing the seccomp profile; use tmpfs for scratch space. |
| Performance vs Security | Network policy enforcement (Cilium eBPF) and seccomp add per-request overhead. At high inference volumes, this can be measurable. Resolution: eBPF-based enforcement (Cilium) is near-zero-overhead; seccomp adds <1% CPU overhead for inference workloads. |
| GPU Sharing vs Isolation | GPU memory isolation requires MIG (A100/H100 only); older GPUs share GPU memory between processes. Resolution: use MIG for production; accept soft isolation (process-level) for other GPU types. |
13. Failure Modes
| Failure | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| Seccomp profile too restrictive (breaks inference library) | Medium | High (model unavailable) | Pod CrashLoopBackOff; SIGSYS in logs | Audit required syscalls; update seccomp profile; redeploy |
| Network policy rule error (legitimate traffic blocked) | Medium | High (model unreachable from gateway) | 503 errors from gateway → model; network connectivity check | Roll back network policy; investigate and fix |
| Weight integrity check false negative | Very Low | Critical | Post-deployment model behaviour anomaly detection | Forensic analysis of model registry; rolling restart from clean artefact |
| Secret sidecar certificate rotation failure | Low | High (credentials expire; model cannot authenticate for tool calls) | Secret expiry metric approaching zero | Sidecar restart; Vault token renewal |
| GPU memory isolation breach (non-MIG GPU) | Low | Medium (process memory accessible between pods) | Process-level memory boundary monitoring | Migrate to MIG-capable hardware; temporary: single-tenant GPU nodes |
14. Regulatory Considerations
| Regulation | Requirement | Model Isolation Implementation |
|---|---|---|
| APRA CPS234 §21 | Information security controls commensurate with sensitivity | Network and process isolation directly address information asset protection |
| APRA CPS234 §23 | Capability to detect and respond to information security incidents | Egress logging and violation alerting implement incident detection for model environments |
| EU AI Act Art. 9 (Risk Management) | Implement technical and organisational measures to manage AI risks | Model isolation is a core technical risk management measure for on-premises AI workloads |
| ISO 27001 A.13.1 (Network Security) | Manage and control networks to protect information systems | Network policy and egress control implement this requirement for AI workloads |
| ISO 27001 A.12.6 (Technical Vulnerability Management) | Prevent exploitation of technical vulnerabilities | Read-only filesystem and weight integrity verification address model-layer vulnerability management |
| NIST AI RMF MANAGE 2.2 | Mechanisms exist to prevent improper access | Isolation controls implement access prevention at network, process, and storage layers |
15. Reference Implementations
AWS
| Component | AWS Service |
|---|---|
| Container isolation | EKS with Bottlerocket OS (seccomp by default); OPA Gatekeeper for policy |
| Network isolation | VPC subnets + Security Groups; EKS NetworkPolicy via Cilium or Calico |
| Egress control | AWS Network Firewall; VPC Endpoints for AWS services (no internet path) |
| Process isolation | Bottlerocket OS seccomp profiles; AWS Fargate (VM-level isolation) |
| Secret injection | AWS Secrets Manager CSI driver; IAM Roles for Service Accounts (IRSA) |
| Weight storage | ECR (OCI artefacts) with image signing (cosign); S3 with Object Lock |
| Resource quotas | EKS ResourceQuota + LimitRange; NVIDIA GPU Operator for GPU quotas |
Azure
| Component | Azure Service |
|---|---|
| Container isolation | AKS with Azure Linux (CBL-Mariner); Azure Policy for pod security |
| Network isolation | AKS NetworkPolicy (Azure CNI or Calico); private AKS cluster |
| Egress control | Azure Firewall with FQDN allow rules |
| Secret injection | Azure Key Vault CSI driver; Workload Identity |
| Weight storage | Azure Container Registry with Notation signing; Azure Blob with immutability |
| Resource quotas | AKS ResourceQuota; Node Taints for GPU isolation |
GCP
| Component | GCP Service |
|---|---|
| Container isolation | GKE Autopilot (enforces security best practices by default); Workload Identity |
| Network isolation | GKE NetworkPolicy; Private GKE cluster; VPC Service Controls |
| Egress control | Cloud NGFW firewall policy rules with FQDN objects; Secure Web Proxy for HTTP(S) egress |
| Secret injection | Secret Manager CSI driver; Workload Identity Federation |
| Weight storage | Artifact Registry with Binary Authorization |
On-Premises
| Component | Technology |
|---|---|
| Container isolation | Kubernetes with OPA Gatekeeper; custom seccomp profiles per model workload |
| Network isolation | Calico or Cilium NetworkPolicy; dedicated VLAN per model tier |
| Egress control | Envoy egress proxy with explicit upstream allowlist |
| Secret injection | HashiCorp Vault Agent sidecar injector |
| Weight storage | Harbor registry with Notary signing; Ceph S3 with WORM policies |
| GPU isolation | NVIDIA MIG on A100; one MIG instance per isolated model workload |
16. Related Patterns
| Pattern | ID | Relationship |
|---|---|---|
| AI Gateway | EAAPL-SEC001 | Gateway is the only permitted inbound path to the model; isolation enforces this at network layer |
| Secure Tool Invocation | EAAPL-SEC004 | Egress controls in model isolation are the network-layer enforcement of tool invocation policy |
| Zero-Trust AI Pipeline | EAAPL-SEC007 | Model isolation implements the compute-layer zero-trust controls within the broader pipeline |
| Secrets Management for AI | EAAPL-SEC008 | Secret injection sidecar pattern depends on SEC008 for the vault infrastructure |
| AI Telemetry | EAAPL-OBS001 | Log sidecar pattern provides the telemetry pipeline for model execution events |
| Adversarial Input Defence | EAAPL-SEC010 | Isolation limits blast radius of adversarial inputs that succeed in manipulating model behaviour |
17. Maturity Assessment
Overall Maturity: Proven
| Dimension | Score (1–5) | Rationale |
|---|---|---|
| Pattern definition clarity | 4 | Well-defined; some GPU-specific isolation guidance still evolving |
| Technology availability | 4 | Kubernetes + Cilium + OPA provides complete implementation; GPU MIG requires specific hardware |
| Industry adoption | 3 | Applied in security-mature organisations; underimplemented in most enterprises deploying AI |
| Operational tooling | 4 | Strong Kubernetes security tooling ecosystem |
| Regulatory alignment | 4 | Directly addresses CPS234, EU AI Act Art. 9 requirements |
| Community knowledge | 4 | Kubernetes security community well-documented; AI-specific extensions are newer |
18. Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2024-03-01 | Security Architecture Team | Initial pattern definition |
| 1.1 | 2024-07-15 | Security Architecture Team | Added GPU MIG isolation guidance; updated OWASP LLM mapping |
| 1.2 | 2025-02-01 | Security Architecture Team | Added weight integrity verification; updated regulatory mapping for EU AI Act |