EAAPL: Enterprise AI Architecture Pattern Library


AI Security · APRA CPS234 · EU AI Act · Field-tested in AU

[EAAPL-SEC007] Zero-Trust AI Pipeline

Category: Security / Architecture
Sub-category: Pipeline Security Model
Version: 1.1
Maturity: Proven
Tags: zero-trust, mtls, identity-verification, micro-segmentation, jit-access, continuous-verification, pipeline-security
Regulatory Relevance: APRA CPS234, NIST SP 800-207 (Zero Trust Architecture), EU AI Act Art. 9, ISO 27001 A.9, ISO 42001 §6.1


1. Executive Summary

The Zero-Trust AI Pipeline applies the "never trust, always verify" security architecture principle to the complete AI request pipeline — from the initial client request through gateway, prompt processing, model inference, tool calls, and response delivery. Traditional perimeter security assumes that traffic inside the enterprise network is trustworthy. Zero-trust rejects this assumption: every component, every service call, and every data exchange must be authenticated, authorised, and verified regardless of network location.

This matters specifically for AI pipelines because they involve multiple service-to-service calls (gateway → prompt filter → model → tool → output filter), often span cloud and on-premises boundaries, and process data that, if intercepted at any hop, can represent a significant security breach. An AI pipeline that trusts its internal service calls is only as secure as its least-secured component — and AI pipelines have more components than most.

The pattern establishes five pillars for a zero-trust AI pipeline: identity verification on every service call (mutual TLS + JWT), micro-segmentation of pipeline stages, just-in-time access for AI workloads (no standing permissions), continuous verification of pipeline integrity, and comprehensive audit of every inter-service exchange. Organisations implementing this pattern reduce their exposure to lateral movement and gain a clear audit trail that satisfies both security operations and regulatory requirements.


2. Problem Statement

Business Problem

Enterprise AI pipelines are complex distributed systems with many inter-service dependencies. Organisations that have invested in perimeter security (firewalls, VPN) often assume that traffic between internal services is safe — a "flat network" model. For AI pipelines, this creates an unacceptable risk: if any single component in the pipeline is compromised (through a vulnerability, supply chain attack, or insider threat), an attacker has implicit trust across the pipeline and can intercept or inject data at any point.

Technical Problem

AI pipelines typically have multiple service-to-service calls that are authenticated weakly or not at all:

  • Gateway to prompt firewall: internal HTTP call with no authentication.
  • Model inference to tool endpoints: service account with broad permissions.
  • RAG retrieval service to vector database: network-only access control.
  • Output filter to model: no authentication (same cluster).
  • Pipeline services sharing a single service account credential.

None of these connections verify that the caller is who it claims to be, or that the caller's permissions are appropriate for the specific request being made.

Symptoms

  • Internal service calls using API keys stored in environment variables.
  • Microservices sharing a single omnibus service account.
  • No mTLS between AI pipeline components.
  • Service-to-service calls not logged separately from client-to-gateway calls.
  • Broad IAM permissions on model serving infrastructure.
  • No mechanism to detect if a pipeline component has been tampered with.

Cost of Inaction

| Dimension | Impact |
|---|---|
| Security | Pipeline component compromise enables lateral movement to all connected services |
| Data | Data exfiltration at any pipeline hop without detection |
| Regulatory | Cannot demonstrate end-to-end pipeline security to APRA / EU AI Act auditors |
| Integrity | Pipeline tampering (modifying prompts or responses in transit) undetectable |
| Accountability | Cannot attribute a security incident to a specific pipeline component |

3. Context

When to Apply

  • Multi-component AI pipelines with more than one service-to-service call.
  • AI pipelines that cross network boundaries (on-premises to cloud, VPC to VPC).
  • Organisations in regulated industries where audit of data flow is required.
  • AI pipelines processing data at CONFIDENTIAL classification or above.
  • Organisations with mature identity infrastructure (existing PKI or SPIFFE implementation).

When NOT to Apply

  • Single-process AI applications (model and application in the same process/container) — zero-trust is a network architecture pattern.
  • Early-stage development environments where operational overhead outweighs security benefits.
  • Extremely latency-sensitive inference paths where mTLS handshake overhead is unacceptable (note: with session resumption, mTLS adds <1ms per request after the initial handshake).

Prerequisites

| Prerequisite | Detail |
|---|---|
| PKI / SPIFFE | Certificate authority or SPIFFE/SPIRE deployment for workload identity issuance |
| Service Mesh or mTLS Sidecar | Istio, Linkerd, Consul Connect, or manual mTLS implementation |
| Short-lived Credential Infrastructure | Vault or cloud-native IAM for JIT access credential issuance |
| Policy Engine | OPA or equivalent for authorisation policy evaluation |
| Distributed Tracing | OpenTelemetry for end-to-end pipeline audit |

Industry Applicability

| Industry | Applicability | Key Driver |
|---|---|---|
| Financial Services | Critical | APRA CPS234; data-in-transit integrity requirements |
| Healthcare | Critical | PHI pipeline integrity; cross-boundary data flows |
| Government / Defence | Critical | Classified data handling; adversarial threat model |
| Technology / Cloud | High | Multi-tenant AI platforms; cross-boundary SaaS pipelines |
| Retail | Medium | Customer data pipeline security |

4. Architecture Overview

The zero-trust AI pipeline replaces all implicit trust relationships between pipeline components with explicit, verified, and logged trust relationships. Every component operates as if it is on a hostile network — because in a zero-trust model, it is.

Pillar 1: Workload Identity (SPIFFE/SPIRE)

Every pipeline component (gateway, prompt firewall, model server, RAG retriever, tool adapter, output filter) is assigned a cryptographic workload identity via SPIFFE (Secure Production Identity Framework for Everyone). Each workload receives a short-lived X.509 certificate (SVID) issued by SPIRE, valid for 1 hour and automatically rotated. The SVID encodes the workload's identity (e.g., spiffe://enterprise.ai/gateway, spiffe://enterprise.ai/model-server). A component cannot obtain another component's identity, because an SVID is issued only after the SPIRE agent on the workload's own compute instance has attested it.
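As a rough illustration of what a workload identity carries, the sketch below parses a SPIFFE ID into its trust domain and path and applies the one-hour validity window described above. It uses only the Python standard library and does not call SPIRE; the `WorkloadIdentity` class and `parse_spiffe_id` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from urllib.parse import urlparse

SVID_TTL = timedelta(hours=1)  # validity period stated in this pattern


@dataclass(frozen=True)
class WorkloadIdentity:
    """Hypothetical in-memory view of an SVID's identity and issue time."""
    trust_domain: str   # e.g. "enterprise.ai"
    path: str           # e.g. "/gateway"
    issued_at: datetime

    @property
    def spiffe_id(self) -> str:
        return f"spiffe://{self.trust_domain}{self.path}"

    def is_valid(self, now: datetime) -> bool:
        # Usable only inside the one-hour window; rotation must happen before expiry.
        return self.issued_at <= now < self.issued_at + SVID_TTL


def parse_spiffe_id(uri: str, issued_at: datetime) -> WorkloadIdentity:
    """Split a SPIFFE ID into trust domain and workload path, rejecting other URIs."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid SPIFFE workload ID: {uri!r}")
    if parsed.query or parsed.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    return WorkloadIdentity(parsed.netloc, parsed.path, issued_at)
```

In a real deployment the certificate's own notBefore/notAfter fields are authoritative; the fixed TTL here only mirrors the figure quoted in the text.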

Pillar 2: Mutual TLS on Every Service Call

Every service-to-service call in the pipeline uses mutual TLS (mTLS): both the caller and the callee present their SPIFFE SVIDs. This provides cryptographic authentication at both ends of every connection. A compromised component cannot impersonate another component, and an on-path attacker without a valid SVID private key cannot complete the handshake or read the traffic. Service mesh sidecars (Envoy in Istio, or the Linkerd proxy) handle mTLS transparently: application code makes plain HTTP calls; the sidecar upgrades to mTLS and validates peer identity.
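For teams implementing mTLS manually rather than through a mesh, the following sketch shows the server-side half using Python's standard `ssl` module: require a client certificate, pin TLS 1.3, and read the caller's SPIFFE ID from the URI subject alternative name. The function names and the file-path parameters are assumptions for illustration; a sidecar normally does all of this on the application's behalf.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    cert_file: Optional[str] = None,     # hypothetical path to this workload's SVID
    key_file: Optional[str] = None,      # hypothetical path to the SVID private key
    trust_bundle: Optional[str] = None,  # hypothetical path to the trust bundle
) -> ssl.SSLContext:
    """Server context that refuses any caller without a verifiable client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    ctx.verify_mode = ssl.CERT_REQUIRED  # this is what makes the TLS mutual
    if trust_bundle:
        ctx.load_verify_locations(cafile=trust_bundle)
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def peer_spiffe_id(peer_cert: dict) -> Optional[str]:
    """Return the SPIFFE ID from a dict shaped like ssl.SSLSocket.getpeercert()."""
    for kind, value in peer_cert.get("subjectAltName", ()):
        if kind == "URI" and value.startswith("spiffe://"):
            return value
    return None


def is_expected_peer(peer_cert: dict, expected_id: str) -> bool:
    """Authenticate the caller by exact SPIFFE ID match, not by network location."""
    return peer_spiffe_id(peer_cert) == expected_id
```

The callee would call `is_expected_peer(conn.getpeercert(), "spiffe://enterprise.ai/gateway")` after the handshake and drop the connection on a mismatch.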

Pillar 3: Per-Request Authorisation

Authentication (you are who you claim to be) is necessary but not sufficient. Authorisation (you are permitted to make this specific request) must also be verified on every request. OPA policies evaluate: is the calling component permitted to call this component? Is the data classification of the request within the permitted range for this component pair? Is the request rate within configured limits?

Critically, authorisation is scoped to the request, not the connection. A gateway instance that is permitted to call the prompt firewall is not automatically permitted to call the model server directly — each hop's authorisation is evaluated independently.
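The three policy questions above can be sketched as a single decision function. This is a Python stand-in for what would normally be an OPA policy, not OPA itself; the component names, classification levels, and rate limit below are illustrative assumptions rather than values from this pattern.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical policy data: permitted caller -> callee pairs, each with the
# highest data classification that pair may carry.
CLASSIFICATION_RANK = {"PUBLIC": 0, "INTERNAL": 1, "CONFIDENTIAL": 2}
ALLOWED_CALLS = {
    ("gateway", "prompt-firewall"): "CONFIDENTIAL",
    ("prompt-firewall", "model-server"): "CONFIDENTIAL",
    ("model-server", "output-filter"): "CONFIDENTIAL",
    ("model-server", "tool-adapter"): "INTERNAL",
}
MAX_REQUESTS_PER_MINUTE = 600  # illustrative limit


@dataclass(frozen=True)
class Request:
    caller: str
    callee: str
    classification: str
    caller_requests_last_minute: int


def authorise(req: Request) -> Tuple[bool, str]:
    """Evaluate one request; every hop is checked independently."""
    ceiling = ALLOWED_CALLS.get((req.caller, req.callee))
    if ceiling is None:
        return False, "caller not permitted to call callee"
    rank = CLASSIFICATION_RANK.get(req.classification)
    if rank is None or rank > CLASSIFICATION_RANK[ceiling]:
        return False, "classification outside permitted range"
    if req.caller_requests_last_minute >= MAX_REQUESTS_PER_MINUTE:
        return False, "rate limit exceeded"
    return True, "allowed"
```

Because the gateway-to-model-server pair is absent from `ALLOWED_CALLS`, a gateway that is allowed to reach the prompt firewall is still denied when it tries to call the model server directly.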

Pillar 4: Just-in-Time Access

No pipeline component holds standing permissions to the resources it needs. Instead:

  • Model weights are fetched from the registry at startup using a time-limited Vault lease.
  • Tool call credentials are generated at the moment of tool invocation, scoped to the specific tool and operation.
  • Database read credentials for RAG retrieval are generated per-session, scoped to the retrieval tenant's data.
  • Cloud IAM role assumptions are time-limited (1-hour maximum).

JIT access sharply limits the value of stolen credentials: they expire within minutes to an hour, which shrinks the attack window.
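A minimal sketch of the just-in-time idea follows, with an in-memory issuer standing in for Vault or cloud IAM. The `Lease` type, `issue_lease` function, and scope strings are hypothetical; the one-hour ceiling matches the maximum stated above.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_TTL = timedelta(hours=1)  # upper bound stated in this pattern


@dataclass(frozen=True)
class Lease:
    """Hypothetical short-lived credential bound to one workload and one scope."""
    token: str
    workload: str
    scope: str          # e.g. "tool:crm:read" (illustrative scope format)
    expires_at: datetime

    def permits(self, scope: str, now: datetime) -> bool:
        # Valid only for the exact scope it was issued for, and only until expiry.
        return scope == self.scope and now < self.expires_at


def issue_lease(workload: str, scope: str, ttl: timedelta, now: datetime) -> Lease:
    """Issue a fresh credential at the moment of use; nothing is held on standby."""
    if ttl <= timedelta(0) or ttl > MAX_TTL:
        raise ValueError("TTL must be positive and at most one hour")
    return Lease(secrets.token_urlsafe(32), workload, scope, now + ttl)
```

A tool adapter would request a lease immediately before each tool invocation and discard it afterwards, so a leaked token is useless for any other tool and for any time after `expires_at`.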

Pillar 5: Continuous Verification and Pipeline Integrity

Zero-trust is not a static configuration — it requires continuous verification. This includes:

  • Periodic rotation of SVIDs (every 1 hour) and re-authentication.
  • Runtime integrity checks: each pipeline component signs its outputs; downstream components verify the signature before processing. If a prompt firewall output arrives at the model server without a valid signature from the firewall, the request is rejected.
  • Anomaly detection on pipeline traffic patterns: unexpected call volumes, unusual source identities, or calls between non-adjacent components trigger alerts.
  • Binary authorisation: container images in the pipeline must be signed and verified against an approved image registry before deployment.
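The runtime integrity check in the list above (sign at one stage, verify at the next) might look like the sketch below. It assumes an HMAC-SHA256 key shared between adjacent stages; how that key is derived or distributed is not specified here, and the envelope layout and function names are illustrative. With a shared key, any holder can both sign and verify, so SVID-based asymmetric signing would give stronger per-component attribution.

```python
import hashlib
import hmac
import json


def _canonical(stage: str, payload: dict) -> bytes:
    # Deterministic serialisation so signer and verifier hash identical bytes.
    return json.dumps({"stage": stage, "payload": payload}, sort_keys=True).encode()


def sign_output(stage: str, payload: dict, key: bytes) -> dict:
    """Wrap a stage's output in an envelope carrying its HMAC-SHA256 signature."""
    sig = hmac.new(key, _canonical(stage, payload), hashlib.sha256).hexdigest()
    return {"stage": stage, "payload": payload, "signature": sig}


def verify_output(envelope: dict, expected_stage: str, key: bytes) -> bool:
    """Reject envelopes that are unsigned, tampered with, or from the wrong stage."""
    signature = envelope.get("signature")
    if envelope.get("stage") != expected_stage or not isinstance(signature, str):
        return False
    expected = hmac.new(
        key, _canonical(expected_stage, envelope.get("payload")), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature)  # constant-time comparison
```

The model server would call `verify_output(envelope, "prompt-firewall", key)` and reject the request when it returns `False`, which corresponds to the rejection case described in the list.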

5. Architecture Diagram

flowchart TD
    subgraph Identity["Trust Infrastructure"]
        A[SPIRE Identity Server]
        B[OPA Policy Engine]
        C[Vault JIT Credentials]
    end
    subgraph Pipeline["AI Pipeline"]
        D[AI Gateway]
        E[Prompt Firewall]
        F[Model Server]
        G[Output Filter]
    end
    subgraph ClientZ["Client"]
        H[Application]
    end
    H -->|mTLS + JWT| D --> E --> F --> G --> H
    A -.->|SVID per workload| D
    A -.->|SVID per workload| F
    B -.->|per-request authz| D
    C -.->|JIT credentials| F
    style H fill:#dbeafe,stroke:#3b82f6
    style D fill:#f0fdf4,stroke:#22c55e
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#fef9c3,stroke:#eab308
    style G fill:#f0fdf4,stroke:#22c55e
    style A fill:#fef9c3,stroke:#eab308
    style B fill:#fef9c3,stroke:#eab308
    style C fill:#fef9c3,stroke:#eab308

6. Components

| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| SPIRE Server | Identity | Issues SVID certificates to attested workloads; rotates SVIDs every hour | SPIFFE/SPIRE OSS, Istio CA, HashiCorp Vault PKI | Critical |
| SPIRE Agent | Identity | Runs on each compute node; attests workloads; manages SVID lifecycle | SPIFFE/SPIRE Agent | Critical |
| Service Mesh Sidecar | mTLS | Transparent mTLS proxy for all inter-service communication | Envoy (Istio), Linkerd proxy, Consul Connect | Critical |
| OPA Policy Engine | Authorisation | Per-request authorisation policy evaluation for all inter-service calls | Open Policy Agent, Cedar | Critical |
| Vault (JIT Credentials) | Secrets | Dynamic, time-limited credential issuance for pipeline components | HashiCorp Vault, AWS IAM roles with short TTL, Azure Managed Identity | Critical |
| Pipeline Output Signer | Integrity | Signs outputs at each pipeline stage for downstream verification | Custom HMAC signer, SPIFFE SVID-based signing | High |
| Binary Authorisation | Supply Chain | Verifies container image signatures before deployment | Google Binary Authorization, AWS Signer, Sigstore Cosign | High |
| Anomaly Detector | Monitoring | Detects unusual inter-component call patterns | Datadog APM, Elastic SIEM, custom OTel-based detector | High |
| Distributed Tracing | Audit | End-to-end trace of every request through all pipeline components | OpenTelemetry Collector, Jaeger, AWS X-Ray | High |

7. Data Flow

Primary Flow

| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Application | Sends request to AI Gateway with mTLS client cert + JWT | Authenticated connection at gateway |
| 2 | Gateway | Validates client identity; evaluates OPA policy; generates internal request context with trace_id | Authorised request with trace_id |
| 3 | Gateway → Prompt Firewall | mTLS call using SPIFFE SVID; OPA validates gateway→firewall call authorisation | Authenticated, authorised call |
| 4 | Prompt Firewall | Processes request; signs output with SVID-derived HMAC | Signed firewall result |
| 5 | Prompt Firewall → Input Sanitiser | mTLS call; sanitiser verifies firewall output signature | Verified, authenticated handoff |
| 6 | Input Sanitiser → Model Server | mTLS call with signed sanitised prompt | Authenticated, signed prompt at model server |
| 7 | Model Server | Generates response; for tool calls, requests JIT credential from Vault per tool | Raw model response |
| 8 | Model Server → Output Filter | mTLS call with signed model output | Authenticated handoff to output filter |
| 9 | Output Filter → Gateway | mTLS response with signed filtered output | Verified response returned to gateway |
| 10 | Gateway | Returns response to application; full trace in distributed tracing system | End-to-end traced, verified response |
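To make the audit aspect of the primary flow concrete, the sketch below threads one trace_id through the hops listed in the table and records each caller-callee pair, refusing any hop that is not part of the expected sequence. The lowercase component slugs and the `TraceContext` class are assumptions for illustration; a production system would emit OpenTelemetry spans rather than keep an in-memory list.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Tuple

# Hops taken from the primary flow table (steps 1, 3, 5, 6, 8, 9, 10).
EXPECTED_HOPS: List[Tuple[str, str]] = [
    ("application", "gateway"),
    ("gateway", "prompt-firewall"),
    ("prompt-firewall", "input-sanitiser"),
    ("input-sanitiser", "model-server"),
    ("model-server", "output-filter"),
    ("output-filter", "gateway"),
    ("gateway", "application"),
]


@dataclass
class TraceContext:
    """Hypothetical per-request context created by the gateway in step 2."""
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    hops: List[Tuple[str, str]] = field(default_factory=list)

    def record_hop(self, caller: str, callee: str) -> None:
        # A call between non-adjacent components is refused, not just logged.
        if (caller, callee) not in EXPECTED_HOPS:
            raise PermissionError(f"unexpected hop {caller} -> {callee}")
        self.hops.append((caller, callee))

    def is_complete(self) -> bool:
        # The audit trail is complete only if every hop occurred, in order.
        return self.hops == EXPECTED_HOPS
```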

Error Flow

| Error | Behaviour | Alert |
|---|---|---|
| SVID expired (component fails to renew) | Downstream rejects mTLS connection; component isolated | Critical: SVID renewal failure |
| OPA policy denies inter-component call | Request rejected; component attempts logged | Security: unexpected inter-component call |
| Pipeline output signature verification fails | Downstream rejects processed request | Security: pipeline integrity violation (possible MITM) |
| JIT credential request rejected by Vault | Component cannot proceed with operation; error returned | Critical: JIT credential failure |
| Binary authorisation fails | Container deployment blocked | Security: unsigned container in pipeline |

8. Security Considerations

Authentication & Authorisation

  • Every component has a unique, cryptographic workload identity (SPIFFE SVID).
  • Every inter-component call authenticated via mTLS using SPIFFE SVIDs.
  • Every call authorised by OPA with policies that specify exactly which components may call which other components.

Secrets Management

  • No standing credentials. All credentials JIT-issued by Vault with minimum TTL.
  • SVID private keys never leave the compute instance.

Data Classification

  • OPA policies can enforce classification-based routing: a CONFIDENTIAL request may only traverse components cleared for CONFIDENTIAL processing.

Encryption

  • All inter-service communication: TLS 1.3 via mTLS.
  • TLS session keys rotated with SVID rotation (every hour maximum).
  • At-rest encryption on all pipeline state.

OWASP LLM Top 10 Coverage

| OWASP LLM Risk | Zero-Trust Pipeline Mitigation | Coverage |
|---|---|---|
| LLM01: Prompt Injection | Pipeline integrity signatures detect prompt tampering between stages | Medium |
| LLM02: Insecure Output Handling | Component isolation limits blast radius of output handling vulnerability | Medium |
| LLM03: Training Data Poisoning | Binary authorisation prevents tampered pipeline components from being deployed | High |
| LLM04: Model Denial of Service | Per-component authorisation enables request quota enforcement at each hop | Medium |
| LLM05: Supply Chain Vulnerabilities | Binary authorisation + SVID workload attestation prevent supply chain compromise | Critical |
| LLM06: Sensitive Information Disclosure | mTLS prevents data interception in transit; JIT credentials limit access scope | High |
| LLM07: Insecure Plugin Design | Tool adapter has its own SVID and OPA policy; not implicitly trusted | High |
| LLM08: Excessive Agency | Per-component authorisation limits what each component can call | High |
| LLM09: Overreliance | Not applicable | None |
| LLM10: Model Theft | Workload identity + JIT credentials prevent unauthorised access to model weights | High |

9. Governance Considerations

Governance Artefacts

| Artefact | Owner | Frequency | Purpose |
|---|---|---|---|
| Zero Trust Policy Definitions (OPA) | Security Architecture | Reviewed quarterly; updated with pipeline changes | Documents all authorised inter-component relationships |
| SVID Issuance Audit Log | Security Operations | Continuous; weekly review | Tracks all workload identity events; detects anomalous attestations |
| Pipeline Integrity Violation Log | Security Operations | Continuous; daily review | Records all signature verification failures; triggers investigation |
| JIT Credential Audit | Compliance | Monthly | Evidence of least-privilege access for APRA/regulatory review |
| Binary Authorisation Violation Log | Security Operations | Continuous | Unauthorised container deployments |

10. Operational Considerations

SLOs

| SLO | Target | Measurement |
|---|---|---|
| SVID rotation latency | <5s | SPIRE rotation metric |
| mTLS overhead per hop (p99) | <2ms (with session resumption) | Inter-service span latency |
| OPA policy evaluation latency (p99) | <3ms | OPA decision latency |
| SPIRE availability | 99.99% (critical path) | SPIRE health checks |
| Pipeline trace completeness | >99.9% (all spans captured) | OTel collector metrics |

Incident Management

  • SVID rotation failure → P1: affected component isolated; SPIRE investigation.
  • Pipeline integrity violation (signature failure) → P1: security incident; possible MITM; full pipeline forensics.
  • Binary authorisation failure → P2: deployment blocked; investigate image provenance.

DR

| Scenario | RTO | Recovery |
|---|---|---|
| SPIRE server failure | 1min (in-flight SVIDs valid for remaining TTL) | SPIRE HA cluster; failover to secondary |
| OPA server failure | 0 (fail-closed: deny all) | OPA HA; policy cache in sidecar (last-known-good) |
| Full service mesh failure | 30min | Runbook for graceful mesh recovery; failover path without mTLS with emergency alert |
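The OPA failure row combines two behaviours: fall back to the last-known-good policy when the policy engine is unreachable, and deny everything when no policy has ever been loaded. A minimal Python sketch of that combination is shown below; the `FailClosedAuthoriser` class and the representation of a policy as a set of caller-callee pairs are simplifying assumptions.

```python
from typing import Callable, Optional, Set, Tuple

Policy = Set[Tuple[str, str]]  # allowed (caller, callee) pairs


class FailClosedAuthoriser:
    """Hypothetical sidecar-side wrapper around a remote policy source."""

    def __init__(self, fetch_policy: Callable[[], Policy]):
        self._fetch_policy = fetch_policy
        self._last_known_good: Optional[Policy] = None

    def allow(self, caller: str, callee: str) -> bool:
        try:
            policy = self._fetch_policy()
            self._last_known_good = policy   # refresh the cache on every success
        except Exception:
            policy = self._last_known_good   # policy source down: use cached copy
        if policy is None:
            return False                     # nothing cached: fail closed, deny all
        return (caller, callee) in policy
```

A real sidecar would also bound how stale the cached policy may become before reverting to deny-all; that limit is a deployment decision not covered in this sketch.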

11. Cost Considerations

Cost Drivers

| Cost Driver | Description | Relative Impact |
|---|---|---|
| SPIRE infrastructure | SPIRE server + agents; relatively modest compute | Low |
| Service mesh overhead | Sidecar memory per pod (~50MB); mTLS CPU (<1% on modern CPUs) | Low–Medium |
| OPA evaluation | Per-request policy evaluation overhead | Low (sub-millisecond) |
| Engineering / operations | Initial implementation and ongoing policy management | High (one-time) |
| Distributed tracing storage | Full pipeline traces stored per request; grows with traffic | Medium |

Indicative Cost Range

| Scale | Monthly Additional Cost (USD) | Notes |
|---|---|---|
| Small pipeline | $300–$700 | SPIRE, OPA, service mesh infrastructure |
| Medium pipeline | $1,000–$3,000 | Larger mesh footprint; distributed tracing storage |
| Large pipeline | $3,000–$10,000 | Multi-region SPIRE; dedicated OPA cluster; high-volume tracing |

12. Trade-Off Analysis

Option Comparison

| Option | Description | Pros | Cons | Best For |
|---|---|---|---|---|
| A: Network perimeter only | Trust internal traffic; firewall at boundary | Simple; low operational overhead | Flat network allows lateral movement; no audit of inter-service calls | Non-regulated, low-sensitivity AI applications |
| B: Service-level API keys | Per-service shared API keys | Simple to implement; light overhead | Keys are long-lived; no workload attestation; hard to rotate at scale | Transitional state toward full zero-trust |
| C: Full SPIFFE/SPIRE zero-trust (this pattern) | Cryptographic workload identity + mTLS + OPA + JIT | Strongest security posture; auditable; no credential sprawl | Significant initial implementation effort; SPIRE ops burden | Production AI pipelines in regulated industries |
| D: Cloud-native service mesh | AWS App Mesh, Azure Service Fabric, GCP Traffic Director | Cloud-managed; lower ops burden | Vendor lock-in; less control over SVID format; may not support on-premises | Cloud-committed organisations |

Architectural Tensions

| Tension | Trade-Off |
|---|---|
| Security vs Developer Productivity | Zero-trust adds complexity to local development. Resolution: develop with simplified mTLS (self-signed certs); enforce full SPIFFE/SPIRE only in staging and production. |
| SVID TTL vs Revocation Latency | Shorter SVID TTL (1hr) means faster revocation but more rotation overhead. Resolution: 1-hour TTL is well-established; SPIRE handles rotation transparently. |
| OPA Centralisation vs Latency | Centralised OPA adds a network hop. Resolution: deploy OPA as a local sidecar (or in-process bundle evaluation) for sub-millisecond decisions. |

13. Failure Modes

| Failure | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| SPIRE server unavailable (SVIDs cannot be renewed) | Low | Critical (all components fail when SVIDs expire) | SPIRE health check → P1 alert | SPIRE HA cluster; multi-region deployment |
| OPA policy misconfiguration (denies legitimate calls) | Medium | High (pipeline components blocked) | Pipeline error rate spike | Rollback OPA policy to previous version |
| mTLS certificate CA compromise | Very Low | Critical (all pipeline trust invalidated) | CA certificate monitoring; anomaly detection | Emergency CA rotation; pipeline restart with new SVIDs |
| Pipeline signature verification false positive | Low | Medium (legitimate calls rejected) | Signature failure rate metric | Investigate signer; update trust bundle |
| JIT credential generation bottleneck | Medium | High (tool calls blocked) | Vault latency metric | Vault scale-out; credential caching |

14. Regulatory Considerations

| Regulation | Requirement | Implementation |
|---|---|---|
| APRA CPS234 §21 | Controls must address data-in-transit protection | mTLS on every inter-component call directly addresses this |
| NIST SP 800-207 (Zero Trust Architecture) | Zero trust principles: verify explicitly, use least-privilege access, assume breach | All three principles implemented: SPIFFE (verify explicitly), JIT/OPA (least privilege), anomaly detection (assume breach) |
| ISO 27001 A.9 (Access Control) | Access control to services and systems | Per-component OPA authorisation implements service access control |
| EU AI Act Art. 9 (Risk Management) | Technical risk management for high-risk AI | Zero-trust pipeline is a documented technical risk management measure |
| SOC 2 CC6.3 | Network security with least-privilege access | mTLS + JIT credentials implement SOC 2 CC6.3 |

15. Reference Implementations

AWS

| Component | AWS Service |
|---|---|
| Workload identity | AWS IAM Roles for Service Accounts (IRSA) + SPIRE on EKS |
| mTLS | AWS App Mesh + Envoy sidecar, or Istio on EKS |
| Policy engine | OPA on Lambda / ECS |
| JIT credentials | AWS IAM temporary credentials via STS |
| Binary authorisation | AWS Signer + ECR image signing |
| Distributed tracing | AWS X-Ray |

Azure

| Component | Azure Service |
|---|---|
| Workload identity | Azure Workload Identity + SPIRE |
| mTLS | Istio on AKS or Linkerd |
| Policy engine | OPA on AKS |
| JIT credentials | Azure Key Vault + Managed Identity |
| Binary authorisation | Azure Container Registry + Notary |
| Distributed tracing | Azure Monitor + Application Insights |

On-Premises

| Component | Technology |
|---|---|
| Workload identity | SPIFFE/SPIRE (open source) |
| mTLS | Istio or Linkerd service mesh |
| Policy engine | OPA (sidecar bundle mode) |
| JIT credentials | HashiCorp Vault |
| Binary authorisation | Sigstore Cosign + in-cluster policy controller |
| Distributed tracing | OpenTelemetry + Jaeger |

16. Related Patterns

| Pattern | ID | Relationship |
|---|---|---|
| AI Gateway | EAAPL-SEC001 | Gateway is the entry point to the zero-trust pipeline |
| Model Isolation | EAAPL-SEC003 | Compute-layer isolation complements network-layer zero-trust |
| Secure Tool Invocation | EAAPL-SEC004 | Tool adapter has its own SVID and per-request authorisation in the zero-trust model |
| Secrets Management for AI | EAAPL-SEC008 | Vault underpins the JIT credential pillar of the zero-trust pipeline |
| Distributed AI Tracing | EAAPL-OBS007 | Distributed tracing provides the audit layer for zero-trust pipeline verification |

17. Maturity Assessment

Overall Maturity: Proven

| Dimension | Score (1–5) | Rationale |
|---|---|---|
| Pattern definition clarity | 5 | NIST 800-207 provides clear foundation; AI-specific extensions well-defined |
| Technology availability | 4 | SPIRE, Istio, OPA are production-ready; full integration requires engineering investment |
| Industry adoption | 3 | Adopted in mature security organisations; AI-specific zero-trust still emerging |
| Implementation complexity | 2 | Significant operational complexity; requires dedicated platform engineering |
| Regulatory alignment | 5 | Directly grounded in NIST SP 800-207; strong APRA and EU AI Act alignment |
| Community knowledge | 4 | SPIFFE/SPIRE community strong; AI-specific guidance is newer |

18. Revision History

| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2024-05-01 | Security Architecture Team | Initial pattern definition |
| 1.1 | 2025-03-10 | Security Architecture Team | Added pipeline output signing; updated OWASP mapping; expanded binary authorisation guidance |