🔐 AI Security · APRA CPS234 · EU AI Act · 🏭 Field-tested in AU

[EAAPL-SEC007] Zero-Trust AI Pipeline

Category: Security / Architecture
Sub-category: Pipeline Security Model
Version: 1.1
Maturity: Proven
Tags: zero-trust, mtls, identity-verification, micro-segmentation, jit-access, continuous-verification, pipeline-security
Regulatory Relevance: APRA CPS234, NIST SP 800-207 (Zero Trust Architecture), EU AI Act Art. 9, ISO 27001 A.9, ISO 42001 §6.1

1. Executive Summary

The Zero-Trust AI Pipeline applies the "never trust, always verify" security architecture principle to the complete AI request pipeline — from the initial client request through gateway, prompt processing, model inference, tool calls, and response delivery. Traditional perimeter security assumes that traffic inside the enterprise network is trustworthy. Zero-trust rejects this assumption: every component, every service call, and every data exchange must be authenticated, authorised, and verified regardless of network location.

This matters specifically for AI pipelines because they involve multiple service-to-service calls (gateway → prompt filter → model → tool → output filter), often span cloud and on-premises boundaries, and process data that, if intercepted at any hop, can represent a significant security breach. An AI pipeline that trusts its internal service calls is only as secure as its least-secured component — and AI pipelines have more components than most.

The pattern establishes five pillars for a zero-trust AI pipeline: identity verification on every service call (mutual TLS + JWT), micro-segmentation of pipeline stages, just-in-time access for AI workloads (no standing permissions), continuous verification of pipeline integrity, and comprehensive audit of every inter-service exchange. Organisations implementing this pattern reduce the paths available for lateral movement and gain a clear audit trail that supports both security operations and regulatory requirements.

2. Problem Statement

Business Problem

Enterprise AI pipelines are complex distributed systems with many inter-service dependencies. Organisations that have invested in perimeter security (firewalls, VPN) often assume that traffic between internal services is safe — a "flat network" model. For AI pipelines, this creates an unacceptable risk: if any single component in the pipeline is compromised (through a vulnerability, supply chain attack, or insider threat), an attacker has implicit trust across the pipeline and can intercept or inject data at any point.

Technical Problem

AI pipelines typically have multiple service-to-service calls that are authenticated weakly or not at all:

Gateway to prompt firewall: internal HTTP call with no authentication.
Model inference to tool endpoints: service account with broad permissions.
RAG retrieval service to vector database: network-only access control.
Output filter to model: no authentication (same cluster).
Pipeline services sharing a single service account credential.

None of these connections verify that the caller is who it claims to be, or that the caller's permissions are appropriate for the specific request being made.

Symptoms

Internal service calls using API keys stored in environment variables.
Microservices sharing a single omnibus service account.
No mTLS between AI pipeline components.
Service-to-service calls not logged separately from client-to-gateway calls.
Broad IAM permissions on model serving infrastructure.
No mechanism to detect if a pipeline component has been tampered with.

Cost of Inaction

Dimension	Impact
Security	Pipeline component compromise enables lateral movement to all connected services
Data	Data exfiltration at any pipeline hop without detection
Regulatory	Cannot demonstrate end-to-end pipeline security to APRA / EU AI Act auditors
Integrity	Pipeline tampering (modifying prompts or responses in transit) undetectable
Accountability	Cannot attribute a security incident to a specific pipeline component

3. Context

When to Apply

Multi-component AI pipelines with more than one service-to-service call.
AI pipelines that cross network boundaries (on-premises to cloud, VPC to VPC).
Organisations in regulated industries where audit of data flow is required.
AI pipelines processing data at CONFIDENTIAL classification or above.
Organisations with mature identity infrastructure (existing PKI or SPIFFE implementation).

When NOT to Apply

Single-process AI applications (model and application in the same process/container) — zero-trust is a network architecture pattern.
Early-stage development environments where operational overhead outweighs security benefits.
Extremely latency-sensitive inference paths where mTLS handshake overhead is unacceptable (note: with session resumption, mTLS adds <1ms per request after the initial handshake).

Prerequisites

Prerequisite	Detail
PKI / SPIFFE	Certificate authority or SPIFFE/SPIRE deployment for workload identity issuance
Service Mesh or mTLS Sidecar	Istio, Linkerd, Consul Connect, or manual mTLS implementation
Short-lived Credential Infrastructure	Vault or cloud-native IAM for JIT access credential issuance
Policy Engine	OPA or equivalent for authorisation policy evaluation
Distributed Tracing	OpenTelemetry for end-to-end pipeline audit

Industry Applicability

Industry	Applicability	Key Driver
Financial Services	Critical	APRA CPS234; data-in-transit integrity requirements
Healthcare	Critical	PHI pipeline integrity; cross-boundary data flows
Government / Defence	Critical	Classified data handling; adversarial threat model
Technology / Cloud	High	Multi-tenant AI platforms; cross-boundary SaaS pipelines
Retail	Medium	Customer data pipeline security

4. Architecture Overview

The zero-trust AI pipeline replaces all implicit trust relationships between pipeline components with explicit, verified, and logged trust relationships. Every component operates as if it is on a hostile network — because in a zero-trust model, it is.

Pillar 1: Workload Identity (SPIFFE/SPIRE)

Every pipeline component (gateway, prompt firewall, model server, RAG retriever, tool adapter, output filter) is assigned a cryptographic workload identity via SPIFFE (Secure Production Identity Framework for Everyone). Each workload receives a short-lived X.509 certificate (SVID) issued by SPIRE, valid for 1 hour and automatically rotated. The SVID encodes the workload's identity (e.g., spiffe://enterprise.ai/gateway, spiffe://enterprise.ai/model-server). A component cannot forge another component's identity, because SPIRE issues an SVID only after the SPIRE agent on the workload's own compute instance has attested that workload.
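As a rough illustration of how a component might check the identity carried in an SVID, the sketch below parses a SPIFFE ID into its trust domain and workload path and compares it with an allow-list. The trust domain and the two paths come from the examples above; the helper names (`parse_spiffe_id`, `is_pipeline_workload`) and the allow-list itself are hypothetical, and a real deployment would normally rely on a SPIFFE library or the service mesh rather than hand-rolled parsing.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a SPIFFE ID into trust domain and workload path; reject malformed input."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {uri!r}")
    # A SPIFFE ID has a trust domain and a path, but no query, fragment,
    # user info, or port.
    if not parts.netloc or parts.query or parts.fragment:
        raise ValueError(f"malformed SPIFFE ID: {uri!r}")
    if "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"malformed trust domain: {uri!r}")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


# Assumed values for illustration, taken from the examples in this section.
EXPECTED_TRUST_DOMAIN = "enterprise.ai"
KNOWN_WORKLOADS = {"/gateway", "/model-server"}


def is_pipeline_workload(uri: str) -> bool:
    """True only for well-formed IDs in the expected trust domain and allow-list."""
    try:
        sid = parse_spiffe_id(uri)
    except ValueError:
        return False
    return sid.trust_domain == EXPECTED_TRUST_DOMAIN and sid.path in KNOWN_WORKLOADS
```

Checking the trust domain separately from the path keeps an identity issued by a different SPIRE deployment from being accepted just because its path happens to match.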

Pillar 2: Mutual TLS on Every Service Call

Every service-to-service call in the pipeline uses mutual TLS (mTLS): both the caller and the callee present their SPIFFE SVIDs. This provides cryptographic authentication at both ends of every connection. A compromised component cannot impersonate another component without that component's private key, and a man-in-the-middle cannot read or alter traffic unless it holds a key for an identity the peers trust. Service mesh sidecars (Envoy in Istio, or the Linkerd proxy) handle mTLS transparently: application code makes plain HTTP calls; the sidecar upgrades to mTLS and validates peer identity.
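For teams taking the manual mTLS route instead of a sidecar, the following sketch shows roughly what the server side has to configure with Python's standard `ssl` module: require a client certificate, pin TLS 1.3, and read the peer's SPIFFE ID from the URI subject alternative name. The file-path parameters and function names are hypothetical, and the certificate and trust-bundle files are assumed to be delivered by the SPIRE agent.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    trust_bundle_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Server-side TLS context that refuses clients without a trusted certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    # The "mutual" part: the handshake fails unless the client presents a
    # certificate that chains to the loaded trust bundle.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if trust_bundle_file:
        ctx.load_verify_locations(cafile=trust_bundle_file)
    return ctx


def peer_spiffe_id(peer_cert: dict) -> Optional[str]:
    """Extract the SPIFFE ID from the dict returned by SSLSocket.getpeercert().

    An X.509-SVID carries exactly one URI SAN, so anything else is rejected.
    """
    uris = [
        value
        for kind, value in peer_cert.get("subjectAltName", ())
        if kind == "URI"
    ]
    if len(uris) == 1 and uris[0].startswith("spiffe://"):
        return uris[0]
    return None
```

After the handshake, the value returned by `peer_spiffe_id` is what an authorisation check would use as the caller identity.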

Pillar 3: Per-Request Authorisation

Authentication (you are who you claim to be) is necessary but not sufficient. Authorisation (you are permitted to make this specific request) must also be verified on every request. OPA policies evaluate: is the calling component permitted to call this component? Is the data classification of the request within the permitted range for this component pair? Is the request rate within configured limits?

Critically, authorisation is scoped to the request, not the connection. A gateway instance that is permitted to call the prompt firewall is not automatically permitted to call the model server directly — each hop's authorisation is evaluated independently.
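The three questions above can be sketched as a small decision function. In this pattern the policy would normally be written in Rego and evaluated by OPA; the Python below only mirrors the logic so the checks are easy to follow. The classification labels other than CONFIDENTIAL, the rate limits, the `prompt-firewall` path, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical classification ladder; only CONFIDENTIAL appears in this pattern.
CLASSIFICATION_RANK = {"PUBLIC": 0, "INTERNAL": 1, "CONFIDENTIAL": 2}

# (caller, callee) -> (highest permitted classification, max requests per minute)
ALLOWED_CALLS = {
    ("spiffe://enterprise.ai/gateway", "spiffe://enterprise.ai/prompt-firewall"): ("CONFIDENTIAL", 600),
    ("spiffe://enterprise.ai/prompt-firewall", "spiffe://enterprise.ai/model-server"): ("CONFIDENTIAL", 600),
}


@dataclass(frozen=True)
class CallRequest:
    caller: str
    callee: str
    classification: str
    calls_in_last_minute: int


def authorise(req: CallRequest) -> Tuple[bool, str]:
    """Evaluate one hop. Anything not explicitly allowed is denied."""
    rule = ALLOWED_CALLS.get((req.caller, req.callee))
    if rule is None:
        return False, "caller is not permitted to call this component"
    max_classification, max_rate = rule
    if req.classification not in CLASSIFICATION_RANK:
        return False, "unknown data classification"
    if CLASSIFICATION_RANK[req.classification] > CLASSIFICATION_RANK[max_classification]:
        return False, "classification exceeds the limit for this component pair"
    if req.calls_in_last_minute >= max_rate:
        return False, "request rate limit exceeded"
    return True, "allowed"
```

Because the table is keyed on the caller and callee pair, the gateway's permission to reach the prompt firewall grants it nothing with respect to the model server.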

Pillar 4: Just-in-Time Access

No pipeline component holds standing permissions to the resources it needs. Instead:

Model weights are fetched from the registry at startup using a time-limited Vault lease.
Tool call credentials are generated at the moment of tool invocation, scoped to the specific tool and operation.
Database read credentials for RAG retrieval are generated per-session, scoped to the retrieval tenant's data.
Cloud IAM role assumptions are time-limited (1-hour maximum).

JIT access limits the impact of credential theft: stolen credentials expire within minutes to an hour, which sharply reduces the attack window.
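The idea of a credential scoped to one tool and one operation, with a short lifetime, can be sketched as follows. This is an in-memory stand-in and not the Vault API: the class, the function names, and the 300-second default TTL are all hypothetical.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JitCredential:
    token: str
    tool: str
    operation: str
    expires_at: float  # Unix timestamp


def issue_credential(
    tool: str,
    operation: str,
    ttl_seconds: int = 300,
    now: Optional[float] = None,
) -> JitCredential:
    """Mint a random token bound to one tool, one operation, and a short lifetime."""
    issued_at = time.time() if now is None else now
    return JitCredential(
        token=secrets.token_urlsafe(32),
        tool=tool,
        operation=operation,
        expires_at=issued_at + ttl_seconds,
    )


def is_usable(
    cred: JitCredential,
    tool: str,
    operation: str,
    now: Optional[float] = None,
) -> bool:
    """A credential works only for its own scope and only before it expires."""
    current = time.time() if now is None else now
    return (
        cred.tool == tool
        and cred.operation == operation
        and current < cred.expires_at
    )
```

A credential issued for a read on one tool is rejected for a write on the same tool, and for anything at all once the TTL has passed, which is the property that narrows the window for a stolen token.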

Pillar 5: Continuous Verification and Pipeline Integrity

Zero-trust is not a static configuration — it requires continuous verification. This includes:

Periodic rotation of SVIDs (every 1 hour) and re-authentication.
Runtime integrity checks: each pipeline component signs its outputs; downstream components verify the signature before processing. If a prompt firewall output arrives at the model server without a valid signature from the firewall, the request is rejected.
Anomaly detection on pipeline traffic patterns: unexpected call volumes, unusual source identities, or calls between non-adjacent components trigger alerts.
Binary authorisation: container images in the pipeline must be signed and verified against an approved image registry before deployment.
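The runtime integrity check in the list above can be sketched with an HMAC over each stage's output. The sketch assumes the signing stage and its downstream verifier share a key (for example one leased from Vault); an SVID-based asymmetric signature would avoid the shared key but needs a cryptography library outside the standard library. The envelope layout and function names are hypothetical.

```python
import hashlib
import hmac
import json


def _canonical(payload: dict) -> bytes:
    """Serialise deterministically so signer and verifier hash identical bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign_output(stage_id: str, payload: dict, key: bytes) -> dict:
    """Wrap a stage's output in an envelope bound to the stage that produced it."""
    message = stage_id.encode() + b"\n" + _canonical(payload)
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"stage": stage_id, "payload": payload, "signature": signature}


def verify_output(envelope: dict, expected_stage: str, key: bytes) -> bool:
    """Reject envelopes from the wrong stage or with a payload changed in transit."""
    if envelope.get("stage") != expected_stage:
        return False
    message = expected_stage.encode() + b"\n" + _canonical(envelope.get("payload", {}))
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    presented = str(envelope.get("signature", ""))
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), presented.encode())
```

Including the stage identity in the signed message means a valid signature from one stage cannot be replayed as if it came from another stage that uses the same key.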

5. Architecture Diagram

flowchart TD
    subgraph Identity["Trust Infrastructure"]
        A[SPIRE Identity Server]
        B[OPA Policy Engine]
        C[Vault JIT Credentials]
    end
    subgraph Pipeline["AI Pipeline"]
        D[AI Gateway]
        E[Prompt Firewall]
        F[Model Server]
        G[Output Filter]
    end
    subgraph ClientZ["Client"]
        H[Application]
    end
    H -->|mTLS + JWT| D --> E --> F --> G --> H
    A -.->|SVID per workload| D
    A -.->|SVID per workload| F
    B -.->|per-request authz| D
    C -.->|JIT credentials| F
    style H fill:#dbeafe,stroke:#3b82f6
    style D fill:#f0fdf4,stroke:#22c55e
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#fef9c3,stroke:#eab308
    style G fill:#f0fdf4,stroke:#22c55e
    style A fill:#fef9c3,stroke:#eab308
    style B fill:#fef9c3,stroke:#eab308
    style C fill:#fef9c3,stroke:#eab308

6. Components

Component	Type	Responsibility	Technology Options	Criticality
SPIRE Server	Identity	Issues SVID certificates to attested workloads; rotates SVIDs every hour	SPIFFE/SPIRE OSS, Istio CA, HashiCorp Vault PKI	Critical
SPIRE Agent	Identity	Runs on each compute node; attests workloads; manages SVID lifecycle	SPIFFE/SPIRE Agent	Critical
Service Mesh Sidecar	mTLS	Transparent mTLS proxy for all inter-service communication	Envoy (Istio), Linkerd proxy, Consul Connect	Critical
OPA Policy Engine	Authorisation	Per-request authorisation policy evaluation for all inter-service calls	Open Policy Agent, Cedar	Critical
Vault (JIT Credentials)	Secrets	Dynamic, time-limited credential issuance for pipeline components	HashiCorp Vault, AWS IAM roles with short TTL, Azure Managed Identity	Critical
Pipeline Output Signer	Integrity	Signs outputs at each pipeline stage for downstream verification	Custom HMAC signer, SPIFFE SVID-based signing	High
Binary Authorisation	Supply Chain	Verifies container image signatures before deployment	Google Binary Authorization, AWS Signer, Sigstore Cosign	High
Anomaly Detector	Monitoring	Detects unusual inter-component call patterns	Datadog APM, Elastic SIEM, custom OTel-based detector	High
Distributed Tracing	Audit	End-to-end trace of every request through all pipeline components	OpenTelemetry Collector, Jaeger, AWS X-Ray	High

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	Application	Sends request to AI Gateway with mTLS client cert + JWT	Authenticated connection at gateway
2	Gateway	Validates client identity; evaluates OPA policy; generates internal request context with trace_id	Authorised request with trace_id
3	Gateway → Prompt Firewall	mTLS call using SPIFFE SVID; OPA validates gateway→firewall call authorisation	Authenticated, authorised call
4	Prompt Firewall	Processes request; signs output (HMAC or SVID-based signature)	Signed firewall result
5	Prompt Firewall → Input Sanitiser	mTLS call; sanitiser verifies firewall output signature	Verified, authenticated handoff
6	Input Sanitiser → Model Server	mTLS call with signed sanitised prompt	Authenticated, signed prompt at model server
7	Model Server	Generates response; for tool calls, requests JIT credential from Vault per tool	Raw model response
8	Model Server → Output Filter	mTLS call with signed model output	Authenticated handoff to output filter
9	Output Filter → Gateway	mTLS response with signed filtered output	Verified response returned to gateway
10	Gateway	Returns response to application; full trace in distributed tracing system	End-to-end traced, verified response
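One way to use the trace produced by this flow is to compare the hops recorded for a request against the hops the table permits. The sketch below assumes each span has already been reduced to a (caller, callee) pair of short component names; those names and both helper functions are hypothetical.

```python
from typing import Iterable, List, Tuple

Hop = Tuple[str, str]

# Caller/callee pairs taken from the Primary Flow table above.
EXPECTED_HOPS = {
    ("application", "gateway"),
    ("gateway", "prompt-firewall"),
    ("prompt-firewall", "input-sanitiser"),
    ("input-sanitiser", "model-server"),
    ("model-server", "output-filter"),
    ("output-filter", "gateway"),
}


def unexpected_hops(spans: Iterable[Hop]) -> List[Hop]:
    """Hops seen in a trace that the flow does not permit, e.g. a skipped stage."""
    return [hop for hop in spans if hop not in EXPECTED_HOPS]


def missing_hops(spans: Iterable[Hop]) -> List[Hop]:
    """Hops the flow requires that never appeared, i.e. an incomplete trace."""
    return sorted(EXPECTED_HOPS - set(spans))
```

A non-empty result from `unexpected_hops` corresponds to a call between non-adjacent components, while `missing_hops` points at a gap in trace capture.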

Error Flow

Error	Behaviour	Alert
SVID expired (component fails to renew)	Downstream rejects mTLS connection; component isolated	Critical: SVID renewal failure
OPA policy denies inter-component call	Request rejected; component attempts logged	Security: unexpected inter-component call
Pipeline output signature verification fails	Downstream rejects processed request	Security: pipeline integrity violation — possible MITM
JIT credential request rejected by Vault	Component cannot proceed with operation; error returned	Critical: JIT credential failure
Binary authorisation fails	Container deployment blocked	Security: unsigned container in pipeline

8. Security Considerations

Authentication & Authorisation

Every component has a unique, cryptographic workload identity (SPIFFE SVID).
Every inter-component call authenticated via mTLS using SPIFFE SVIDs.
Every call authorised by OPA with policies that specify exactly which components may call which other components.

Secrets Management

No standing credentials. All credentials JIT-issued by Vault with minimum TTL.
SVID private keys never leave the compute instance.

Data Classification

OPA policies can enforce classification-based routing: a CONFIDENTIAL request may only traverse components cleared for CONFIDENTIAL processing.

Encryption

All inter-service communication: TLS 1.3 via mTLS.
TLS session keys rotated with SVID rotation (every hour maximum).
At-rest encryption on all pipeline state.

OWASP LLM Top 10 Coverage

OWASP LLM Risk	Zero-Trust Pipeline Mitigation	Coverage
LLM01: Prompt Injection	Pipeline integrity signatures detect prompt tampering between stages	Medium
LLM02: Insecure Output Handling	Component isolation limits blast radius of output handling vulnerability	Medium
LLM03: Training Data Poisoning	Binary authorisation prevents tampered pipeline components from being deployed	High
LLM04: Model Denial of Service	Per-component authorisation enables request quota enforcement at each hop	Medium
LLM05: Supply Chain Vulnerabilities	Binary authorisation + SVID workload attestation prevent supply chain compromise	Critical
LLM06: Sensitive Information Disclosure	mTLS prevents data interception in transit; JIT credentials limit access scope	High
LLM07: Insecure Plugin Design	Tool adapter has its own SVID and OPA policy; not implicitly trusted	High
LLM08: Excessive Agency	Per-component authorisation limits what each component can call	High
LLM09: Overreliance	Not applicable	None
LLM10: Model Theft	Workload identity + JIT credentials prevent unauthorised access to model weights	High

9. Governance Considerations

Governance Artefacts

Artefact	Owner	Frequency	Purpose
Zero Trust Policy Definitions (OPA)	Security Architecture	Reviewed quarterly; updated with pipeline changes	Documents all authorised inter-component relationships
SVID Issuance Audit Log	Security Operations	Continuous; weekly review	Tracks all workload identity events; detects anomalous attestations
Pipeline Integrity Violation Log	Security Operations	Continuous; daily review	Records all signature verification failures; triggers investigation
JIT Credential Audit	Compliance	Monthly	Evidence of least-privilege access for APRA/regulatory review
Binary Authorisation Violation Log	Security Operations	Continuous	Unauthorised container deployments

10. Operational Considerations

SLOs

SLO	Target	Measurement
SVID rotation latency	<5s	SPIRE rotation metric
mTLS overhead per hop (p99)	<2ms (with session resumption)	Inter-service span latency
OPA policy evaluation latency (p99)	<3ms	OPA decision latency
SPIRE availability	99.99% (critical path)	SPIRE health checks
Pipeline trace completeness	>99.9% (all spans captured)	OTel collector metrics

Incident Management

SVID rotation failure → P1: affected component isolated; SPIRE investigation.
Pipeline integrity violation (signature failure) → P1: security incident; possible MITM; full pipeline forensics.
Binary authorisation failure → P2: deployment blocked; investigate image provenance.

DR

Scenario	RTO	Recovery
SPIRE server failure	1 min (in-flight SVIDs valid for remaining TTL)	SPIRE HA cluster; failover to secondary
OPA server failure	0 (fail-closed: deny all)	OPA HA; policy cache in sidecar (last-known-good)
Full service mesh failure	30 min	Runbook for graceful mesh recovery; failover path without mTLS with emergency alert
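The OPA row combines two behaviours: deny everything when no policy is available, and keep serving the last-known-good policy when the policy server cannot be reached. A minimal sketch of that combination follows; the `PolicyCache` class and its `fetch` callable are hypothetical and stand in for OPA's own bundle handling.

```python
from typing import Callable, Optional, Set, Tuple

Policy = Set[Tuple[str, str]]  # permitted (caller, callee) pairs


class PolicyCache:
    """Holds the last successfully fetched policy; denies when it has none."""

    def __init__(self, fetch: Callable[[], Policy]) -> None:
        self._fetch = fetch
        self._last_known_good: Optional[Policy] = None

    def refresh(self) -> bool:
        """Try to pull a fresh policy. On failure, keep the previous copy."""
        try:
            self._last_known_good = set(self._fetch())
            return True
        except Exception:
            return False

    def is_allowed(self, caller: str, callee: str) -> bool:
        if self._last_known_good is None:
            # Fail closed: with no policy ever loaded, nothing is permitted.
            return False
        return (caller, callee) in self._last_known_good
```

A cold start during an outage denies every call, whereas a sidecar that loaded a policy before the outage keeps enforcing that copy until a refresh succeeds.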

11. Cost Considerations

Cost Drivers

Cost Driver	Description	Relative Impact
SPIRE infrastructure	SPIRE server + agents; relatively modest compute	Low
Service mesh overhead	Sidecar memory per pod (~50MB); mTLS CPU (<1% on modern CPUs)	Low–Medium
OPA evaluation	Per-request policy evaluation overhead	Low (sub-millisecond)
Engineering / operations	Initial implementation and ongoing policy management	High (mostly up-front)
Distributed tracing storage	Full pipeline traces stored per request; grows with traffic	Medium

Indicative Cost Range

Scale	Monthly Additional Cost (USD)	Notes
Small pipeline	$300–$700	SPIRE, OPA, service mesh infrastructure
Medium pipeline	$1,000–$3,000	Larger mesh footprint; distributed tracing storage
Large pipeline	$3,000–$10,000	Multi-region SPIRE; dedicated OPA cluster; high-volume tracing

12. Trade-Off Analysis

Option Comparison

Option	Description	Pros	Cons	Best For
A: Network perimeter only	Trust internal traffic; firewall at boundary	Simple; low operational overhead	Flat network allows lateral movement; no audit of inter-service calls	Non-regulated, low-sensitivity AI applications
B: Service-level API keys	Per-service shared API keys	Simple to implement; light overhead	Keys are long-lived; no workload attestation; hard to rotate at scale	Transitional state toward full zero-trust
C: Full SPIFFE/SPIRE zero-trust (this pattern)	Cryptographic workload identity + mTLS + OPA + JIT	Strongest security posture; auditable; no credential sprawl	Significant initial implementation effort; SPIRE ops burden	Production AI pipelines in regulated industries
D: Cloud-native service mesh	AWS App Mesh, Istio-based service mesh add-on for AKS, GCP Traffic Director	Cloud-managed; lower ops burden	Vendor lock-in; less control over SVID format; may not support on-premises	Cloud-committed organisations

Architectural Tensions

Tension	Trade-Off
Security vs Developer Productivity	Zero-trust adds complexity to local development. Resolution: develop with simplified mTLS (self-signed certs); enforce full SPIFFE/SPIRE only in staging and production.
SVID TTL vs Revocation Latency	Shorter SVID TTL (1hr) means faster revocation but more rotation overhead. Resolution: 1-hour TTL is well-established; SPIRE handles rotation transparently.
OPA Centralisation vs Latency	Centralised OPA adds a network hop. Resolution: deploy OPA as a local sidecar (or in-process bundle evaluation) for sub-millisecond decisions.

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
SPIRE server unavailable (SVIDs cannot be renewed)	Low	Critical (all components fail when SVIDs expire)	SPIRE health check → P1 alert	SPIRE HA cluster; multi-region deployment
OPA policy misconfiguration (denies legitimate calls)	Medium	High (pipeline components blocked)	Pipeline error rate spike	Rollback OPA policy to previous version
mTLS certificate CA compromise	Very Low	Critical (all pipeline trust invalidated)	CA certificate monitoring; anomaly detection	Emergency CA rotation; pipeline restart with new SVIDs
Pipeline signature verification false positive	Low	Medium (legitimate calls rejected)	Signature failure rate metric	Investigate signer; update trust bundle
JIT credential generation bottleneck	Medium	High (tool calls blocked)	Vault latency metric	Vault scale-out; credential caching

14. Regulatory Considerations

Regulation	Requirement	Implementation
APRA CPS234 §21	Controls must address data-in-transit protection	mTLS on every inter-component call directly addresses this
NIST SP 800-207 (Zero Trust Architecture)	Zero trust principles: verify explicitly, use least-privilege access, assume breach	All three principles implemented: SPIFFE (verify explicitly), JIT/OPA (least privilege), anomaly detection (assume breach)
ISO 27001 A.9 (Access Control)	Access control to services and systems	Per-component OPA authorisation implements service access control
EU AI Act Art. 9 (Risk Management)	Technical risk management for high-risk AI	Zero-trust pipeline is a documented technical risk management measure
SOC 2 CC6.3	Network security with least-privilege access	mTLS + JIT credentials implement SOC 2 CC6.3

15. Reference Implementations

AWS

Component	AWS Service
Workload identity	AWS IAM Roles for Service Accounts (IRSA) + SPIRE on EKS
mTLS	AWS App Mesh + Envoy sidecar, or Istio on EKS
Policy engine	OPA on Lambda / ECS
JIT credentials	AWS IAM temporary credentials via STS
Binary authorisation	AWS Signer + ECR image signing
Distributed tracing	AWS X-Ray

Azure

Component	Azure Service
Workload identity	Azure Workload Identity + SPIRE
mTLS	Istio on AKS or Linkerd
Policy engine	OPA on AKS
JIT credentials	Azure Key Vault + Managed Identity
Binary authorisation	Azure Container Registry + Notary
Distributed tracing	Azure Monitor + Application Insights

On-Premises

Component	Technology
Workload identity	SPIFFE/SPIRE (open source)
mTLS	Istio or Linkerd service mesh
Policy engine	OPA (sidecar bundle mode)
JIT credentials	HashiCorp Vault
Binary authorisation	Sigstore Cosign + in-cluster policy controller
Distributed tracing	OpenTelemetry + Jaeger

16. Related Patterns

Pattern	ID	Relationship
AI Gateway	EAAPL-SEC001	Gateway is the entry point to the zero-trust pipeline
Model Isolation	EAAPL-SEC003	Compute-layer isolation complements network-layer zero-trust
Secure Tool Invocation	EAAPL-SEC004	Tool adapter has its own SVID and per-request authorisation in the zero-trust model
Secrets Management for AI	EAAPL-SEC008	Vault underpins the JIT credential pillar of the zero-trust pipeline
Distributed AI Tracing	EAAPL-OBS007	Distributed tracing provides the audit layer for zero-trust pipeline verification

17. Maturity Assessment

Overall Maturity: Proven

Dimension	Score (1–5)	Rationale
Pattern definition clarity	5	NIST 800-207 provides clear foundation; AI-specific extensions well-defined
Technology availability	4	SPIRE, Istio, OPA are production-ready; full integration requires engineering investment
Industry adoption	3	Adopted in mature security organisations; AI-specific zero-trust still emerging
Implementation complexity	2	Significant operational complexity; requires dedicated platform engineering
Regulatory alignment	5	Directly grounded in NIST 800-207; strong APRA and EU AI Act alignment
Community knowledge	4	SPIFFE/SPIRE community strong; AI-specific guidance is newer

18. Revision History

Version	Date	Author	Changes
1.0	2024-05-01	Security Architecture Team	Initial pattern definition
1.1	2025-03-10	Security Architecture Team	Added pipeline output signing; updated OWASP mapping; expanded binary authorisation guidance
