[EAAPL-SEC007] Zero-Trust AI Pipeline
Category: Security / Architecture
Sub-category: Pipeline Security Model
Version: 1.1
Maturity: Proven
Tags: zero-trust mtls identity-verification micro-segmentation jit-access continuous-verification pipeline-security
Regulatory Relevance: APRA CPS234, NIST SP 800-207 (Zero Trust Architecture), EU AI Act Art. 9, ISO 27001 A.9, ISO 42001 §6.1
1. Executive Summary
The Zero-Trust AI Pipeline applies the "never trust, always verify" security architecture principle to the complete AI request pipeline — from the initial client request through gateway, prompt processing, model inference, tool calls, and response delivery. Traditional perimeter security assumes that traffic inside the enterprise network is trustworthy. Zero-trust rejects this assumption: every component, every service call, and every data exchange must be authenticated, authorised, and verified regardless of network location.
This matters specifically for AI pipelines because they involve multiple service-to-service calls (gateway → prompt filter → model → tool → output filter), often span cloud and on-premises boundaries, and process data that, if intercepted at any hop, can represent a significant security breach. An AI pipeline that trusts its internal service calls is only as secure as its least-secured component — and AI pipelines have more components than most.
The pattern establishes five pillars for a zero-trust AI pipeline: identity verification on every service call (mutual TLS + JWT), micro-segmentation of pipeline stages, just-in-time access for AI workloads (no standing permissions), continuous verification of pipeline integrity, and comprehensive audit of every inter-service exchange. Organisations implementing this pattern reduce lateral movement risk, because a compromised component can reach only the services it is explicitly authorised to call, and gain an audit trail that serves both security operations and regulatory review.
2. Problem Statement
Business Problem
Enterprise AI pipelines are complex distributed systems with many inter-service dependencies. Organisations that have invested in perimeter security (firewalls, VPN) often assume that traffic between internal services is safe — a "flat network" model. For AI pipelines, this creates an unacceptable risk: if any single component in the pipeline is compromised (through a vulnerability, supply chain attack, or insider threat), an attacker has implicit trust across the pipeline and can intercept or inject data at any point.
Technical Problem
AI pipelines typically have multiple service-to-service calls that are authenticated weakly or not at all:
- Gateway to prompt firewall: internal HTTP call with no authentication.
- Model inference to tool endpoints: service account with broad permissions.
- RAG retrieval service to vector database: network-only access control.
- Output filter to model: no authentication (same cluster).
- Pipeline services sharing a single service account credential.
None of these connections verify that the caller is who it claims to be, or that the caller's permissions are appropriate for the specific request being made.
Symptoms
- Internal service calls using API keys stored in environment variables.
- Microservices sharing a single omnibus service account.
- No mTLS between AI pipeline components.
- Service-to-service calls not logged separately from client-to-gateway calls.
- Broad IAM permissions on model serving infrastructure.
- No mechanism to detect if a pipeline component has been tampered with.
Cost of Inaction
| Dimension | Impact |
| --- | --- |
| Security | Pipeline component compromise enables lateral movement to all connected services |
| Data | Data exfiltration at any pipeline hop without detection |
| Regulatory | Cannot demonstrate end-to-end pipeline security to APRA / EU AI Act auditors |
| Integrity | Pipeline tampering (modifying prompts or responses in transit) undetectable |
| Accountability | Cannot attribute a security incident to a specific pipeline component |
3. Context
When to Apply
- Multi-component AI pipelines with more than one service-to-service call.
- AI pipelines that cross network boundaries (on-premises to cloud, VPC to VPC).
- Organisations in regulated industries where audit of data flow is required.
- AI pipelines processing data at CONFIDENTIAL classification or above.
- Organisations with mature identity infrastructure (existing PKI or SPIFFE implementation).
When NOT to Apply
- Single-process AI applications (model and application in the same process/container) — zero-trust is a network architecture pattern.
- Early-stage development environments where operational overhead outweighs security benefits.
- Extremely latency-sensitive inference paths where mTLS handshake overhead is unacceptable (note: with session resumption, mTLS adds <1ms per request after the initial handshake).
Prerequisites
| Prerequisite | Detail |
| --- | --- |
| PKI / SPIFFE | Certificate authority or SPIFFE/SPIRE deployment for workload identity issuance |
| Service Mesh or mTLS Sidecar | Istio, Linkerd, Consul Connect, or manual mTLS implementation |
| Short-lived Credential Infrastructure | Vault or cloud-native IAM for JIT access credential issuance |
| Policy Engine | OPA or equivalent for authorisation policy evaluation |
| Distributed Tracing | OpenTelemetry for end-to-end pipeline audit |
Industry Applicability
| Industry | Applicability | Key Driver |
| --- | --- | --- |
| Financial Services | Critical | APRA CPS234; data-in-transit integrity requirements |
| Healthcare | Critical | PHI pipeline integrity; cross-boundary data flows |
| Government / Defence | Critical | Classified data handling; adversarial threat model |
| Technology / Cloud | High | Multi-tenant AI platforms; cross-boundary SaaS pipelines |
| Retail | Medium | Customer data pipeline security |
4. Architecture Overview
The zero-trust AI pipeline replaces all implicit trust relationships between pipeline components with explicit, verified, and logged trust relationships. Every component operates as if it is on a hostile network — because in a zero-trust model, it is.
Pillar 1: Workload Identity (SPIFFE/SPIRE)
Every pipeline component (gateway, prompt firewall, model server, RAG retriever, tool adapter, output filter) is assigned a cryptographic workload identity via SPIFFE (Secure Production Identity Framework for Everyone). Each workload receives a short-lived X.509 certificate (SVID) issued by SPIRE, valid for 1 hour and automatically rotated. The SVID encodes the workload's identity (e.g., spiffe://enterprise.ai/gateway, spiffe://enterprise.ai/model-server). A component cannot simply claim another component's identity: SVIDs are issued only to workloads attested by the SPIRE agent running on the same compute instance.
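As an illustration of how a component might check the identity encoded in a peer's SVID, the following minimal sketch parses a SPIFFE ID into its trust domain and workload path. The helper names (`SpiffeId`, `parse_spiffe_id`, `is_pipeline_workload`) are hypothetical, and the `enterprise.ai` trust domain is taken from the examples above. A production deployment would normally rely on a SPIFFE library or the service mesh rather than hand-rolled parsing.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str  # e.g. "enterprise.ai"
    path: str          # e.g. "/gateway"


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a SPIFFE ID into trust domain and workload path; reject malformed input."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {uri!r}")
    # A SPIFFE ID carries no query, fragment, user info, or port.
    if not parts.netloc or parts.query or parts.fragment:
        raise ValueError(f"malformed SPIFFE ID: {uri!r}")
    if "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"malformed trust domain in: {uri!r}")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


def is_pipeline_workload(uri: str, trust_domain: str = "enterprise.ai") -> bool:
    """Return True only for well-formed IDs inside the expected trust domain."""
    try:
        return parse_spiffe_id(uri).trust_domain == trust_domain
    except ValueError:
        return False
```

For example, `is_pipeline_workload("spiffe://enterprise.ai/model-server")` is accepted, while an ID from another trust domain or a non-SPIFFE URI is rejected.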
Pillar 2: Mutual TLS on Every Service Call
Every service-to-service call in the pipeline uses mutual TLS (mTLS): both the caller and the callee present their SPIFFE SVIDs. This provides cryptographic authentication at both ends of every connection. A compromised component cannot impersonate another component without that component's private key, and a man-in-the-middle attacker who lacks a valid SVID cannot complete the handshake or read the traffic. Service mesh sidecars (Envoy in Istio, or the Linkerd proxy) handle mTLS transparently: application code makes plain HTTP calls; the sidecar upgrades to mTLS and validates peer identity.
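Where a sidecar is not available and mTLS is implemented manually, the server side of a hop could be configured roughly as follows. This is a sketch using Python's standard `ssl` module, not a description of what Istio or Linkerd do internally. The function names and the optional file-path parameters are assumptions for illustration; in a SPIFFE deployment the certificate, key, and trust bundle would come from the SPIRE agent.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    ctx.verify_mode = ssl.CERT_REQUIRED  # the "mutual" part of mTLS
    if cert_file and key_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


def peer_spiffe_id(peer_cert: dict) -> Optional[str]:
    """Extract the SPIFFE ID from the dict returned by SSLSocket.getpeercert()."""
    for kind, value in peer_cert.get("subjectAltName", ()):
        if kind == "URI" and value.startswith("spiffe://"):
            return value
    return None
```

After the handshake, the callee would pass `peer_spiffe_id(conn.getpeercert())` to its authorisation check rather than trusting any identity claimed in the request body.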
Pillar 3: Per-Request Authorisation
Authentication (you are who you claim to be) is necessary but not sufficient. Authorisation (you are permitted to make this specific request) must also be verified on every request. OPA policies evaluate: is the calling component permitted to call this component? Is the data classification of the request within the permitted range for this component pair? Is the request rate within configured limits?
Critically, authorisation is scoped to the request, not the connection. A gateway instance that is permitted to call the prompt firewall is not automatically permitted to call the model server directly — each hop's authorisation is evaluated independently.
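The hop-by-hop rule can be sketched as an explicit allow-list of caller and callee pairs, each with a classification ceiling. In the pattern itself this logic would live in OPA policy; the Python below is only a stand-in to show the decision shape. The workload paths, the `PUBLIC` and `INTERNAL` labels, and the per-hop ceilings are illustrative assumptions, and rate limiting is omitted.

```python
from dataclasses import dataclass

# Assumed classification ordering; only CONFIDENTIAL is named in this pattern.
CLASSIFICATION_RANK = {"PUBLIC": 0, "INTERNAL": 1, "CONFIDENTIAL": 2}

# Permitted (caller, callee) edges and the highest classification each may carry.
ALLOWED_HOPS = {
    ("spiffe://enterprise.ai/gateway", "spiffe://enterprise.ai/prompt-firewall"): "CONFIDENTIAL",
    ("spiffe://enterprise.ai/prompt-firewall", "spiffe://enterprise.ai/model-server"): "CONFIDENTIAL",
    ("spiffe://enterprise.ai/model-server", "spiffe://enterprise.ai/output-filter"): "CONFIDENTIAL",
    ("spiffe://enterprise.ai/model-server", "spiffe://enterprise.ai/tool-adapter"): "INTERNAL",
}


@dataclass(frozen=True)
class HopRequest:
    caller: str          # SPIFFE ID taken from the verified peer certificate
    callee: str          # SPIFFE ID of the component receiving the call
    classification: str  # data classification attached to this request


def authorise(request: HopRequest) -> bool:
    """Default-deny: allow only listed hops carrying data within the hop's ceiling."""
    ceiling = ALLOWED_HOPS.get((request.caller, request.callee))
    if ceiling is None:
        return False  # no such edge, e.g. gateway calling the model server directly
    rank = CLASSIFICATION_RANK.get(request.classification)
    if rank is None:
        return False  # unknown classification is treated as a denial
    return rank <= CLASSIFICATION_RANK[ceiling]
```

Because the check is keyed on the pair rather than on the caller alone, a gateway that may call the prompt firewall gains nothing towards calling the model server.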
Pillar 4: Just-in-Time Access
No pipeline component holds standing permissions to the resources it needs. Instead:
- Model weights are fetched from the registry at startup using a time-limited Vault lease.
- Tool call credentials are generated at the moment of tool invocation, scoped to the specific tool and operation.
- Database read credentials for RAG retrieval are generated per-session, scoped to the retrieval tenant's data.
- Cloud IAM role assumptions are time-limited (1-hour maximum).
JIT access sharply limits the impact of credential theft: stolen credentials expire within minutes to an hour, which shrinks the attack window.
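The effect of short-lived, narrowly scoped credentials can be shown with a small model. This sketch does not call Vault or any cloud IAM API; `JitCredential`, `issue_credential`, `is_usable`, and the scope string format are hypothetical names chosen for illustration. The clock is injectable so expiry can be demonstrated without waiting.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class JitCredential:
    token: str
    scope: str         # the single tool/operation this credential is valid for
    expires_at: float  # absolute expiry, seconds since the epoch


def issue_credential(
    scope: str, ttl_seconds: float, now: Callable[[], float] = time.time
) -> JitCredential:
    """Mint a random token bound to one scope and a fixed lifetime."""
    return JitCredential(
        token=secrets.token_urlsafe(32),
        scope=scope,
        expires_at=now() + ttl_seconds,
    )


def is_usable(
    cred: JitCredential, required_scope: str, now: Callable[[], float] = time.time
) -> bool:
    """A credential is usable only for its own scope and only before expiry."""
    return cred.scope == required_scope and now() < cred.expires_at
```

A credential issued with a 300-second TTL for one tool operation is rejected for any other operation immediately, and for every operation once the 300 seconds have passed.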
Pillar 5: Continuous Verification and Pipeline Integrity
Zero-trust is not a static configuration — it requires continuous verification. This includes:
- Periodic rotation of SVIDs (every 1 hour) and re-authentication.
- Runtime integrity checks: each pipeline component signs its outputs; downstream components verify the signature before processing. If a prompt firewall output arrives at the model server without a valid signature from the firewall, the request is rejected.
- Anomaly detection on pipeline traffic patterns: unexpected call volumes, unusual source identities, or calls between non-adjacent components trigger alerts.
- Binary authorisation: container images in the pipeline must be signed and verified against an approved image registry before deployment.
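The runtime integrity check in the list above (each stage signs its output, the next stage verifies before processing) could look roughly like this. It is a minimal HMAC-SHA256 sketch using only the standard library. How the key is derived from or bound to the SVID is not specified here and is left as an assumption; the function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_output(stage: str, payload: dict, key: bytes) -> str:
    """Sign a stage's output; the stage name is included so signatures cannot be replayed across stages."""
    message = json.dumps(
        {"stage": stage, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_output(stage: str, payload: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_output(stage, payload, key)
    return hmac.compare_digest(expected, signature)
```

One design caveat: HMAC is symmetric, so any component holding the key can also produce valid signatures. Where the verifier must not be able to forge the signer's output, an asymmetric signature made with the SVID private key is the stronger choice.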
5. Architecture Diagram
flowchart TD
    subgraph Identity["Trust Infrastructure"]
        A[SPIRE Identity Server]
        B[OPA Policy Engine]
        C[Vault JIT Credentials]
    end
    subgraph Pipeline["AI Pipeline"]
        D[AI Gateway]
        E[Prompt Firewall]
        F[Model Server]
        G[Output Filter]
    end
    subgraph ClientZ["Client"]
        H[Application]
    end
    H -->|mTLS + JWT| D --> E --> F --> G --> H
    A -.->|SVID per workload| D
    A -.->|SVID per workload| F
    B -.->|per-request authz| D
    C -.->|JIT credentials| F
    style H fill:#dbeafe,stroke:#3b82f6
    style D fill:#f0fdf4,stroke:#22c55e
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#fef9c3,stroke:#eab308
    style G fill:#f0fdf4,stroke:#22c55e
    style A fill:#fef9c3,stroke:#eab308
    style B fill:#fef9c3,stroke:#eab308
    style C fill:#fef9c3,stroke:#eab308
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
| --- | --- | --- | --- | --- |
| SPIRE Server | Identity | Issues SVID certificates to attested workloads; rotates SVIDs every hour | SPIFFE/SPIRE OSS, Istio CA, HashiCorp Vault PKI | Critical |
| SPIRE Agent | Identity | Runs on each compute node; attests workloads; manages SVID lifecycle | SPIFFE/SPIRE Agent | Critical |
| Service Mesh Sidecar | mTLS | Transparent mTLS proxy for all inter-service communication | Envoy (Istio), Linkerd proxy, Consul Connect | Critical |
| OPA Policy Engine | Authorisation | Per-request authorisation policy evaluation for all inter-service calls | Open Policy Agent, Cedar | Critical |
| Vault (JIT Credentials) | Secrets | Dynamic, time-limited credential issuance for pipeline components | HashiCorp Vault, AWS IAM roles with short TTL, Azure Managed Identity | Critical |
| Pipeline Output Signer | Integrity | Signs outputs at each pipeline stage for downstream verification | Custom HMAC signer, SPIFFE SVID-based signing | High |
| Binary Authorisation | Supply Chain | Verifies container image signatures before deployment | Google Binary Authorization, AWS Signer, Sigstore Cosign | High |
| Anomaly Detector | Monitoring | Detects unusual inter-component call patterns | Datadog APM, Elastic SIEM, custom OTel-based detector | High |
| Distributed Tracing | Audit | End-to-end trace of every request through all pipeline components | OpenTelemetry Collector, Jaeger, AWS X-Ray | High |
7. Data Flow
Primary Flow
| Step | Actor | Action | Output |
| --- | --- | --- | --- |
| 1 | Application | Sends request to AI Gateway with mTLS client cert + JWT | Authenticated connection at gateway |
| 2 | Gateway | Validates client identity; evaluates OPA policy; generates internal request context with trace_id | Authorised request with trace_id |
| 3 | Gateway → Prompt Firewall | mTLS call using SPIFFE SVID; OPA validates gateway→firewall call authorisation | Authenticated, authorised call |
| 4 | Prompt Firewall | Processes request; signs output with SVID-derived HMAC | Signed firewall result |
| 5 | Prompt Firewall → Input Sanitiser | mTLS call; sanitiser verifies firewall output signature | Verified, authenticated handoff |
| 6 | Input Sanitiser → Model Server | mTLS call with signed sanitised prompt | Authenticated, signed prompt at model server |
| 7 | Model Server | Generates response; for tool calls, requests JIT credential from Vault per tool | Raw model response |
| 8 | Model Server → Output Filter | mTLS call with signed model output | Authenticated handoff to output filter |
| 9 | Output Filter → Gateway | mTLS response with signed filtered output | Verified response returned to gateway |
| 10 | Gateway | Returns response to application; full trace in distributed tracing system | End-to-end traced, verified response |
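To make the audit aspect of the primary flow concrete, the sketch below records one span per inter-service hop under a shared trace_id and checks that a trace covers every expected hop in order. The hop names follow the table above. The `Trace` class and `is_complete` function are hypothetical; a real deployment would emit OpenTelemetry spans rather than in-memory dictionaries.

```python
import uuid
from dataclasses import dataclass, field
from typing import List

# Inter-service hops from the primary flow (steps 3, 5, 6, 8, 9).
PIPELINE_HOPS = [
    ("gateway", "prompt-firewall"),
    ("prompt-firewall", "input-sanitiser"),
    ("input-sanitiser", "model-server"),
    ("model-server", "output-filter"),
    ("output-filter", "gateway"),
]


@dataclass
class Trace:
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: List[dict] = field(default_factory=list)

    def record_hop(self, caller: str, callee: str, authenticated: bool, authorised: bool) -> None:
        """Append one audit record for a single service-to-service call."""
        self.spans.append(
            {
                "trace_id": self.trace_id,
                "caller": caller,
                "callee": callee,
                "authenticated": authenticated,
                "authorised": authorised,
            }
        )


def is_complete(trace: Trace) -> bool:
    """True only if every expected hop was recorded, in order, as verified."""
    verified = [
        (s["caller"], s["callee"])
        for s in trace.spans
        if s["authenticated"] and s["authorised"]
    ]
    return verified == PIPELINE_HOPS
```

A trace that is missing a hop, or that contains a hop which failed authentication or authorisation, fails the completeness check and can be flagged for review.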
Error Flow
| Error | Behaviour | Alert |
| --- | --- | --- |
| SVID expired (component fails to renew) | Downstream rejects mTLS connection; component isolated | Critical: SVID renewal failure |
| OPA policy denies inter-component call | Request rejected; attempted call logged | Security: unexpected inter-component call |
| Pipeline output signature verification fails | Downstream rejects processed request | Security: pipeline integrity violation (possible MITM) |
| JIT credential request rejected by Vault | Component cannot proceed with operation; error returned | Critical: JIT credential failure |
| Binary authorisation fails | Container deployment blocked | Security: unsigned container in pipeline |
8. Security Considerations
Authentication & Authorisation
- Every component has a unique, cryptographic workload identity (SPIFFE SVID).
- Every inter-component call authenticated via mTLS using SPIFFE SVIDs.
- Every call authorised by OPA with policies that specify exactly which components may call which other components.
Secrets Management
- No standing credentials. All credentials JIT-issued by Vault with minimum TTL.
- SVID private keys never leave the compute instance.
Data Classification
- OPA policies can enforce classification-based routing: a CONFIDENTIAL request may only traverse components cleared for CONFIDENTIAL processing.
Encryption
- All inter-service communication: TLS 1.3 via mTLS.
- TLS session keys rotated with SVID rotation (every hour maximum).
- At-rest encryption on all pipeline state.
OWASP LLM Top 10 Coverage
| OWASP LLM Risk | Zero-Trust Pipeline Mitigation | Coverage |
| --- | --- | --- |
| LLM01: Prompt Injection | Pipeline integrity signatures detect prompt tampering between stages | Medium |
| LLM02: Insecure Output Handling | Component isolation limits blast radius of output handling vulnerability | Medium |
| LLM03: Training Data Poisoning | Binary authorisation prevents tampered pipeline components from being deployed | High |
| LLM04: Model Denial of Service | Per-component authorisation enables request quota enforcement at each hop | Medium |
| LLM05: Supply Chain Vulnerabilities | Binary authorisation + SVID workload attestation prevent supply chain compromise | Critical |
| LLM06: Sensitive Information Disclosure | mTLS prevents data interception in transit; JIT credentials limit access scope | High |
| LLM07: Insecure Plugin Design | Tool adapter has its own SVID and OPA policy; not implicitly trusted | High |
| LLM08: Excessive Agency | Per-component authorisation limits what each component can call | High |
| LLM09: Overreliance | Not applicable | None |
| LLM10: Model Theft | Workload identity + JIT credentials prevent unauthorised access to model weights | High |
9. Governance Considerations
Governance Artefacts
| Artefact | Owner | Frequency | Purpose |
| --- | --- | --- | --- |
| Zero Trust Policy Definitions (OPA) | Security Architecture | Reviewed quarterly; updated with pipeline changes | Documents all authorised inter-component relationships |
| SVID Issuance Audit Log | Security Operations | Continuous; weekly review | Tracks all workload identity events; detects anomalous attestations |
| Pipeline Integrity Violation Log | Security Operations | Continuous; daily review | Records all signature verification failures; triggers investigation |
| JIT Credential Audit | Compliance | Monthly | Evidence of least-privilege access for APRA/regulatory review |
| Binary Authorisation Violation Log | Security Operations | Continuous | Unauthorised container deployments |
10. Operational Considerations
SLOs
| SLO | Target | Measurement |
| --- | --- | --- |
| SVID rotation latency | <5s | SPIRE rotation metric |
| mTLS overhead per hop (p99) | <2ms (with session resumption) | Inter-service span latency |
| OPA policy evaluation latency (p99) | <3ms | OPA decision latency |
| SPIRE availability | 99.99% (critical path) | SPIRE health checks |
| Pipeline trace completeness | >99.9% (all spans captured) | OTel collector metrics |
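The per-hop targets above can be combined into a rough end-to-end overhead ceiling. The sketch assumes the five inter-service hops listed in the primary data flow and assumes every hop incurs both an mTLS cost and one OPA evaluation. Summing per-hop p99 values gives a conservative planning figure, not the true p99 of the whole path.

```python
MTLS_P99_MS = 2.0        # SLO target: mTLS overhead per hop
OPA_P99_MS = 3.0         # SLO target: OPA policy evaluation per request
INTER_SERVICE_HOPS = 5   # gateway->firewall->sanitiser->model->filter->gateway


def overhead_budget_ms(
    hops: int = INTER_SERVICE_HOPS,
    mtls_ms: float = MTLS_P99_MS,
    opa_ms: float = OPA_P99_MS,
) -> float:
    """Conservative ceiling on zero-trust overhead added across the pipeline."""
    return hops * (mtls_ms + opa_ms)


if __name__ == "__main__":
    print(f"Worst-case added latency: {overhead_budget_ms():.0f} ms")
```

With the targets as stated this gives 5 × (2 + 3) = 25 ms, which is usually small next to model inference time but worth budgeting explicitly on latency-sensitive paths.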
Incident Management
- SVID rotation failure → P1: affected component isolated; SPIRE investigation.
- Pipeline integrity violation (signature failure) → P1: security incident; possible MITM; full pipeline forensics.
- Binary authorisation failure → P2: deployment blocked; investigate image provenance.
DR
| Scenario | RTO | Recovery |
| --- | --- | --- |
| SPIRE server failure | 1min (in-flight SVIDs valid for remaining TTL) | SPIRE HA cluster; failover to secondary |
| OPA server failure | 0 (fail-closed: deny all) | OPA HA; policy cache in sidecar (last-known-good) |
| Full service mesh failure | 30min | Runbook for graceful mesh recovery; failover path without mTLS with emergency alert |
11. Cost Considerations
Cost Drivers
| Cost Driver | Description | Relative Impact |
| --- | --- | --- |
| SPIRE infrastructure | SPIRE server + agents; relatively modest compute | Low |
| Service mesh overhead | Sidecar memory per pod (~50MB); mTLS CPU (<1% on modern CPUs) | Low–Medium |
| OPA evaluation | Per-request policy evaluation overhead | Low (sub-millisecond) |
| Engineering / operations | Initial implementation and ongoing policy management | High (one-time) |
| Distributed tracing storage | Full pipeline traces stored per request; grows with traffic | Medium |
Indicative Cost Range
| Scale | Monthly Additional Cost (USD) | Notes |
| --- | --- | --- |
| Small pipeline | $300–$700 | SPIRE, OPA, service mesh infrastructure |
| Medium pipeline | $1,000–$3,000 | Larger mesh footprint; distributed tracing storage |
| Large pipeline | $3,000–$10,000 | Multi-region SPIRE; dedicated OPA cluster; high-volume tracing |
12. Trade-Off Analysis
Option Comparison
| Option | Description | Pros | Cons | Best For |
| --- | --- | --- | --- | --- |
| A: Network perimeter only | Trust internal traffic; firewall at boundary | Simple; low operational overhead | Flat network allows lateral movement; no audit of inter-service calls | Non-regulated, low-sensitivity AI applications |
| B: Service-level API keys | Per-service shared API keys | Simple to implement; light overhead | Keys are long-lived; no workload attestation; hard to rotate at scale | Transitional state toward full zero-trust |
| C: Full SPIFFE/SPIRE zero-trust (this pattern) | Cryptographic workload identity + mTLS + OPA + JIT | Strongest security posture; auditable; no credential sprawl | Significant initial implementation effort; SPIRE ops burden | Production AI pipelines in regulated industries |
| D: Cloud-native service mesh | AWS App Mesh, Azure Service Fabric, GCP Traffic Director | Cloud-managed; lower ops burden | Vendor lock-in; less control over SVID format; may not support on-premises | Cloud-committed organisations |
Architectural Tensions
| Tension | Trade-Off |
| --- | --- |
| Security vs Developer Productivity | Zero-trust adds complexity to local development. Resolution: develop with simplified mTLS (self-signed certs); enforce full SPIFFE/SPIRE only in staging and production. |
| SVID TTL vs Revocation Latency | Shorter SVID TTL (1hr) means faster revocation but more rotation overhead. Resolution: 1-hour TTL is well-established; SPIRE handles rotation transparently. |
| OPA Centralisation vs Latency | Centralised OPA adds a network hop. Resolution: deploy OPA as a local sidecar (or in-process bundle evaluation) for sub-millisecond decisions. |
13. Failure Modes
| Failure | Likelihood | Impact | Detection | Recovery |
| --- | --- | --- | --- | --- |
| SPIRE server unavailable (SVIDs cannot be renewed) | Low | Critical (all components fail when SVIDs expire) | SPIRE health check → P1 alert | SPIRE HA cluster; multi-region deployment |
| OPA policy misconfiguration (denies legitimate calls) | Medium | High (pipeline components blocked) | Pipeline error rate spike | Rollback OPA policy to previous version |
| mTLS certificate CA compromise | Very Low | Critical (all pipeline trust invalidated) | CA certificate monitoring; anomaly detection | Emergency CA rotation; pipeline restart with new SVIDs |
| Pipeline signature verification false positive | Low | Medium (legitimate calls rejected) | Signature failure rate metric | Investigate signer; update trust bundle |
| JIT credential generation bottleneck | Medium | High (tool calls blocked) | Vault latency metric | Vault scale-out; credential caching |
14. Regulatory Considerations
| Regulation | Requirement | Implementation |
| --- | --- | --- |
| APRA CPS234 §21 | Controls must address data-in-transit protection | mTLS on every inter-component call directly addresses this |
| NIST SP 800-207 (Zero Trust Architecture) | Zero trust principles: verify explicitly, use least-privilege access, assume breach | All three principles implemented: SPIFFE (verify explicitly), JIT/OPA (least privilege), anomaly detection (assume breach) |
| ISO 27001 A.9 (Access Control) | Access control to services and systems | Per-component OPA authorisation implements service access control |
| EU AI Act Art. 9 (Risk Management) | Technical risk management for high-risk AI | Zero-trust pipeline is a documented technical risk management measure |
| SOC 2 CC6.3 | Network security with least-privilege access | mTLS + JIT credentials implement SOC 2 CC6.3 |
15. Reference Implementations
AWS
| Component | AWS Service |
| --- | --- |
| Workload identity | AWS IAM Roles for Service Accounts (IRSA) + SPIRE on EKS |
| mTLS | AWS App Mesh + Envoy sidecar, or Istio on EKS |
| Policy engine | OPA on Lambda / ECS |
| JIT credentials | AWS IAM temporary credentials via STS |
| Binary authorisation | AWS Signer + ECR image signing |
| Distributed tracing | AWS X-Ray |
Azure
| Component | Azure Service |
| --- | --- |
| Workload identity | Azure Workload Identity + SPIRE |
| mTLS | Istio on AKS or Linkerd |
| Policy engine | OPA on AKS |
| JIT credentials | Azure Key Vault + Managed Identity |
| Binary authorisation | Azure Container Registry + Notary |
| Distributed tracing | Azure Monitor + Application Insights |
On-Premises
| Component | Technology |
| --- | --- |
| Workload identity | SPIFFE/SPIRE (open source) |
| mTLS | Istio or Linkerd service mesh |
| Policy engine | OPA (sidecar bundle mode) |
| JIT credentials | HashiCorp Vault |
| Binary authorisation | Sigstore Cosign + in-cluster policy controller |
| Distributed tracing | OpenTelemetry + Jaeger |
16. Related Patterns
| Pattern | ID | Relationship |
| --- | --- | --- |
| AI Gateway | EAAPL-SEC001 | Gateway is the entry point to the zero-trust pipeline |
| Model Isolation | EAAPL-SEC003 | Compute-layer isolation complements network-layer zero-trust |
| Secure Tool Invocation | EAAPL-SEC004 | Tool adapter has its own SVID and per-request authorisation in the zero-trust model |
| Secrets Management for AI | EAAPL-SEC008 | Vault underpins the JIT credential pillar of the zero-trust pipeline |
| Distributed AI Tracing | EAAPL-OBS007 | Distributed tracing provides the audit layer for zero-trust pipeline verification |
17. Maturity Assessment
Overall Maturity: Proven
| Dimension | Score (1–5) | Rationale |
| --- | --- | --- |
| Pattern definition clarity | 5 | NIST 800-207 provides clear foundation; AI-specific extensions well-defined |
| Technology availability | 4 | SPIRE, Istio, OPA are production-ready; full integration requires engineering investment |
| Industry adoption | 3 | Adopted in mature security organisations; AI-specific zero-trust still emerging |
| Implementation complexity | 2 | Significant operational complexity; requires dedicated platform engineering |
| Regulatory alignment | 5 | Grounded directly in NIST SP 800-207; strong APRA and EU AI Act alignment |
| Community knowledge | 4 | SPIFFE/SPIRE community strong; AI-specific guidance is newer |
18. Revision History
| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0 | 2024-05-01 | Security Architecture Team | Initial pattern definition |
| 1.1 | 2025-03-10 | Security Architecture Team | Added pipeline output signing; updated OWASP mapping; expanded binary authorisation guidance |