[EAAPL-RAG003] Secure Retrieval-Augmented Generation
Category: Artificial Intelligence / Retrieval-Augmented Generation
Sub-category: Access Control and Data Security
Version: 1.4
Maturity: Proven
Tags: rag security acl zero-trust data-leakage-prevention pii-redaction classification tenant-isolation
Regulatory Relevance: APRA CPS 234, Privacy Act 1988 (APP 11), EU AI Act Articles 9–13, ISO/IEC 27001, NIST AI RMF (Govern 2.2, Map 2.2), GDPR Articles 5 and 25
1. Executive Summary
Secure RAG is the enterprise access-control overlay that must be applied to any RAG deployment handling sensitive, confidential, or regulated content. It enforces three non-negotiable guarantees: users retrieve only documents they are authorised to access (pre-retrieval ACL enforcement), personally identifiable information in retrieved context is redacted before it is sent to the language model (PII redaction), and the classification level of source documents is preserved and propagated to the AI-generated output (classification inheritance).
For CISOs and compliance officers, Secure RAG operationalises zero-trust principles in the AI retrieval layer — treating every query as potentially adversarial and every retrieved document as a potential data leakage vector. The pattern prevents the most dangerous anti-pattern in enterprise RAG deployment: a vector database that returns results based purely on semantic similarity, ignorant of who the querying user is. Without this pattern, a RAG system in a financial institution could return a relationship manager's confidential client notes to an unrelated analyst, or an HR assistant could surface salary bands to individual contributors. The pattern is a mandatory overlay for any RAG deployment in a regulated industry, and a strong recommendation for all enterprise deployments.
2. Problem Statement
Business Problem
Enterprise documents carry access controls for a reason: commercial sensitivity, regulatory obligation, personal privacy, and legal privilege. A RAG system that ignores these controls creates a new attack surface where semantic similarity search can be exploited to extract restricted information — not by bypassing network controls, but by crafting queries that are semantically similar to restricted content.
Technical Problem
Standard vector similarity search is identity-blind: it returns the most semantically similar vectors regardless of which user is asking. Naive RAG implementations add ACL filtering as an afterthought at the response layer ("filter after retrieval"), which is insufficient — the LLM has already processed restricted content by the time the filter is applied. Furthermore, LLMs have no native understanding of document classification; without explicit engineering, they may include classified content in responses without any indication of the source document's sensitivity.
Symptoms
- Internal audit identifies that employees can query the AI assistant and receive answers containing information from documents they cannot access directly
- Helpdesk tickets from users who received answers containing another user's personal data (salary, medical history, disciplinary record)
- Security team discovers that PII from HR records is present verbatim in AI-generated responses
- Compliance team cannot demonstrate during a regulatory examination that AI-generated answers respect document access controls
Cost of Inaction
- Regulatory penalty: Privacy Act 1988 APP 11 breach; GDPR Article 83 fines (up to 4% of global annual turnover); APRA CPS 234 supervisory action
- Legal liability: disclosure of legally privileged documents via RAG may constitute a waiver of privilege
- Reputational damage: cross-tenant data leakage in a SaaS RAG deployment is a reportable data breach
- Staff trust erosion: employees will not use an AI assistant they believe may expose their personal information to colleagues
3. Context
When to Apply
- Any RAG deployment in a regulated industry (financial services, healthcare, legal, government)
- Multi-tenant RAG deployments where different organisations share a vector index
- RAG systems handling personal information (HR, customer records, medical notes)
- RAG systems handling legally privileged, commercially sensitive, or national security-classified content
- Any RAG system subject to GDPR, Privacy Act, HIPAA, or equivalent data protection regulation
When NOT to Apply
- RAG over exclusively public-domain knowledge (open-source code, public web content) with no access restrictions
- Prototyping and development environments with fully synthetic, non-sensitive data
Prerequisites
- Enterprise identity provider (Azure AD, Okta, AWS IAM) with consistent group membership data
- Document metadata including access control lists (who or which groups can access each document) at ingestion time
- PII detection capability (either a dedicated NER model or a cloud PII detection service)
- Agreed document classification taxonomy (e.g., UNCLASSIFIED / OFFICIAL / SENSITIVE / PROTECTED)
Industry Applicability
| Industry | Primary Security Concern | Key Controls Required |
| Financial Services | Cross-client data leakage; insider trading data exposure | Per-client namespace isolation; strict ACL for market-sensitive research |
| Healthcare | Patient record disclosure; clinical note leakage | Row-level security by patient-care-team relationship; HIPAA Safe Harbor de-identification |
| Legal | Legal professional privilege; client confidentiality | Matter-level access control; privilege metadata preserved through RAG pipeline |
| Government | Classified information disclosure | Classification label inheritance; mandatory access control (MAC) enforcement |
| HR / People Analytics | Salary, performance, disciplinary data exposure | Role-based access with explicit manager/HR business partner scope |
| Multi-tenant SaaS | Cross-tenant data leakage | Strict namespace/tenant isolation at vector index layer |
4. Architecture Overview
Secure RAG introduces security enforcement at three distinct points in the pipeline: before retrieval (pre-filter), between retrieval and generation (PII redaction), and after generation (output classification). Critically, none of these enforcement points is optional — a "defence in depth" approach requires all three.
Pre-Retrieval ACL Enforcement (The Primary Control)
The pre-retrieval filter is the most important control. At ingestion time, every chunk is tagged with the ACL principals that are permitted to retrieve it: a set of user IDs and/or group IDs derived from the source document's access control list. At query time, the vector search is constrained by a metadata filter that restricts results to chunks where the querying user's identity or group membership is in the permitted set.
This approach, ACL-as-metadata-filter, is architecturally superior to post-retrieval ACL filtering for three reasons: (1) restricted chunks never enter the LLM context window, eliminating the risk of the LLM accidentally including restricted information in its output; (2) it scales with vector database metadata filtering performance (typically <5ms overhead, although this depends on index size and filter selectivity); and (3) it is auditable, because the metadata filter expression is logged alongside the query for compliance evidence.
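The filter construction described above can be sketched as follows. This is a minimal illustration, not a vendor API: the Mongo-style operators ($and, $or, $in), the function names build_acl_filter and matches, and the ordering of the example taxonomy are all assumptions, and real vector databases each have their own filter syntax. The small matches evaluator exists only so the filter can be exercised without a database.

```python
from typing import Any

# Assumed ordering of the example taxonomy, lowest to highest sensitivity.
CLASSIFICATION_ORDER = ["UNCLASSIFIED", "OFFICIAL", "SENSITIVE", "PROTECTED"]


def build_acl_filter(user_id: str, groups: list[str], max_clearance: str) -> dict[str, Any]:
    """Return a metadata filter that admits only chunks the user may retrieve."""
    if max_clearance not in CLASSIFICATION_ORDER:
        raise ValueError(f"unknown clearance: {max_clearance}")
    # Every level at or below the user's clearance is permitted.
    allowed_levels = CLASSIFICATION_ORDER[: CLASSIFICATION_ORDER.index(max_clearance) + 1]
    return {
        "$and": [
            {"$or": [
                {"groups_allowed": {"$in": sorted(set(groups))}},
                {"users_allowed": {"$in": [user_id]}},
            ]},
            {"classification": {"$in": allowed_levels}},
        ]
    }


def matches(chunk_meta: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Hypothetical in-memory evaluator for the filter, for testing only."""
    if "$and" in flt:
        return all(matches(chunk_meta, f) for f in flt["$and"])
    if "$or" in flt:
        return any(matches(chunk_meta, f) for f in flt["$or"])
    (field, cond), = flt.items()
    value = chunk_meta.get(field, [])  # a missing field matches nothing (fail closed)
    values = value if isinstance(value, list) else [value]
    return any(v in cond["$in"] for v in values)
```

Because a chunk with no ACL metadata matches nothing, the sketch fails closed. The returned dictionary is also the expression that would be written to the audit log alongside the query.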
ACL metadata must be kept synchronised with the source access control system. When a user's group membership changes (e.g., an employee changes teams), or a document's ACL is updated, the vector metadata must be updated within the organisation's defined ACL propagation SLA. A dedicated ACL sync job runs continuously, processing access control change events from the identity provider and updating affected chunk metadata in the vector database.
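As a rough sketch of the sync job's core step, the code below applies a document ACL change to every chunk of that document. The event shape (AclChangeEvent), the per-chunk acl_version counter, and the in-memory store are illustrative assumptions; a real job would consume events from the identity provider or document system and issue metadata updates to the vector database.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkMeta:
    document_id: str
    groups_allowed: set[str] = field(default_factory=set)
    users_allowed: set[str] = field(default_factory=set)
    acl_version: int = 0  # version of the last ACL event applied (assumed field)


@dataclass(frozen=True)
class AclChangeEvent:
    document_id: str
    version: int  # assumed to increase monotonically per document
    groups_allowed: frozenset[str]
    users_allowed: frozenset[str]


def apply_acl_event(store: dict[str, ChunkMeta], event: AclChangeEvent) -> int:
    """Apply an ACL change to every chunk of the document; return chunks updated."""
    updated = 0
    for meta in store.values():
        if meta.document_id != event.document_id:
            continue
        if event.version <= meta.acl_version:
            continue  # stale or duplicate delivery: ignore
        meta.groups_allowed = set(event.groups_allowed)
        meta.users_allowed = set(event.users_allowed)
        meta.acl_version = event.version
        updated += 1
    return updated
```

The version check makes the handler idempotent, which matters when webhook deliveries are retried or arrive out of order.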
Namespace / Tenant Isolation
For multi-tenant deployments, namespace isolation provides an additional hard boundary. Each tenant's documents are stored in a separate namespace (Pinecone), tenant (Weaviate), or index (OpenSearch). Queries are always scoped to a single tenant, and there is no cross-namespace search. This prevents both accidental and intentional cross-tenant data access even if an ACL metadata filter were somehow bypassed.
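One way to make the single-tenant scoping hard to bypass in application code is to bind the tenant when the search client is constructed, so callers cannot choose a namespace per query. The sketch below assumes a hypothetical search_fn(namespace, vector, top_k) callable standing in for the vector database client; the class and exception names are illustrative.

```python
from typing import Callable, Optional

SearchFn = Callable[[str, list[float], int], list[dict]]


class CrossTenantAccessError(PermissionError):
    """Raised when a caller asks for a namespace other than its own."""


class TenantScopedSearch:
    def __init__(self, tenant_id: str, search_fn: SearchFn):
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._tenant_id = tenant_id  # fixed for the lifetime of the object
        self._search_fn = search_fn

    def search(self, vector: list[float], top_k: int = 5,
               namespace: Optional[str] = None) -> list[dict]:
        # Any explicit namespace must equal the bound tenant; otherwise refuse
        # before the vector database is contacted.
        if namespace is not None and namespace != self._tenant_id:
            raise CrossTenantAccessError(
                f"tenant {self._tenant_id!r} may not query namespace {namespace!r}"
            )
        return self._search_fn(self._tenant_id, vector, top_k)
```

A raised CrossTenantAccessError is the kind of event the monitoring section treats as a cross-namespace query attempt.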
PII Redaction in Retrieved Context
Before the retrieved chunks are assembled into the LLM prompt, a PII detection and redaction step scans each chunk for personal information. This serves two purposes: (1) it prevents the LLM from echoing verbatim PII in its responses (important for privacy compliance even when the querying user is authorised to access the document), and (2) it reduces the risk of PII appearing in AI-generated outputs that may be cached, logged, or shared.
PII redaction uses a Named Entity Recognition (NER) model to identify personal information categories (name, email, phone, national ID, financial account, medical record number) and replaces them with typed placeholders: [PERSON_NAME], [EMAIL_ADDRESS], [TFN]. The original values are not discarded — they are stored in a redaction mapping keyed by the session ID, so that the response can be post-processed to either restore originals (for authorised users) or maintain redacted form (for less-privileged users or logged outputs).
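The placeholder-and-mapping mechanism can be illustrated with a small sketch. Two simplifications are assumed here and should be read as such: regular expressions stand in for the NER model, and placeholders carry a numeric suffix (for example [EMAIL_ADDRESS_1]) so that originals can be restored unambiguously. The function names and the in-process dictionary used as the session store are hypothetical.

```python
import re

# Regex stand-ins for an NER model; the patterns are illustrative only.
PII_PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "TFN": re.compile(r"\b\d{3} \d{3} \d{3}\b"),
}

# session_id -> {placeholder: original value}; a real deployment would use a
# protected, expiring store rather than process memory.
_REDACTION_MAPS: dict[str, dict[str, str]] = {}


def redact(session_id: str, text: str) -> str:
    """Replace detected PII with typed placeholders and record the mapping."""
    mapping = _REDACTION_MAPS.setdefault(session_id, {})
    for label, pattern in PII_PATTERNS.items():
        def _sub(match: re.Match, label: str = label) -> str:
            original = match.group(0)
            # Reuse the placeholder if this value was already seen in the session.
            for placeholder, value in mapping.items():
                if value == original:
                    return placeholder
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = original
            return placeholder
        text = pattern.sub(_sub, text)
    return text


def restore(session_id: str, text: str) -> str:
    """Restore originals for an authorised user of the same session."""
    for placeholder, original in _REDACTION_MAPS.get(session_id, {}).items():
        text = text.replace(placeholder, original)
    return text
```

Calling restore with a different session ID leaves the placeholders in place, which corresponds to the redacted form kept for less-privileged users and logged outputs.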
The PII redaction step must be tuned per domain: financial documents have different PII signatures to HR records. A healthcare deployment may use HIPAA Safe Harbor de-identification rather than NER-based redaction.
Classification Inheritance and Output Labelling
Every retrieved chunk carries its source document's classification label. The assembled context window's classification is the maximum classification of all included chunks. The LLM system prompt includes this classification, and the generated response is automatically labelled with the classification level. If the assembled context includes any PROTECTED chunks, the response is labelled PROTECTED and the user's application must enforce appropriate handling (e.g., no copy to clipboard, no email forwarding, mandatory acknowledgement).
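The max-classification rule is simple enough to state in code. The sketch below assumes the example taxonomy is ordered from UNCLASSIFIED (lowest) to PROTECTED (highest) and that an empty context is labelled at the lowest level; both are assumptions, as are the function names. The header format follows the [SENSITIVITY: ...] label used in the data flow.

```python
from typing import Iterable

# Assumed ordering of the example taxonomy, lowest to highest.
CLASSIFICATION_ORDER = ("UNCLASSIFIED", "OFFICIAL", "SENSITIVE", "PROTECTED")
_RANK = {label: rank for rank, label in enumerate(CLASSIFICATION_ORDER)}


def aggregate_classification(labels: Iterable[str]) -> str:
    """Return the highest classification among the retrieved chunks."""
    ranks = []
    for label in labels:
        if label not in _RANK:
            # An unknown label is an error rather than a silent downgrade.
            raise ValueError(f"unknown classification label: {label!r}")
        ranks.append(_RANK[label])
    if not ranks:
        return CLASSIFICATION_ORDER[0]  # assumption: empty context is lowest level
    return CLASSIFICATION_ORDER[max(ranks)]


def label_response(response: str, chunk_labels: Iterable[str]) -> str:
    """Prefix the response with its inherited classification."""
    return f"[SENSITIVITY: {aggregate_classification(chunk_labels)}]\n{response}"
```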
Output Scanner
A post-generation output scanner validates that the LLM response does not contain verbatim text from documents above the user's maximum clearance level, and scans for residual PII that may have bypassed the redaction step. If either check fails, the response is blocked and the query is flagged for security review.
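A minimal sketch of the two scanner checks follows. It assumes the caller can supply candidate restricted passages to compare against (how those are selected is not specified by this pattern), uses shared word n-grams as the verbatim-overlap test, and uses a single email regex as a stand-in for a fuller residual-PII detector. The function names and the 8-word window are illustrative choices.

```python
import re

_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def _shingles(text: str, n: int) -> set[tuple[str, ...]]:
    """All runs of n consecutive lower-cased words in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def scan_output(response: str, restricted_texts: list[str], n: int = 8) -> dict:
    """Return a pass/block decision with the reasons for any block."""
    reasons = []
    response_shingles = _shingles(response, n)
    # Verbatim check: any shared run of n words counts as leakage.
    if any(response_shingles & _shingles(t, n) for t in restricted_texts):
        reasons.append("verbatim_restricted_content")
    # Residual PII check on the generated text itself.
    if _EMAIL.search(response):
        reasons.append("residual_pii")
    return {"allowed": not reasons, "reasons": reasons}
```

A smaller n catches shorter quotations at the cost of more false positives, which is the tuning trade-off recorded in the scanner's false positive/negative log.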
5. Architecture Diagram
flowchart TD
subgraph Ingestion["Secure Ingestion"]
A[Source Document]
B[Classifier + PII Flagger]
C[Isolated Vector Namespace]
end
subgraph Query["Secure Query Pipeline"]
D[Authenticated User]
E[ACL Pre-filter]
F[PII Redaction]
I[LLM Generation]
G[Output Scanner]
end
subgraph Audit["Compliance"]
H[Audit Logger]
end
A --> B -->|chunks + ACL metadata| C
D --> E -->|ACL-filtered search| C
C --> F --> I --> G --> D
E --> H
G --> H
style A fill:#dbeafe,stroke:#3b82f6
style B fill:#f0fdf4,stroke:#22c55e
style C fill:#fef9c3,stroke:#eab308
style D fill:#dbeafe,stroke:#3b82f6
style E fill:#f0fdf4,stroke:#22c55e
style F fill:#f0fdf4,stroke:#22c55e
style G fill:#fee2e2,stroke:#ef4444
style H fill:#fef9c3,stroke:#eab308
style I fill:#d1fae5,stroke:#10b981
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
| Classification Tagger | Data Processing | Assign classification label to each document at ingestion | Microsoft Purview, custom NLP classifier, manual metadata field | Critical |
| PII Presence Flagger | NLP | Identify documents containing personal information for special handling | AWS Comprehend, Azure AI Language (PII detection), spaCy NER, Presidio | High |
| ACL Metadata Attachment | Data Processing | Encode groups_allowed and users_allowed as chunk metadata at ingestion | Custom Python; vector DB metadata schema | Critical |
| ACL Sync Job | Integration | Monitor identity provider for group changes; update vector metadata | Azure Function + Microsoft Graph Delta API; AWS Lambda + Okta webhooks | Critical |
| AuthN/AuthZ Gateway | Security | Validate OAuth tokens; enforce endpoint-level authorisation | Azure API Management, AWS API Gateway + Cognito, Apigee, Kong | Critical |
| Group Membership Resolver | Security | Expand user identity to transitive group membership set | LDAP/AD group expansion; Okta Groups API; custom resolver | Critical |
| Pre-Retrieval ACL Filter | Security | Build and inject metadata filter expression into vector query | Custom Python + vector DB metadata filter syntax | Critical |
| Namespace Isolation | Security | Restrict vector search to tenant/classification namespace | Pinecone namespaces, Weaviate tenants, OpenSearch index-per-tenant | Critical |
| PII Redaction Engine | NLP / Security | Detect and replace PII in retrieved chunks before context assembly | Microsoft Presidio, AWS Comprehend Detect PII, Google DLP API | High |
| Classification Aggregator | Security | Compute max classification across all retrieved chunks | Custom Python aggregator | High |
| Output Scanner | Security | Validate generated response for verbatim leakage and residual PII | Custom scanner + Presidio + regex patterns | High |
| Audit Logger | Compliance | Record full audit trail per query | Splunk, Datadog, AWS CloudTrail, Azure Sentinel | Critical |
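The Group Membership Resolver's transitive expansion can be sketched as a breadth-first walk over a group-to-parent-groups map. The map itself (parent_groups) and the function name are assumptions; in practice the relationships would come from the directory service.

```python
from collections import deque
from typing import Iterable, Mapping


def expand_groups(direct_groups: Iterable[str],
                  parent_groups: Mapping[str, Iterable[str]]) -> set[str]:
    """Return the direct groups plus every group reachable through nesting."""
    seen = set(direct_groups)
    queue = deque(seen)
    while queue:
        group = queue.popleft()
        for parent in parent_groups.get(group, ()):
            if parent not in seen:  # also guards against cyclic nesting
                seen.add(parent)
                queue.append(parent)
    return seen
```

The resulting set is what the Pre-Retrieval ACL Filter compares against each chunk's groups_allowed metadata.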
7. Data Flow
Primary Flow
| Step | Actor | Action | Output |
| 1 | Source Document + Metadata | Ingested into pipeline with ACL and classification metadata | Raw document with ACL principals and classification label |
| 2 | Classification Tagger | Assign classification label if not already present | {classification: "SENSITIVE"} metadata |
| 3 | PII Presence Flagger | Flag document if PII is present; set contains_pii: true metadata | PII presence flag per document |
| 4 | Chunking Engine | Split document into chunks; inherit ACL and classification per chunk | Chunks with {acl, classification, contains_pii} metadata |
| 5 | ACL Metadata Attachment | Write groups_allowed list to each chunk's vector metadata | Chunk with ACL metadata payload |
| 6 | Embedding + Vector DB | Embed chunk; upsert with metadata | Indexed vector with security metadata |
| 7 | User Request | Authenticated HTTP request with Bearer token | Token + query string |
| 8 | AuthN/AuthZ Gateway | Validate JWT; extract sub, groups, classification_clearance claims | Validated identity context |
| 9 | Group Membership Resolver | Expand user's groups to full transitive membership set | {user_id, groups: [...], max_clearance: "SENSITIVE"} |
| 10 | Pre-Retrieval ACL Filter | Construct filter: groups_allowed CONTAINS_ANY user.groups AND classification <= user.max_clearance | Metadata filter expression for vector query |
| 11 | Vector ANN Search | Execute scoped search with ACL filter | Top-K chunks from permitted documents only |
| 12 | PII Redaction Engine | Scan each chunk for PII; replace with typed placeholders | Redacted chunks + redaction map keyed by session ID |
| 13 | Classification Aggregator | Compute max classification of all retrieved chunks | max_classification: "SENSITIVE" |
| 14 | Context Assembler | Assemble prompt with classification label in system context | Classified prompt |
| 15 | LLM | Generate response referencing cited sources | Raw response |
| 16 | Output Scanner | Scan for verbatim restricted content; scan for residual PII | Pass / Block decision |
| 17 | Classification Label Attachment | Label response with max classification | [SENSITIVITY: SENSITIVE] header on response |
| 18 | Audit Logger | Log: user_id, query, chunk_ids, ACL filter expression, classification, response hash, latency | Immutable audit record |
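The audit record in step 18 might be assembled as in the sketch below. The field names follow the list in the table. Hash-chaining each record to its predecessor is an added assumption, shown as one possible way to make tampering detectable at the application layer; it does not replace WORM storage. The function names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Optional


def _digest(payload: dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_audit_record(user_id: str, query: str, chunk_ids: list[str],
                       acl_filter: dict[str, Any], classification: str,
                       response: str, latency_ms: int, prev_hash: str = "",
                       timestamp: Optional[str] = None) -> dict[str, Any]:
    record = {
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "query": query,
        "chunk_ids": list(chunk_ids),
        "acl_filter": acl_filter,
        "classification": classification,
        # Only a hash of the response is stored, not the response text.
        "response_hash": hashlib.sha256(response.encode("utf-8")).hexdigest(),
        "latency_ms": latency_ms,
        "prev_hash": prev_hash,  # assumed chaining field
    }
    record["record_hash"] = _digest(record)
    return record


def verify_chain(records: list[dict[str, Any]]) -> bool:
    """Check that each record is unmodified and linked to its predecessor."""
    prev = ""
    for rec in records:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        if rec["prev_hash"] != prev or _digest(body) != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True
```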
Error Flow
| Error Condition | Detection | Recovery |
| Output scanner detects verbatim restricted content | Post-generation scan | Block response; return "Unable to provide answer"; flag P1 security incident |
| ACL sync lag > SLA (user's permission changed but metadata not updated) | ACL sync monitoring | Serve degraded results; alert; for high-classification queries enforce real-time ACL check |
| PII redaction model error | Runtime exception | Block query; return error to user; do not serve partially redacted context to LLM |
| Token validation failure | AuthN gateway | Return 401; log failed authentication attempt |
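The "do not serve partially redacted context" rule for PII redaction errors amounts to an all-or-nothing wrapper around the redactor. A minimal sketch, assuming a hypothetical redactor callable that takes and returns a chunk's text:

```python
from typing import Callable


class QueryBlocked(RuntimeError):
    """Raised when the query must be rejected rather than served."""


def redact_or_block(chunks: list[str], redactor: Callable[[str], str]) -> list[str]:
    """Redact every chunk, or raise without returning any partial result."""
    redacted = []
    for chunk in chunks:
        try:
            redacted.append(redactor(chunk))
        except Exception as exc:
            # Fail closed: nothing collected so far is handed to the caller.
            raise QueryBlocked("PII redaction failed; no context was sent to the LLM") from exc
    return redacted
```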
8. Security Considerations
Defence in Depth Layers
| Layer | Control | Failure Mode if Absent |
| Network | mTLS between all pipeline components; private VPC endpoints to vector DB | Man-in-the-middle on inter-component traffic |
| Authentication | OAuth 2.0 / OIDC; short-lived tokens (15-min expiry); refresh token rotation | Replay attack with stolen long-lived token |
| Pre-retrieval ACL | Metadata filter on every vector query | User retrieves documents they cannot access |
| Namespace isolation | Tenant-scoped vector DB namespace | Cross-tenant data leakage |
| PII Redaction | NER-based PII removal before LLM context | LLM echoes verbatim PII in response |
| Classification inheritance | Max-classification propagation to output | Unmarked response containing classified content |
| Output scanner | Post-generation verbatim leak check | Restricted verbatim content or residual PII that slipped past earlier controls reaches the user |
| Audit logging | Immutable per-query audit trail | Inability to investigate data leakage incidents |
OWASP LLM Top 10 Mitigations
| OWASP LLM Risk | Secure RAG Specific Mitigation |
| LLM01: Prompt Injection | All retrieved chunks treated as untrusted data; system prompt injection guard blocks instruction-like patterns in chunk content |
| LLM06: Sensitive Information Disclosure | Pre-retrieval ACL filter + PII redaction + output scanner form a three-layer defence |
| LLM04: Model Denial of Service | Query rate limiting per user/tenant; PII redaction workload bounded by chunk count |
| LLM02: Insecure Output Handling | Output scanner validates response before delivery; classification label enforced on response object |
9. Governance Considerations
Access Control Review Cadence
- ACL mappings reviewed quarterly as part of access certification process
- Any change to a document's classification level triggers immediate re-ingestion of that document
- Group membership changes in identity provider must be reflected in vector metadata within 1 hour (Tier 1 documents) or 4 hours (Tier 2 documents)
PII Redaction Governance
- PII redaction rules reviewed by Privacy Officer annually
- New PII categories (e.g., biometric identifiers) added to redaction engine within 30 days of Privacy Act amendment
- Redaction effectiveness tested quarterly with synthetic PII insertion
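The quarterly synthetic-insertion test can be reduced to a recall calculation: plant known PII values in sample text, run the redactor, and count how many planted values survive. The sketch below assumes a hypothetical redactor callable and a list of (text, planted values) pairs; the function name is illustrative.

```python
from typing import Callable


def redaction_recall(redactor: Callable[[str], str],
                     samples: list[tuple[str, list[str]]]) -> float:
    """Fraction of planted PII values that no longer appear after redaction."""
    total = 0
    leaked = 0
    for text, planted_values in samples:
        redacted = redactor(text)
        for value in planted_values:
            total += 1
            if value in redacted:  # the planted value survived redaction
                leaked += 1
    if total == 0:
        raise ValueError("no planted PII values supplied")
    return (total - leaked) / total
```

The result can be compared directly against the recall target in the service level objectives.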
Governance Artefacts
| Artefact | Owner | Frequency | Purpose |
| ACL Mapping Audit Report | Security | Monthly | Evidence of correct access control enforcement |
| PII Redaction Effectiveness Report | Privacy Officer | Quarterly | Validate redaction coverage across document types |
| Classification Inheritance Spot Check | Compliance | Quarterly | Verify response classification labels match source document classifications |
| Security Incident Register (RAG-specific) | CISO | Per event | Track and classify data leakage or access control failures |
| Output Scanner False Positive/Negative Log | AI Operations | Weekly | Tune scanner to reduce false positives while maintaining sensitivity |
10. Operational Considerations
Monitoring
| Metric | Alert Threshold | Severity |
| Output scanner blocks (per hour) | > 5 | P1 (potential data leakage attempt) |
| ACL sync lag (minutes behind IdP) | > 60 min (Tier 1 docs) | P2 |
| PII redaction model error rate | > 0.1% of queries | P2 |
| Failed authentication rate (per user) | > 5 failures in 10 min | P1 (potential brute force) |
| Cross-namespace query attempt | Any | P0 (immediate security investigation) |
Service Level Objectives
| SLO | Target | Measurement |
| ACL filter correctness (sampled) | 100% | Quarterly penetration test |
| PII redaction coverage on test set | ≥ 99% recall | Monthly automated test |
| Output scanner false negative rate | 0% on test set | Monthly |
| Query response P95 with security controls | ≤ 3.5 seconds | Rolling 7-day |
Disaster Recovery
| Component | RTO | RPO | DR Strategy |
| Vector Index with ACL Metadata | 1 hour | 1 hour | Cross-region replica; ACL metadata backed up separately |
| ACL Sync Job | 30 minutes | Real-time (event-driven) | Active-active deployment; IdP webhook retry |
| PII Redaction Service | 15 minutes | N/A | Horizontally scaled; auto-scaling group |
11. Cost Considerations
Cost Drivers
| Cost Driver | Notes | Optimisation |
| PII detection at ingestion | Per-document NER inference | Batch processing; cache PII scan results per content hash |
| PII redaction at query time | Per-chunk NER inference at query time | Cache redaction results per chunk; redact at ingestion for static documents |
| ACL sync job compute | Continuous running job | Event-driven architecture (only process changes) rather than full re-sync |
| Output scanner | Per-response LLM or regex scan | Regex-first with LLM escalation only for uncertain cases |
| Namespace isolation overhead | Separate indexes per tenant may increase index management cost | Shared index with strict metadata filtering for lower-risk multi-tenancy |
Indicative Cost Range
| Deployment Scale | Security Overhead vs. Base RAG | Notes |
| Small (<1M vectors, <10K queries/day) | +30–50% | PII detection and ACL sync are fixed costs |
| Medium (1M–10M vectors, 10K–100K queries/day) | +15–25% | Economies of scale; PII caching reduces query-time cost |
| Large (>10M vectors, >100K queries/day) | +10–15% | Security controls amortise over query volume |
12. Trade-Off Analysis
ACL Enforcement Strategy Comparison
| Option | Security Strength | Performance Impact | Operational Complexity | Recommendation |
| Pre-retrieval metadata filter (recommended) | Highest | Minimal (<5ms) | Medium | Default for all deployments |
| Post-retrieval filter (after ANN search) | Low (LLM sees restricted content) | None | Low | Not recommended for sensitive content |
| Separate index per user | Highest | None | Extremely High (N×index cost) | Only for extreme isolation requirements |
| Hybrid (namespace isolation + ACL filter) | Highest | Minimal | High | Recommended for multi-tenant SaaS |
PII Handling Strategy Comparison
| Option | Privacy Protection | User Experience | Complexity |
| Redact at ingestion (static redaction) | High | Consistent; redacted text always returned | Low operational overhead; loses original for authorised users |
| Redact at query time based on user clearance | Highest | Authorised users see originals; others see redacted | High; requires session-scoped redaction mapping |
| No redaction (rely on ACL only) | Low | Best | Insufficient for privacy compliance |
Architectural Tensions
| Tension | Trade-off | Recommendation |
| ACL sync latency vs. security | Low-latency ACL sync raises operating cost; high-latency sync leaves a window of incorrect access | Risk-tiered SLA: 15 minutes for PROTECTED, 1 hour for SENSITIVE, 4 hours for OFFICIAL |
| PII redaction completeness vs. answer quality | Aggressive redaction removes useful context; permissive redaction exposes PII | Domain-tuned NER model; human review of redaction rules quarterly |
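The risk-tiered SLA lends itself to a small lookup that query-time code or a monitor can call to decide whether ACL metadata is fresh enough for a given classification. The three durations come from the table above; the function name and the choice to raise on a label with no defined SLA are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Propagation SLAs per classification, as recommended in the table above.
ACL_SYNC_SLA = {
    "PROTECTED": timedelta(minutes=15),
    "SENSITIVE": timedelta(hours=1),
    "OFFICIAL": timedelta(hours=4),
}


def acl_sync_is_fresh(classification: str, last_synced: datetime, now: datetime) -> bool:
    """True if the last successful ACL sync is within the SLA for this level."""
    try:
        sla = ACL_SYNC_SLA[classification]
    except KeyError:
        raise ValueError(f"no ACL sync SLA defined for {classification!r}") from None
    return now - last_synced <= sla
```

When the check returns False, the recovery described in the error flow applies, for example enforcing a real-time ACL check for high-classification queries.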
13. Failure Modes
| Failure Mode | Likelihood | Impact | Detection | Recovery |
| ACL metadata not set on new document (ingestion gap) | Medium | Critical | ACL presence validation at ingestion; zero-ACL alert | Reject document from index until ACL metadata populated |
| PII present in response despite redaction (NER miss) | Low | High | Output scanner; user report | Block response; retrain/fine-tune NER model; report to Privacy Officer |
| User impersonation via stolen JWT | Low | Critical | Anomalous access pattern detection; impossible geography | Immediate token revocation; security incident response |
| Namespace escape bug in vector DB | Very Low | Critical | Penetration testing; anomalous cross-namespace result in monitoring | Immediate service suspension; vendor patch; full audit |
| ACL sync job failure (identity provider outage) | Low | Medium | ACL sync monitoring | Serve queries with last-known ACL; block queries for PROTECTED documents until sync resumes |
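The first failure mode, a document indexed without ACL metadata, is commonly guarded by a validation gate in the ingestion path. A minimal sketch, assuming chunk metadata is a dictionary with the groups_allowed, users_allowed and classification keys used elsewhere in this pattern (the exception and function names are hypothetical):

```python
from typing import Any


class IngestionRejected(ValueError):
    """Raised when a chunk must not be written to the vector index."""


def validate_for_indexing(chunk_meta: dict[str, Any]) -> dict[str, Any]:
    """Reject chunks that carry no ACL principals or no classification label."""
    groups = chunk_meta.get("groups_allowed") or []
    users = chunk_meta.get("users_allowed") or []
    if not groups and not users:
        # Zero-ACL chunk: keep it out of the index until the ACL is populated.
        raise IngestionRejected("chunk has no ACL principals")
    if not chunk_meta.get("classification"):
        raise IngestionRejected("chunk has no classification label")
    return chunk_meta
```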
14. Regulatory Considerations
| Regulation | Specific Requirement | Secure RAG Implementation |
| Privacy Act 1988 APP 11 | Security of personal information | PII redaction; ACL enforcement; encrypted at rest and in transit; audit logs |
| GDPR Article 25 (Privacy by Design) | Privacy controls embedded in system design, not added later | Pre-retrieval ACL filter and PII redaction are first-class architectural components |
| GDPR Article 17 (Right to Erasure) | Ability to delete all personal information about an individual | Chunk deletion pipeline: remove vectors, document store records, and PII redaction cache for specified individual |
| APRA CPS 234 | Information asset protection commensurate with criticality | Classification-tiered ACL SLAs; encrypted vector index; WORM audit logs |
| EU AI Act Article 9 (Risk Management) | Risk management system for high-risk AI | Security risk assessment covering ACL bypass, PII leakage, and prompt injection scenarios |
| ISO/IEC 27001 A.8.3 | Information access restriction | Pre-retrieval ACL filter as primary access restriction control; documented in ISMS |
15. Reference Implementations
AWS
- ACL sync: Lambda triggered by Okta/Azure AD webhooks → update OpenSearch k-NN document metadata
- PII redaction: Amazon Comprehend PII detection + custom redaction Lambda
- Namespace isolation: OpenSearch index-per-tenant with IAM resource-based policy
- Output scanner: Lambda post-processor with Amazon Comprehend + regex
- Audit logging: CloudTrail + CloudWatch Logs to WORM S3 bucket
Azure
- ACL sync: Azure Function triggered by Azure AD Group Change notifications via Microsoft Graph subscriptions
- PII redaction: Azure AI Language PII detection API
- Namespace isolation: Azure AI Search index-per-tenant with managed identity RBAC
- Classification labels: Microsoft Purview Sensitivity Labels propagated through pipeline
- Audit logging: Azure Monitor + Azure Sentinel with immutable storage
GCP
- ACL sync: Cloud Run triggered by Google Workspace Directory push notifications
- PII redaction: Google Cloud DLP (Data Loss Prevention) API
- Namespace isolation: Vertex AI Vector Search index-per-tenant
- Audit logging: Cloud Audit Logs + BigQuery audit export
16. Related Patterns
| Pattern ID | Pattern Name | Relationship |
| EAAPL-RAG001 | Enterprise RAG | Foundation; RAG003 is a mandatory security overlay |
| EAAPL-RAG002 | Multi-Source RAG | Complementary; cross-source ACL enforcement requires the RAG003 ACL normaliser |
| EAAPL-RAG004 | Federated RAG | Complementary; federation adds data sovereignty; RAG003 adds access control within each node |
| EAAPL-KNW004 | Vector Database Management | Provides the namespace isolation and backup strategy for the secure vector index |
17. Maturity Assessment
Overall Maturity: Proven. ACL-aware vector search is well understood and supported in the major vector databases; PII redaction tooling is mature; and the pattern is deployed in production in regulated industries.
| Dimension | Score (1–5) | Rationale |
| Technology Readiness | 4 | All components production-ready; ACL metadata filter supported in major vector DBs |
| Tooling Ecosystem | 4 | Microsoft Presidio, AWS Comprehend, Google DLP are production PII detection services |
| Operational Guidance | 4 | ACL sync patterns and audit requirements are well-documented in cloud provider guidance |
| Security & Compliance | 5 | This pattern is the compliance pattern; directly addresses regulatory requirements |
| Scalability Evidence | 4 | ACL filter adds minimal overhead; PII redaction scales horizontally |
| Cost Predictability | 3 | PII detection and ACL sync add variable costs; calibration required per deployment |
18. Revision History
| Version | Date | Author | Changes |
| 1.0 | 2024-02-10 | EAAPL Working Group | Initial publication |
| 1.1 | 2024-05-15 | EAAPL Working Group | Namespace isolation added; multi-tenant patterns formalised |
| 1.2 | 2024-08-20 | EAAPL Working Group | PII redaction architecture expanded; session-scoped redaction mapping added |
| 1.3 | 2024-11-30 | EAAPL Working Group | GDPR Article 25 mapping; right to erasure pipeline added |
| 1.4 | 2025-04-01 | EAAPL Working Group | Output scanner formalised; OWASP LLM 2024 mitigations updated |