Proven

EAAPL-CMP004 — Privacy-Preserving AI Architecture

Name: EAAPL Pattern Library
Creator: Enterprise AI Architecture Pattern Library
License: https://aipatterns.com.au/terms

📋 Regulatory Compliance · 🏭 Field-tested in AU

Status: Proven
Tags: privacy-act gdpr pii-handling data-isolation high-complexity
Version: 1.4
Last Updated: 2026-06-12
Author: Enterprise AI Architecture Pattern Library

1. Executive Summary

AI systems that ingest, process, or generate personal information must comply with privacy obligations across multiple jurisdictions simultaneously. This pattern addresses the dual requirements of the Australian Privacy Principles (APPs) under the Privacy Act 1988 (Cth) and the EU General Data Protection Regulation (GDPR). Both frameworks impose obligations on data minimisation, purpose limitation, cross-border transfer restrictions, consent management, and individual rights. Each of these creates specific technical requirements when applied to AI inference pipelines, retrieval-augmented generation (RAG) systems, and AI training workflows.

The Privacy-Preserving AI Architecture provides a layered control framework: a PII detection and redaction pipeline that sanitises inputs before they reach the LLM; purpose limitation enforcement through technical controls on data flows; a consent management integration that gates AI processing on valid consent; data sovereignty controls that prevent regulated personal data from crossing prohibited geographic boundaries; and a machine unlearning capability to respond to erasure requests. Organisations implementing this pattern demonstrate technical compliance with Privacy Act APP 11 (security of personal information), GDPR Articles 5, 6, 17, 20, and 25, and build AI systems that earn customer trust through verifiable privacy practices.

2. Problem Statement

Business Problem

Organisations building AI products on top of personal data face mounting privacy enforcement risk. The Australian Information Commissioner has signalled enforcement focus on AI systems in the 2025–2026 period. European Data Protection Authorities issued EUR 4.5 billion in GDPR fines through 2025, with AI-related processing accounting for a growing share. Privacy-by-design is a legal obligation under GDPR Article 25, not an optional design choice.

Technical Problem

Default LLM and AI service configurations are not privacy-preserving. Cloud LLM providers may use submitted data for model improvement. PII is routinely present in business documents, customer support tickets, and employee records that are ingested as RAG context. Training datasets frequently contain PII that was not adequately screened. The legal basis for AI inference on personal data is frequently assumed (legitimate interest) without a genuine balancing test.

Symptoms

Customer PII (names, email addresses, account numbers) is sent directly to third-party LLM APIs without redaction
There is no documented lawful basis under GDPR Article 6 or APP 3 for the specific AI processing activity
Privacy Impact Assessments are not conducted before deploying AI features that process personal data
RAG vector stores contain embeddings derived from personal data with no retention policy or deletion capability
The organisation cannot respond to a data subject erasure request because it cannot identify all locations where AI systems have stored or derived representations of that individual's data

Cost of Inaction

Dimension	Consequence
Regulatory	GDPR fines up to EUR 20M or 4% of total worldwide annual turnover, whichever is higher (Article 83(5)); OAIC determinations
Legal	Class action risk; individual compensation claims under GDPR Article 82
Operational	Loss of ability to process EU personal data; suspension of AI services
Reputational	Public data breach notification; customer trust erosion; media coverage

3. Context

When to Apply

Any AI system that receives, processes, or outputs personal information as defined by the Privacy Act 1988 (information about an identified or reasonably identifiable individual) or the GDPR (any information relating to an identified or identifiable natural person)
AI systems processing sensitive information (health, financial, biometric, religious, political, sexual orientation) — highest risk tier requiring strongest controls
RAG pipelines that index internal documents containing personal information
AI training workflows using datasets that may contain personal information
Cross-border AI services where data flows from EU or Australia to third-country AI providers

When NOT to Apply

AI systems processing exclusively synthetic or fully anonymised data (k-anonymity + l-diversity + t-closeness with formal guarantee)
Internal AI tools used by authorised staff on data they are already authorised to access, where no third-party cloud AI API is involved
AI research on fully de-identified public datasets with no personal data re-identification risk

Prerequisites

Prerequisite	Description
Data Classification	An operational data classification scheme that identifies personal and sensitive personal information
Records of Processing	GDPR Article 30 records of processing activities; Australian APP 1 privacy policy
Legal Basis Assessment	Organisation has assessed lawful bases available for the AI processing activity
Consent Management Platform	Existing CMP or ability to deploy one if consent is the chosen lawful basis
Data Flow Mapping	Mapping of data flows from source through AI pipeline to output and storage

Industry Applicability

Industry	Risk Level	Key Privacy Obligations
Healthcare	Very High	Health information is sensitive personal data; strict APP 6 / GDPR Article 9 obligations
Financial Services	High	Financial information; creditworthiness; APP 11 / GDPR Article 6 legitimate interest balancing
Human Resources	High	Employee records; performance data; sensitive categories common
Education	High	Student data; minors' data (enhanced GDPR protections)
Retail / E-commerce	Medium	Purchase history; profile data; consent management critical for personalisation AI
Legal Services	High	Legal professional privilege intersects with AI processing obligations
Government	High	Privacy Act 1988 applies to agencies; FOI and Privacy Act intersection

4. Architecture Overview

The Privacy-Preserving AI Architecture implements privacy controls at each stage of the AI data lifecycle: input, inference, retrieval, output, and storage. The architecture is founded on Privacy by Design (GDPR Article 25) — controls are embedded in the system, not applied as an afterthought.

Stage 1: Data Minimisation at Input

Before any data reaches the AI inference engine, a data minimisation filter applies the principle of processing only what is strictly necessary for the declared purpose. For structured inputs, this means stripping fields that are not required for the AI task (e.g., if an AI system summarises customer support tickets, it does not need the customer's date of birth, so this field is removed). For unstructured text inputs, a PII detection and redaction pipeline identifies and pseudonymises or redacts personal identifiers. The minimisation configuration is driven by the purpose specification: each AI use case has a documented purpose, and the data minimisation filter enforces the minimum data set for that purpose.
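
The field-stripping behaviour for structured inputs can be sketched as below. This is a minimal illustration, not a reference implementation: the `PURPOSE_FIELDS` mapping, the `ticket_summarisation` purpose name, and the field names are all hypothetical, and a real purpose specification would live in governed configuration rather than in code.

```python
from typing import Any

# Hypothetical purpose specification: each AI use case lists the only
# fields it is allowed to receive. All names here are illustrative.
PURPOSE_FIELDS: dict[str, frozenset[str]] = {
    "ticket_summarisation": frozenset({"ticket_id", "subject", "body"}),
}


def minimise(record: dict[str, Any], purpose: str) -> dict[str, Any]:
    """Return a copy of `record` holding only fields allowed for `purpose`."""
    try:
        allowed = PURPOSE_FIELDS[purpose]
    except KeyError:
        # Fail closed: an undeclared purpose receives no data at all.
        raise PermissionError(f"no purpose specification for {purpose!r}") from None
    return {key: value for key, value in record.items() if key in allowed}


ticket = {
    "ticket_id": "T-1001",
    "subject": "Login problem",
    "body": "I cannot reset my password.",
    "date_of_birth": "1990-04-12",
    "customer_email": "jane@example.com",
}
minimised = minimise(ticket, "ticket_summarisation")
```

An allow-list is used rather than a deny-list so that a newly added source field is excluded by default until someone documents why the purpose needs it.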

Stage 2: PII Detection and Redaction Pipeline

The core privacy technical control is a PII detection and redaction pipeline that processes all text inputs destined for LLM inference. The pipeline operates in two modes: redaction mode (PII replaced with a token such as [PERSON_NAME] or [EMAIL_ADDRESS]) for use cases where the AI does not need the actual values; and pseudonymisation mode (PII replaced with a reversible, keyed token) for use cases where the actual value is needed downstream but should not be sent to the LLM. Pseudonymisation uses a per-context encryption key managed in a secrets manager; the mapping between pseudonym and real value is stored in a secure lookup store that is never exposed to the AI model. The pipeline supports Australian and EU PII categories: names, addresses, dates of birth, email addresses, phone numbers, government identifiers (TFN, Medicare, ABN, passport), financial account numbers, health information, and biometric descriptors.
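
The two modes can be illustrated with a small standard-library sketch. Several things here are assumptions made for the example: the two regular expressions stand in for a real detector (an NER model or a tool such as Presidio would cover far more categories), the token format is invented, the key is hard-coded instead of fetched from a secrets manager, and a plain dictionary stands in for the secure lookup store. The keyed token is derived with HMAC, so reversal happens only through the stored mapping.

```python
import hashlib
import hmac
import re

# Illustrative patterns only: an email address and an Australian mobile number.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE_NUMBER": re.compile(r"\b04\d{2} ?\d{3} ?\d{3}\b"),
}


def redact(text: str) -> str:
    """Redaction mode: replace each match with a fixed category token."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def pseudonymise(text: str, key: bytes, store: dict[str, str]) -> str:
    """Pseudonymisation mode: replace each match with a keyed token and
    record token -> real value in `store` (never sent to the model)."""
    def make_sub(label: str):
        def _sub(match: re.Match) -> str:
            value = match.group()
            digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]
            token = f"<{label}_{digest}>"
            store[token] = value
            return token
        return _sub

    for label, pattern in PATTERNS.items():
        text = pattern.sub(make_sub(label), text)
    return text


def restore(text: str, store: dict[str, str]) -> str:
    """Map tokens back to real values for an authorised consumer."""
    for token, value in store.items():
        text = text.replace(token, value)
    return text


sample = "Contact jane@example.com or 0412 345 678."
context_key = b"demo-key-from-secrets-manager"  # assumption: fetched per context
lookup_store: dict[str, str] = {}
masked = pseudonymise(sample, context_key, lookup_store)
```

Because the token is deterministic for a given key, the same person maps to the same pseudonym within one context, which preserves continuity for the model without exposing the value.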

Stage 3: Purpose Limitation Technical Controls

Purpose limitation (APP 6; GDPR Article 5(1)(b)) requires that data collected for one purpose is not used for a different, incompatible purpose. In AI systems, the risk is that data collected for customer service is used to train a marketing personalisation model. Technical purpose limitation controls include: purpose tags on all data assets, enforced by a policy engine that blocks AI workflows from accessing data tagged for incompatible purposes; data access audit logs that record which AI workflow consumed which data; and change management gates that require re-assessment of lawful basis and consent when an AI workflow's purpose changes.
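
A purpose-tag check with an access audit log might look like the following sketch. The compatibility matrix, class names, and workflow names are hypothetical; in practice the decision would usually be delegated to a policy engine such as OPA, and compatibility would come from a documented legal assessment rather than a hard-coded table.

```python
from dataclasses import dataclass, field

# Hypothetical compatibility matrix: for each purpose tag on a data asset,
# the workflow purposes that are allowed to consume it.
COMPATIBLE_PURPOSES: dict[str, frozenset[str]] = {
    "customer_service": frozenset({"customer_service"}),
    "marketing": frozenset({"marketing"}),
}


@dataclass(frozen=True)
class DataAsset:
    asset_id: str
    purpose_tag: str


@dataclass
class PurposePolicyEngine:
    audit_log: list[dict[str, str]] = field(default_factory=list)

    def authorise(self, workflow: str, workflow_purpose: str, asset: DataAsset) -> bool:
        """Permit access only when the workflow purpose is compatible with
        the asset's purpose tag; record every decision."""
        permitted = workflow_purpose in COMPATIBLE_PURPOSES.get(
            asset.purpose_tag, frozenset()
        )
        self.audit_log.append({
            "workflow": workflow,
            "workflow_purpose": workflow_purpose,
            "asset": asset.asset_id,
            "decision": "permit" if permitted else "deny",
        })
        return permitted


engine = PurposePolicyEngine()
tickets = DataAsset("support-tickets-2026", "customer_service")
summary_ok = engine.authorise("ticket-summariser", "customer_service", tickets)
training_ok = engine.authorise("marketing-model-training", "marketing", tickets)
```

An asset with an unknown tag is compatible with nothing, so untagged data is blocked rather than silently allowed.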

Stage 4: Consent Management Integration

Where consent is the lawful basis for AI processing (GDPR Article 6(1)(a); APP 3.3), the AI processing pipeline must check consent status before processing. The consent management integration queries the consent management platform (CMP) at the start of each processing job or inference request. If valid consent does not exist for the specific processing activity, the request is blocked and the user is prompted to provide consent. Consent must be granular: separate consent for different AI processing activities (personalisation vs. model training vs. analytics). Consent withdrawal must stop any further processing that relies on that consent. Withdrawal does not by itself trigger the deletion work of an erasure request (Stage 6), but it must be honoured without undue delay, and withdrawing must be as easy as giving consent (GDPR Article 7(3)).
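
The gate can be sketched as a function that takes a consent lookup callable and fails closed. The `fake_cmp` lookup below is a stand-in invented for the example; real CMP APIs differ by vendor and none is modelled here. The activity names and the `ConsentBlocked` exception are likewise hypothetical.

```python
from enum import Enum
from typing import Callable


class ConsentStatus(Enum):
    VALID = "valid"
    INVALID = "invalid"
    NOT_REQUIRED = "not-required"


class ConsentBlocked(Exception):
    """Raised when processing must not proceed."""


# (subject_id, activity) -> ConsentStatus
ConsentLookup = Callable[[str, str], ConsentStatus]


def consent_gate(subject_id: str, activity: str, lookup: ConsentLookup) -> ConsentStatus:
    """Check consent for one specific activity before processing."""
    try:
        status = lookup(subject_id, activity)
    except Exception as exc:
        # Fail closed: if the CMP cannot be reached, do not process.
        raise ConsentBlocked("consent could not be verified") from exc
    if status is ConsentStatus.INVALID:
        raise ConsentBlocked(f"no valid consent for {activity!r}")
    return status


# Stand-in for a CMP: consent is recorded per subject and per activity.
consents = {("subject-42", "personalisation")}


def fake_cmp(subject_id: str, activity: str) -> ConsentStatus:
    granted = (subject_id, activity) in consents
    return ConsentStatus.VALID if granted else ConsentStatus.INVALID


status = consent_gate("subject-42", "personalisation", fake_cmp)
```

Keying consent on the pair of subject and activity is what makes it granular: consent to personalisation says nothing about model training, so a request for `model_training` on the same subject is blocked.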

Stage 5: Cross-Border Transfer Controls

GDPR Chapter V (Articles 44 to 49) restricts transfers of personal data to third countries that lack adequate protection. Australia's APP 8 imposes a similar obligation on cross-border disclosure. Cloud LLM API calls that send EU or Australian personal data to a provider operating outside those jurisdictions require a valid transfer mechanism: under the GDPR, an adequacy decision (Article 45), Standard Contractual Clauses (SCCs, Article 46), or Binding Corporate Rules (Article 47); under the Privacy Act, the APP 8.1 accountability mechanism. The cross-border transfer control layer checks each outbound API call against the destination's jurisdiction status and the data's classification. If the data is subject to transfer restrictions and no valid mechanism exists, the call is blocked and a local model fallback is invoked.
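
The routing decision can be sketched as follows. The model is deliberately simplified and rests on assumptions: jurisdictions are compared as plain codes, a recorded `transfer_mechanism` string is treated as sufficient evidence that a mechanism exists, and the endpoint names are invented. A real check would consult a maintained register of mechanisms and adequacy decisions.

```python
from dataclasses import dataclass
from typing import Optional


class TransferBlocked(Exception):
    """Raised when no lawful inference endpoint is available."""


@dataclass(frozen=True)
class Endpoint:
    name: str
    jurisdiction: str                          # e.g. "AU", "EU", "US"
    transfer_mechanism: Optional[str] = None   # e.g. "SCC"; None if absent


def route(
    data_origin: str,
    restricted: bool,
    preferred: Endpoint,
    local_fallback: Optional[Endpoint] = None,
) -> Endpoint:
    """Choose the inference endpoint for one outbound call."""
    if not restricted:
        return preferred                       # data carries no transfer restriction
    if preferred.jurisdiction == data_origin:
        return preferred                       # no cross-border transfer occurs
    if preferred.transfer_mechanism is not None:
        return preferred                       # transfer covered by a mechanism
    if local_fallback is not None and local_fallback.jurisdiction == data_origin:
        return local_fallback                  # blocked: fall back to local model
    raise TransferBlocked(
        f"{preferred.name}: no transfer mechanism and no local fallback"
    )


cloud_llm = Endpoint("cloud-llm", "US")
cloud_llm_with_scc = Endpoint("cloud-llm", "US", transfer_mechanism="SCC")
local_model = Endpoint("sovereign-vllm", "AU")

chosen = route("AU", restricted=True, preferred=cloud_llm, local_fallback=local_model)
```

When neither a mechanism nor a local fallback exists, the function raises instead of returning a default, which corresponds to the "Cross-Border Block with No Local Fallback" case in the error flow.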

Stage 6: Machine Unlearning for Erasure Requests

GDPR Article 17 (right to erasure) creates an obligation to delete personal information on request. In Australia, APP 13 covers correction on request, and APP 11.2 requires destruction or de-identification of personal information that is no longer needed. For AI systems, erasure is technically complex because personal data may be encoded in model weights through training, in embedding vectors in vector stores, and in inference logs. The machine unlearning layer provides: immediate deletion from inference logs (database delete + audit record); deletion from RAG vector stores (identify all vectors derived from the data subject's documents; delete by document ID); for model weights, re-training or fine-tuning on a dataset with the individual's data removed (computationally expensive; may be deferred for large foundation models with documented justification); and an erasure request tracking system that records the status of each sub-task.
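
The sub-tasks above can be sketched with in-memory stand-ins. Everything concrete here is an assumption for illustration: the vector index is a dictionary whose metadata carries `doc_id` and `subject_id`, the inference log is a list, and the status strings are invented. Real vector stores expose their own delete-by-filter or delete-by-ID operations, which are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ErasureRequest:
    subject_id: str
    subtasks: dict[str, str] = field(default_factory=dict)


def erase_subject(
    subject_id: str,
    vector_index: dict[str, dict[str, str]],   # vector_id -> {"doc_id", "subject_id"}
    inference_logs: list[dict[str, str]],
    in_training_data: bool,
) -> ErasureRequest:
    """Run the erasure sub-tasks and record the status of each."""
    request = ErasureRequest(subject_id)

    # 1. Inference logs: delete the subject's entries in place.
    inference_logs[:] = [e for e in inference_logs if e["subject_id"] != subject_id]
    request.subtasks["inference_logs"] = "done"

    # 2. Vector store: find the subject's documents, then delete every
    #    vector derived from those documents (delete by document ID).
    doc_ids = {m["doc_id"] for m in vector_index.values() if m["subject_id"] == subject_id}
    for vector_id in [v for v, m in vector_index.items() if m["doc_id"] in doc_ids]:
        del vector_index[vector_id]

    # 3. Post-erasure verification scan.
    residual = any(m["subject_id"] == subject_id for m in vector_index.values())
    request.subtasks["vector_store"] = "verification-failed" if residual else "done"

    # 4. Model weights: re-training is deferred and tracked, not done inline.
    request.subtasks["model_weights"] = "deferred" if in_training_data else "not-applicable"
    return request


index = {
    "v1": {"doc_id": "d1", "subject_id": "s1"},
    "v2": {"doc_id": "d1", "subject_id": "s1"},
    "v3": {"doc_id": "d2", "subject_id": "s2"},
}
logs = [{"subject_id": "s1", "event": "inference"}, {"subject_id": "s2", "event": "inference"}]
result = erase_subject("s1", index, logs, in_training_data=True)
```

The sketch only works because each vector keeps a link back to its source document and data subject; without that mapping the deletion in step 2 has nothing to query.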

5. Architecture Diagram

flowchart TD
    subgraph Input["Input Controls"]
        A[Personal Data Source]
        B{Consent and Purpose Gate}
        C[PII Redaction Pipeline]
    end
    subgraph Inference["Inference Layer"]
        D{Cross-Border Check}
        E[Local Sovereign Model]
        F[Third-Party LLM API]
    end
    subgraph Output["Output and Rights"]
        G[Output PII Filter]
        H[Data Subject Rights Portal]
    end
    A --> B
    B -->|blocked| A
    B -->|permitted| C
    C --> D
    D -->|transfer blocked| E
    D -->|transfer permitted| F
    E --> G
    F --> G
    G --> H
    H -->|erasure or restrict| B
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f3e8ff,stroke:#a855f7
    style C fill:#f0fdf4,stroke:#22c55e
    style D fill:#f3e8ff,stroke:#a855f7
    style E fill:#d1fae5,stroke:#10b981
    style F fill:#fef9c3,stroke:#eab308
    style G fill:#f0fdf4,stroke:#22c55e
    style H fill:#d1fae5,stroke:#10b981

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Purpose Limitation Engine	Policy	Enforce data purpose tags; block cross-purpose AI data flows	OPA, custom policy engine, AWS Verified Permissions	High
Consent Management Platform Integration	Policy	Gate AI processing on valid, granular consent; honour withdrawals	OneTrust, Cookiebot, TrustArc, Didomi	High
Data Minimisation Filter	Processing	Strip non-required fields from structured inputs before AI pipeline	Custom Lambda/Function; Apache NiFi; dbt	High
PII Detector	Processing	Named entity recognition for personal identifiers (multi-jurisdiction)	Amazon Comprehend, Microsoft Purview DLP, spaCy, Presidio	Critical
Redaction Engine	Processing	Replace/pseudonymise PII in text; manage pseudonym mappings	Microsoft Presidio, custom, Amazon Macie	Critical
Pseudonym Key Store	Storage	Encrypted mapping of pseudonyms to real values; per-context keys	AWS Secrets Manager + DynamoDB, HashiCorp Vault + PostgreSQL	Critical
Cross-Border Transfer Check	Policy	Assess destination jurisdiction vs data classification; block if restricted	Custom policy engine; cloud-native geo-restriction	High
Local Model Runtime	Inference	Execute inference on restricted data without cross-border transfer	vLLM, Triton, Ollama (sovereign cloud deployment)	High
Output PII Filter	Processing	Detect and redact PII in AI-generated responses before delivery	Amazon Comprehend, Azure AI Content Safety, Microsoft Presidio	High
Vector Store with Delete Capability	Storage	Store embeddings; support per-document deletion for erasure requests	Pinecone, Weaviate, pgvector, Chroma	High
Machine Unlearning Scheduler	Operations	Track erasure obligations on model weights; schedule re-training	Custom workflow, MLflow, SageMaker Pipelines	Medium
Privacy Impact Assessment Trigger	Governance	Evaluate new AI projects against PIA criteria; trigger PIA if required	Custom intake form, OneTrust PIA module	High
Erasure Request Tracker	Operations	Track erasure request status across all AI data stores	Custom database, ServiceNow, Jira	Critical

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	AI Application	Receive user input with data subject personal information	Raw input containing potential PII
2	Purpose Limitation Engine	Check input data's purpose tags against AI workflow's declared purpose	Permitted or denied
3	Consent Management Integration	Query CMP for valid consent for this specific AI processing activity	Consent status: valid / invalid / not-required
4	Data Minimisation Filter	Remove fields not required for the declared AI purpose	Minimised input
5	PII Detector	Scan minimised input for personal identifiers across all supported PII categories	Annotated text with PII spans identified
6	Redaction Engine	Replace identified PII with redaction tokens or pseudonyms; record pseudonym map	Sanitised input text safe for AI processing
7	Cross-Border Transfer Check	Assess data residency requirement vs intended inference endpoint	Route to local model or third-party API
8	AI Inference Engine	Execute inference on sanitised input	AI response (may contain pseudonyms or redacted tokens)
9	Output PII Filter	Scan AI response for PII that may have been generated or leaked	Clean AI response
10	Pseudonym Restore (if authorised)	Map pseudonyms back to real values for authorised consumers	Full response for authorised downstream systems
11	Inference Log	Record pseudonymised audit record of the interaction	Immutable audit entry with no real PII
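
The ordering of the primary flow can be sketched as a list of named steps, where any step may block the request. Only three steps are shown and all are stand-ins invented for the example: the consent check reads a flag from the request, the redaction uses one crude regular expression, and `infer` echoes its input instead of calling a model.

```python
import re
from typing import Callable

Step = Callable[[dict], dict]


def run_pipeline(request: dict, steps: list[tuple[str, Step]]) -> dict:
    """Run steps in order; a PermissionError from any step blocks the
    request before later steps (including inference) are reached."""
    completed: list[str] = []
    for name, step in steps:
        try:
            request = step(request)
        except PermissionError as exc:
            return {"status": "blocked", "blocked_by": name,
                    "reason": str(exc), "completed": completed}
        completed.append(name)
    return {"status": "ok", "request": request, "completed": completed}


def consent_check(req: dict) -> dict:
    if not req.get("consent"):
        raise PermissionError("no valid consent")
    return req


def redact(req: dict) -> dict:
    # Crude stand-in for the PII Detector and Redaction Engine.
    return {**req, "text": re.sub(r"\S+@\S+", "[EMAIL_ADDRESS]", req["text"])}


def infer(req: dict) -> dict:
    # Stand-in for the model call; it only ever sees sanitised text.
    return {**req, "response": f"Summary of: {req['text']}"}


STEPS = [("consent", consent_check), ("redaction", redact), ("inference", infer)]

ok = run_pipeline({"text": "Email bob@example.com about refund", "consent": True}, STEPS)
blocked = run_pipeline({"text": "Email bob@example.com", "consent": False}, STEPS)
```

Placing the gates ahead of inference in a single ordered list makes it easy to see, and to test, that nothing reaches the model unless every earlier control has passed.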

Error Flow

Failure	Description	Detection	Recovery
PII Detection Miss	PII slips through detector into LLM API call	Output audit; third-party API terms monitoring	Log incident; improve detector; assess whether third-party retained data; notify DPA if GDPR Article 33 threshold met
Cross-Border Block with No Local Fallback	Transfer blocked; no local model available	Block event logged; request fails	Graceful error returned to user; alert operations; provision local model
Consent Withdrawal Not Honoured	Withdrawn consent subject continues to have data processed	Compliance audit; data subject complaint	Immediately halt processing; purge from processing queue; erasure assessment
Erasure Missed in Vector Store	Vectors derived from deleted subject remain in RAG index	Post-erasure verification scan	Delete missed vectors; update erasure request record; document as near-miss
Pseudonym Map Key Loss	Key store failure makes pseudonyms irreversible	Key store availability monitoring	Restore from HSM-backed key backup; assess whether any requests require restoration

8. Security Considerations

Privacy and Security Controls

Domain	Control	Implementation	Privacy Obligation
Authentication	PII pipeline components require service authentication; no anonymous access to pseudonym maps	IAM service accounts; mutual TLS	APP 11; GDPR Article 32
Authorisation	Only authorised consumers can invoke pseudonym restore; access logged	RBAC on pseudonym restore API; audit log	APP 11; GDPR Article 32
Secrets	Pseudonymisation keys stored in HSM-backed secrets manager; rotated annually	AWS KMS, Azure Key Vault, HashiCorp Vault	APP 11
Classification	All AI assets containing personal data classified; controls applied per classification	Data classification policy; automated tagging	APP 11; GDPR Article 32
Encryption	All personal data encrypted in transit (TLS 1.3) and at rest (AES-256 CMEK)	Cloud-native encryption with CMEK	APP 11; GDPR Article 32
Auditability	All PII processing events logged immutably; erasure actions logged	Immutable audit store; retention per regulation	APP 11; GDPR Article 5(2) accountability

OWASP LLM Top 10 — Privacy Mapping

OWASP LLM Risk	Privacy Impact	Control	Privacy Law Reference
LLM01 Prompt Injection	Adversary extracts personal data from AI context window via injection	Input sanitisation; context window size limits; PII redaction before context insertion	APP 11; GDPR Article 32
LLM02 Insecure Output Handling	AI response contains personal data passed to unauthorised downstream systems	Output PII filter; authorisation on pseudonym restore; schema validation	APP 11; GDPR Article 32
LLM03 Training Data Poisoning	Malicious PII injection into training data creates privacy risk in model	Training data governance; PII screening of training corpus; access control on training pipelines	APP 10; GDPR Article 5(1)(f)
LLM04 Model Denial of Service	Availability failure prevents erasure requests being processed in time	High availability for erasure request processing; SLA on erasure fulfilment	APP 11.2; GDPR Article 17
LLM05 Supply Chain Vulnerabilities	Third-party AI components process personal data without adequate controls	Vendor DPA; cross-border transfer mechanism; supply chain privacy assessment	APP 8; GDPR Article 28
LLM06 Sensitive Information Disclosure	AI reveals personal data from training set or context window	Output PII filter; differential privacy in training; training data audit	APP 11; GDPR Article 9 (special categories)
LLM07 Insecure Plugin Design	Tool calls made by AI agent exfiltrate personal data to unauthorised systems	Tool call whitelisting; data egress controls; purpose limitation on tool outputs	APP 6 (use/disclosure); GDPR Article 5(1)(b) purpose limitation
LLM08 Excessive Agency	Autonomous AI agent processes or shares personal data beyond authorised scope	Scope limits on agent data access; human approval for data-sharing actions	APP 6; GDPR Article 22
LLM09 Overreliance	AI-generated personal data inferences treated as fact; errors in personal records	Confidence scoring; human review for decisions affecting individuals	APP 13 (correction); GDPR Article 16 (rectification)
LLM10 Model Theft	Stolen model may allow extraction of training data containing personal data	Model access control; model weight encryption; training data minimisation	APP 11; GDPR Article 32

9. Governance Considerations

Privacy Governance

Domain	Requirement	Owner	Frequency
Privacy Impact Assessment	Conduct PIA for AI projects meeting trigger criteria (new technology, large-scale processing, sensitive data)	Privacy Officer / DPO	At project inception; re-assess on material change
Records of Processing	GDPR Article 30 records updated for each AI processing activity; APP 1 privacy policy updated	Privacy Officer	On each new AI deployment
Lawful Basis Documentation	Document and review lawful basis for each AI processing activity	Legal / Privacy	Annual review; on change of purpose
Data Subject Rights SLA	Erasure requests fulfilled within one month of receipt (GDPR Article 12(3)) / reasonable time (APP); access requests within 30 days	Operations + Privacy	Per-request tracking
De-identification Assessment	Re-identification risk assessment for data treated as anonymous or de-identified	Privacy Officer	Annual; on technique change
Third-Party Privacy Assessment	Vendor DPA and transfer mechanism in place before AI vendor processes personal data	Procurement + Privacy	On vendor onboarding; annual review

Governance Artefacts

Artefact	Description	Retention
Privacy Impact Assessments	Documented PIA for each AI system processing personal data	Lifetime of system + 7 years
Records of Processing Activities	GDPR Article 30 register of all AI processing activities	Ongoing; current state + 7 years history
Consent Records	Evidence of valid consent obtained for consent-based AI processing	7 years after consent expires or is withdrawn
Erasure Request Log	Full audit trail of each erasure request, sub-tasks completed, and residual limitations	7 years
Cross-Border Transfer Records	DPAs, SCCs, adequacy decisions for each third-country AI provider	Duration of transfer + 7 years
PII Redaction Audit Logs	Log of all PII detections and redactions applied in the AI pipeline	3 years (operational)

10. Operational Considerations

Monitoring and SLOs

SLO	Target	Measurement	Breach Action
PII Redaction Coverage	>99.5% of PII spans detected before LLM call	Audit sample: 1% of inputs re-scanned post-redaction	Alert privacy engineering; model retrain if detection rate drops
Erasure Request Fulfilment	100% within 25 days (buffer before the GDPR one-month limit)	Days from receipt to confirmed completion	Escalate to Privacy Officer; daily tracking
Consent Check Latency	<50ms added to inference pipeline	P99 latency of consent check API call	Investigate CMP performance; implement caching
Cross-Border Block Rate	0 unintended cross-border transfers detected	Egress monitoring; transfer policy check log	Immediate investigation; potential DPA notification
Purpose Limitation Violations	0 per month	Policy engine block events classified as violations	Root cause analysis; architecture review
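
Two of these SLOs reduce to simple arithmetic, sketched below. The function names and the sample counts are invented for illustration; the 99.5% and 25-day thresholds are the targets from the table.

```python
from datetime import date


def redaction_coverage(detected_spans: int, missed_spans: int) -> float:
    """Share of PII spans caught before the LLM call, from an audit sample.
    `missed_spans` are spans found only by the post-redaction re-scan."""
    total = detected_spans + missed_spans
    return 1.0 if total == 0 else detected_spans / total


def coverage_meets_slo(coverage: float, target: float = 0.995) -> bool:
    return coverage > target


def erasure_status(received: date, today: date, completed: bool,
                   target_days: int = 25) -> str:
    """Classify an erasure request against the internal 25-day target."""
    if completed:
        return "completed"
    days_open = (today - received).days
    return "breach" if days_open > target_days else "on-track"


sample_coverage = redaction_coverage(detected_spans=9960, missed_spans=40)
open_request = erasure_status(date(2026, 1, 1), date(2026, 1, 20), completed=False)
late_request = erasure_status(date(2026, 1, 1), date(2026, 1, 27), completed=False)
```

With these sample figures the coverage is 9960 / 10000 = 0.996, which meets the target, while 9940 detected and 60 missed would give 0.994 and trigger the breach action.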

Disaster Recovery

Scenario	Privacy Impact	Recovery
PII Detector Outage	Unredacted PII may be sent to AI models during outage	Halt AI inference pipeline; restore detector; assess any exposures during outage
Pseudonym Key Store Failure	Cannot restore pseudonyms for authorised consumers	Restore from HSM backup; no data loss if backup current; assess impact on in-flight requests
Consent Management Platform Unavailability	Cannot verify consent for new processing requests	Fail closed: block AI processing that requires consent verification until CMP restored
Vector Store Corruption	Cannot fulfil erasure requests until store restored	Restore from backup; re-run any erasure requests that were in-flight during corruption

11. Cost Considerations

Cost Drivers

Cost Driver	Indicative Cost	Notes
PII Detection Platform	USD 3,000–15,000/month	Scales with text volume; cloud-native services have per-character pricing
Pseudonymisation Infrastructure	USD 1,000–5,000/month	Key management + lookup store; near-flat scaling
Consent Management Platform	USD 2,000–20,000/month	Scales with site traffic and consent complexity
Local Model Hosting (for sovereign inference)	USD 10,000–50,000/month	GPU compute; varies significantly by model size and throughput
Erasure Request Processing	USD 500–3,000/month	Engineering time + automated tooling; spikes with high erasure volume
Privacy Engineering FTE	USD 180,000–350,000/year	1–2 FTE with privacy engineering + AI experience
Privacy Impact Assessments	USD 10,000–30,000 per major AI project	Legal + privacy officer time; external review

Indicative Cost Range

Organisation	Annual Privacy-Preserving AI Cost	Notes
Small (1–3 AI products, low PII volume)	USD 200,000–500,000	Cloud-native tooling; no local model required
Mid-size (4–10 AI products, medium PII)	USD 700,000–2,000,000	Local model for sovereign inference; dedicated privacy engineering
Large (10+ AI products, high PII volume, multi-jurisdiction)	USD 3,000,000–10,000,000	Enterprise CMP; full PII pipeline at scale; machine unlearning programme

12. Trade-Off Analysis

Architecture Options

Option	Description	Pros	Cons	Recommended For
Option A: Redact Before Cloud LLM	Send only redacted/pseudonymised text to cloud LLM APIs	Leverages best LLM capability; lower local compute cost	Redaction reduces AI response quality for name-dependent tasks; pseudonym complexity	Most use cases where AI does not need actual PII values
Option B: Sovereign Local Model Only	All inference on local/sovereign cloud models; no PII leaves jurisdiction	Maximum data control; no cross-border transfer concerns	Higher compute cost; local models may have lower capability	Highly regulated industries; sovereign data requirements; government
Option C: Federated Inference	Model is federated to user devices; no personal data centralised	PII never leaves user device	Very limited model capability; complex deployment; not suitable for server-side AI	Consumer mobile AI on highly sensitive health or financial data

Architectural Tensions

Tension	Trade-Off	Resolution
Redaction Completeness vs AI Response Quality	Aggressive redaction improves privacy but degrades AI output relevance	Use pseudonymisation (not full redaction) where AI needs semantic continuity; restore only at authorised output
Purpose Limitation vs AI Model Training	Using production user data to improve AI models serves product interest but may violate original collection purpose	Explicit consent for training use; or use synthetic data generated from production patterns
Machine Unlearning Completeness vs Operational Feasibility	Full unlearning from model weights is computationally expensive	Tiered approach: immediate erasure from logs/vectors (automated); model weight unlearning deferred and batched; documented justification under GDPR Recital 65
Consent Granularity vs User Experience	Highly granular consent improves legal basis but creates consent fatigue	Use layered consent with clear explanations; default to most privacy-protective option

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
PII Sent to Third-Party LLM API Unredacted	Medium	High: potential GDPR Article 33 breach notification obligation	Output audit; third-party DPA breach notification	Invoke breach response plan; assess exposure; notify DPA if threshold met (72 hours)
Erasure Request Missed in Vector Store	High	Medium: GDPR Article 17 breach; DPA/OAIC investigation	Post-erasure verification scan	Delete missed vectors; document near-miss; improve verification process
Purpose Creep: AI Model Trained on Consent-Gated Data	Medium	High: consent basis undermined; enforcement action	Data governance audit; purpose tag mismatch alert	Remove contaminated training data; re-train without; notify affected data subjects
Pseudonym Map Loss	Low	High: cannot restore data for authorised consumers; erasure verification impaired	Key store health monitoring	Restore from HSM backup; incident report
Cross-Border Transfer Without Mechanism	Medium	Critical: GDPR Chapter V breach; transfer suspension	Egress monitoring; legal review	Halt transfer; put SCCs or another valid mechanism in place before resuming; assess whether DPA notification required

Cascading Failure Scenario

A new AI feature is deployed with RAG indexing of customer emails. The PIA was skipped because the product team deemed the feature "low risk". Customer emails contain health information (special category data under GDPR Article 9). The RAG pipeline embeds this health information into the vector store, and retrieved email passages are sent as context to a US-based LLM API with no Standard Contractual Clauses or other valid transfer mechanism in place, so each call is a cross-border transfer without a lawful basis for transfer. A data subject files an erasure request; the organisation cannot identify the vectors derived from the subject's emails because the vector store has no per-document deletion capability and no document-to-vector mapping. The DPA investigation finds three simultaneous breaches: special category processing without explicit consent (Article 9), cross-border transfer without a valid mechanism (Chapter V, Article 46), and inability to fulfil the erasure right (Article 17). A fine is issued and the AI service is suspended.

14. Regulatory Considerations

Regulation	Specific Obligation	Architectural Control	Reference
Privacy Act 1988 APP 3	Collection of personal information only for lawful, notified purpose	Lawful basis assessment; purpose documentation before AI deployment	APP 3.1, APP 3.3
Privacy Act 1988 APP 6	Use or disclosure only for primary purpose or compatible secondary purpose	Purpose Limitation Engine; purpose tags on data assets	APP 6.1, APP 6.2
Privacy Act 1988 APP 8	Cross-border disclosure must ensure equivalent privacy protection	Cross-Border Transfer Check; DPA + accountability mechanism	APP 8.1
Privacy Act 1988 APP 11	Reasonable steps to protect personal information from misuse, interference, loss, unauthorised access	PII redaction pipeline; encryption; access controls	APP 11.1
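Privacy Act 1988 APP 13 (with APP 11.2)	Correct personal information on request (APP 13); destroy or de-identify personal information that is no longer needed (APP 11.2)	Erasure Request Tracker; vector store delete; machine unlearning	APP 13.1; APP 11.2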
GDPR Article 5	Data minimisation, purpose limitation, storage limitation, integrity and confidentiality	Data Minimisation Filter; Purpose Limitation Engine; retention policies	GDPR Article 5(1)(b)(c)(e)(f)
GDPR Article 6	Lawful basis for processing personal data	Lawful basis documentation; consent check integration	GDPR Article 6(1)
GDPR Article 17	Right to erasure ('right to be forgotten')	Erasure Request Tracker; vector delete; log delete; machine unlearning	GDPR Article 17
GDPR Article 20	Right to data portability	Portability export module for AI-processed personal data	GDPR Article 20
GDPR Article 25	Privacy by Design and by Default	Architecture is inherently privacy-preserving; default is most restrictive	GDPR Article 25
GDPR Article 35	Data Protection Impact Assessment for high-risk processing	PIA Trigger engine; DPIA process integration	GDPR Article 35(3)

15. Reference Implementations

AWS

Component	AWS Service
PII Detection	Amazon Comprehend (custom entity recogniser for Australian PII)
Redaction Pipeline	AWS Lambda + Comprehend PII entity detection
Pseudonym Key Store	AWS Secrets Manager (keys) + DynamoDB Encrypted (mappings)
Cross-Border Transfer Control	VPC endpoint geo-restriction + SCPs preventing data to non-approved regions
Local Inference	Amazon Bedrock (Sydney region) or EC2 G-series with vLLM
Vector Store with Delete	Amazon OpenSearch Service (with per-document delete)
Consent Integration	AWS Lambda → CMP API; cached in ElastiCache
Erasure Tracker	DynamoDB + Step Functions orchestration

Azure

Component	Azure Service
PII Detection	Azure AI Language PII detection + custom recognisers
Redaction Pipeline	Azure Function + AI Language SDK
Pseudonym Key Store	Azure Key Vault (keys) + Azure Cosmos DB Encrypted (mappings)
Cross-Border Transfer Control	Azure Policy geo-restriction + Private Link
Local Inference	Azure OpenAI (Australia East) or Azure Machine Learning in an Australian region
Vector Store with Delete	Azure AI Search (with per-document delete) or pgvector
Consent Integration	Azure Function → OneTrust/TrustArc API
Erasure Tracker	Azure Cosmos DB + Azure Logic Apps

GCP

Component	GCP Service
PII Detection	Cloud DLP (Data Loss Prevention) API with AU infotypes
Redaction Pipeline	Cloud Functions + Cloud DLP
Pseudonym Key Store	Secret Manager (keys) + Firestore Encrypted (mappings)
Cross-Border Transfer Control	VPC Service Controls + Organisation Policy for resource location
Local Inference	Vertex AI (Australia Southeast region)
Vector Store with Delete	Vertex AI Vector Search or Weaviate on GKE
Consent Integration	Cloud Functions → CMP API; Memorystore cache
Erasure Tracker	Firestore + Cloud Workflows
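
The Consent Integration rows in the AWS, GCP, and on-premises tables pair a CMP API call with a cache (ElastiCache, Memorystore, Redis). A minimal Python sketch of that arrangement follows, assuming a hypothetical `fetch_consent(subject_id, purpose)` callable in place of a real CMP client and an in-process dictionary in place of the managed cache. It fails closed when the CMP cannot be reached, which is one reasonable reading of gating AI processing on valid consent.

```python
import time
from typing import Callable, Dict, Tuple


class ConsentGate:
    """Checks consent per (subject, purpose), caching decisions for a short TTL."""

    def __init__(
        self,
        fetch_consent: Callable[[str, str], bool],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_consent
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def is_permitted(self, subject_id: str, purpose: str) -> bool:
        key = (subject_id, purpose)
        now = self._clock()
        cached = self._cache.get(key)
        if cached is not None and now < cached[1]:
            return cached[0]
        try:
            granted = bool(self._fetch(subject_id, purpose))
        except Exception:
            # Fail closed: if the CMP is unreachable, do not process.
            return False
        self._cache[key] = (granted, now + self._ttl)
        return granted

    def invalidate(self, subject_id: str) -> None:
        """Drop cached decisions for a subject, e.g. on a consent-withdrawal event."""
        for key in [k for k in self._cache if k[0] == subject_id]:
            del self._cache[key]


# Hypothetical CMP stub: consent exists only for the "support_chat" purpose.
calls = []


def fake_cmp(subject_id: str, purpose: str) -> bool:
    calls.append((subject_id, purpose))
    return purpose == "support_chat"


gate = ConsentGate(fake_cmp, ttl_seconds=60)
first = gate.is_permitted("cust-1", "support_chat")
second = gate.is_permitted("cust-1", "support_chat")  # served from the cache
denied = gate.is_permitted("cust-1", "model_training")
```

The TTL bounds how long a withdrawn consent can still be honoured from cache; calling `invalidate` on withdrawal events shortens that window further.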

On-Premises

Component	Technology
PII Detection	Microsoft Presidio (open source); spaCy with custom NER models
Redaction Pipeline	Apache NiFi with custom processors
Pseudonym Key Store	HashiCorp Vault (keys) + PostgreSQL encrypted (mappings)
Cross-Border Transfer Control	Network firewall egress rules; air-gapped inference environment
Local Inference	Ollama or vLLM on bare-metal GPU servers
Vector Store with Delete	Weaviate self-hosted; pgvector with row-level delete
Consent Integration	Custom REST API to CMP; Redis cache
Erasure Tracker	PostgreSQL + Temporal workflow engine
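
Every Pseudonym Key Store row above keeps keys and mappings in separate systems. The sketch below shows one way that separation could look in Python, using keyed hashing (HMAC-SHA256) from the standard library. The `Pseudonymiser` class, the `PSN_` token prefix, and the 16-character truncation are illustrative assumptions; the dictionary stands in for the mapping store, which in a real deployment would be encrypted as the tables indicate.

```python
import hashlib
import hmac
import secrets
from typing import Dict


class Pseudonymiser:
    """Replaces identifiers with keyed tokens; key and mappings live apart."""

    def __init__(self, key: bytes, mapping_store: Dict[str, str]) -> None:
        self._key = key                 # would come from the key store (e.g. Vault)
        self._mappings = mapping_store  # would be an encrypted database table

    def pseudonymise(self, value: str) -> str:
        digest = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        # Truncated for readability; a shorter token raises collision risk.
        token = f"PSN_{digest[:16]}"
        self._mappings[token] = value
        return token

    def reidentify(self, token: str) -> str:
        return self._mappings[token]

    def forget(self, token: str) -> None:
        """Remove the mapping so the token can no longer be reversed."""
        self._mappings.pop(token, None)


key = secrets.token_bytes(32)
mappings: Dict[str, str] = {}
pseudonymiser = Pseudonymiser(key, mappings)

token_a = pseudonymiser.pseudonymise("jane@example.com")
token_b = pseudonymiser.pseudonymise("jane@example.com")
original = pseudonymiser.reidentify(token_a)
```

Because the token is derived with a secret key, the same input yields the same token within one key's lifetime, so records stay joinable, while someone holding only the tokens cannot recompute them without the key.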

16. Related Patterns

Pattern ID	Pattern Name	Relationship	Notes
EAAPL-CMP002	APRA CPS234 AI Security	COMPLEMENTARY	CPS234 security controls protect personal data at rest and in transit; deploy alongside
EAAPL-CMP003	EU AI Act Compliance	COMPLEMENTARY	GDPR obligations are prerequisites for EU AI Act Article 10 data governance compliance
EAAPL-CMP007	Data Residency for AI	PREREQUISITE	Residency controls must be established before the cross-border transfer check can be implemented
EAAPL-CMP008	GDPR-Compliant AI	EXTENSION	Extends this pattern with GDPR Article 22 controls for automated decision-making
EAAPL-AGT003	Human-in-the-Loop Oversight	COMPLEMENTARY	HITL oversight is required for AI decisions with significant privacy impact on individuals
EAAPL-PLT005	AI Data Governance	PREREQUISITE	Data classification and purpose tagging must be operational before this pattern can be deployed

17. Maturity Assessment

Overall Maturity Label: Proven

Dimension	Level 1	Level 2	Level 3	Level 4	Level 5	Current Level
PII Detection	No detection	Manual review	Automated NER; known PII categories	Multi-jurisdiction PII; custom entity models; audit sampling	Near-zero miss rate; continuous model improvement	Level 3–4
Purpose Limitation	No controls	Documented policy only	Policy engine; purpose tags enforced	Real-time violation detection; automated blocking	ML-powered purpose inference for unlabelled data	Level 3
Erasure Handling	Cannot fulfil erasure	Manual deletion from primary DB	Automated deletion from logs and vectors	Machine unlearning capability for model weights	Complete erasure including model weights; <7 day SLA	Level 3
Cross-Border Controls	No controls	Ad-hoc SCC review	Automated transfer check; block or route	Real-time jurisdiction monitoring	Continuous adequacy monitoring; instant reroute	Level 3
Privacy Governance	No PIA process	PIAs conducted occasionally	PIAs triggered for all qualifying AI projects	Privacy metrics in AI project KPIs	Privacy risk automated into CI/CD pipeline	Level 3

18. Revision History

Version	Date	Author	Changes
1.0	2025-04-01	EAAPL Working Group	Initial draft
1.1	2025-07-20	EAAPL Working Group	Added GDPR Article 22 automated decisions detail; expanded cross-border section
1.2	2025-10-05	EAAPL Working Group	Added machine unlearning section; updated Australian PII categories
1.3	2026-02-15	EAAPL Working Group	Added OWASP LLM Top 10 privacy mapping; cascading failure scenario
1.4	2026-06-12	EAAPL Working Group	Updated cost ranges; added federated inference option; aligned with Privacy Act 2024 amendments
