EAAPLEnterprise AI Architecture Pattern Library
EAAPLLibraryAI SecurityEAAPL-SEC005
EAAPL-SEC005Proven
⇄ Compare

LLM Input Sanitisation

🔐 AI SecurityAPRA CPS234EU AI Act🏭 Field-tested in AU

[EAAPL-SEC005] LLM Input Sanitisation

Category: Security / Data Protection Sub-category: Pre-Processing Pipeline Version: 1.2 Maturity: Proven Tags: pii-detection data-redaction input-validation token-budget schema-validation content-inspection privacy Regulatory Relevance: Australian Privacy Act 1988, GDPR Art. 25 (Privacy by Design), APRA CPS234, EU AI Act Art. 10, NIST AI RMF MAP 1.5


1. Executive Summary

LLM Input Sanitisation is a pre-processing pipeline that transforms raw application inputs into safe, compliant, and policy-conformant prompts before they reach a large language model. Where the Prompt Firewall (EAAPL-SEC002) focuses on adversarial intent detection, Input Sanitisation focuses on data governance: ensuring that sensitive information, PII, and confidential context does not reach model providers without appropriate controls.

For organisations in regulated industries, the business imperative is clear: sending personally identifiable information, protected health information, or financial account details to a commercial LLM API may constitute an unauthorised disclosure under privacy legislation. Input sanitisation enforces privacy by design at the AI layer — detecting and redacting sensitive data fields before they leave the organisation's security boundary.

Beyond privacy, this pattern provides: token budget enforcement (preventing prompt bloat that drives cost overruns), schema validation (ensuring prompts conform to expected structure, preventing context injection), and malicious content detection (detecting attempts to smuggle harmful content into the model's context through user-provided data). The pattern is deployed as a pipeline stage within the AI Gateway and operates on assembled prompts, after application-level prompt construction but before model provider submission.


2. Problem Statement

Business Problem

Organisations building AI features routinely construct prompts that include user data: customer names, email addresses, account numbers, medical histories, financial transactions. In many implementations, this data flows directly into model API calls without sanitisation. The consequences:

  • Privacy breaches when PII is included in prompts sent to commercial model providers whose data handling agreements may not cover the specific use.
  • Regulatory violations (Privacy Act, GDPR) if personal information is disclosed to third parties without appropriate consent or contractual protection.
  • Confidential business information in prompts potentially available to model provider staff during safety review processes.
  • Token waste from verbose, unsanitised prompts inflating costs.

Technical Problem

Application developers building AI features rarely have deep expertise in PII detection or data classification. They construct prompts using string templates that include whatever fields are available in the data object — often including fields that should not be sent to the model. Without a centralised sanitisation layer, each application team must independently implement PII detection, which leads to inconsistent coverage and inevitable gaps.

Additionally, prompts can grow unboundedly through: accumulated conversation history, large document chunks from RAG systems, verbose user inputs, and multiple context injections. Without token budget enforcement, prompts exceed model context windows (causing errors) or consume excessive tokens (driving cost overruns).

Symptoms

  • Customer PII appearing in model provider usage logs (discovered during vendor audit).
  • Prompts regularly exceeding model context limits, causing application errors.
  • Different applications sending different categories of data to models with no central policy.
  • No mechanism to audit what data has been sent to model providers.
  • Token costs unexpectedly high due to verbose, unsanitised prompts.

Cost of Inaction

Dimension Impact
Regulatory Privacy Act / GDPR breach from PII disclosure; potential notification obligation and regulatory fine
Reputational Customer trust erosion if PII disclosure becomes public
Financial Token cost overruns from unsanitised verbose prompts; regulatory fines
Security Confidential business logic, credentials, or trade secrets embedded in prompts
Operational Application errors from context window overflow; no visibility into what data reaches models

3. Context

When to Apply

  • Any AI application that constructs prompts including user-provided data, database records, or document content.
  • Applications sending prompts to external (commercial) model provider APIs.
  • RAG systems where document chunks are injected into prompts.
  • Conversational AI systems accumulating multi-turn context.
  • Regulated industries where data handling obligations apply to AI pipelines.

When NOT to Apply

  • Fully offline inference where the model is deployed within the organisation's own security boundary and no data leaves.
  • AI applications processing only non-sensitive, fully public data with no user context.
  • Development/sandbox environments processing synthetic data only.

Prerequisites

Prerequisite Detail
AI Gateway (EAAPL-SEC001) Sanitisation pipeline is a stage within the gateway
PII Detection Library Microsoft Presidio, AWS Comprehend PII, or equivalent
Data Classification Schema Organisation's data classification policy codified into detectable entity types
Token Counter Tokeniser for each supported model family (tiktoken for OpenAI, Anthropic tokeniser, etc.)

Industry Applicability

Industry Applicability Key Driver
Financial Services Critical Account numbers, transaction data, financial advice context
Healthcare Critical PHI (names, DOB, diagnoses, medications, insurance) — HIPAA/Privacy Act
Legal / Professional Services High Privileged information; client confidentiality
Government High Citizen data; classified information controls
Retail / E-commerce High Customer PII; payment card data
HR / Talent Management High Employee data; performance reviews; compensation

4. Architecture Overview

The LLM Input Sanitisation pipeline operates on the fully assembled prompt — after application code has constructed it but before it is forwarded to the model provider. This placement is intentional: sanitisation must occur at a point where the complete prompt context is available (system message + conversation history + user input + RAG context), not in the application before assembly, because PII can appear in any component.

Stage 1: Structural Analysis

The pipeline begins by parsing the prompt structure: identifying system message, user turns, assistant turns, and injected context blocks. This structural awareness is critical because different prompt components warrant different sanitisation policies — system messages may intentionally contain structured data while user inputs should be more aggressively sanitised.

Stage 2: Entity Detection and Classification

The PII detection engine runs structured entity recognition across all prompt components. Detection uses multiple techniques in combination:

  • Pattern-based detection: Regular expressions for highly structured PII (credit card numbers, Australian Tax File Numbers, Medicare numbers, phone numbers, email addresses, IP addresses).
  • NER-based detection: Named Entity Recognition models (spaCy, Presidio) for names, organisations, addresses, dates.
  • Context-aware detection: Recognising that "My name is [X]" followed by a word in a name format context is likely a personal name.
  • Custom entity types: Organisation-specific entity types (internal account numbers, employee IDs, proprietary codes) registered in the detection configuration.

Detected entities are classified by type and sensitivity. Not all PII is treated identically — a first name in a customer service context may be acceptable while a full name combined with account number and DOB constitutes high-sensitivity data.

Stage 3: Redaction Strategy

Detected sensitive entities are processed according to the configured redaction strategy per entity type:

  • Replacement: Entity replaced with a type label [PERSON_NAME], [ACCOUNT_NUMBER], [EMAIL_ADDRESS]. The model can still understand the structure of the prompt without the sensitive value.
  • Pseudonymisation: Entity replaced with a consistent pseudonym (same entity gets the same pseudonym within a session), allowing the model to reason about relationships without knowing the actual value. Pseudonym mapping stored server-side, not in the prompt.
  • Hashing: Entity replaced with a short hash for entity correlation without disclosure.
  • Removal: Entity removed entirely (for very high-sensitivity fields that add no reasoning value).

Redaction decisions are logged for audit: which entities were detected, which redaction strategy was applied, and a hash of the original value for post-hoc investigation if needed.

Stage 4: Token Budget Enforcement

After PII detection, the sanitised prompt is tokenised and measured against the configured token budget for the request type. If the prompt exceeds the budget:

  • Conversation history is truncated (oldest turns removed first) while preserving the system message and current user turn.
  • RAG context blocks are truncated (lowest-relevance chunks removed first, if relevance scores are available).
  • If the prompt still exceeds budget after truncation, the request is rejected with a clear error to the calling application.

Token budget enforcement protects both cost (over-budget requests waste tokens) and model performance (prompts near the context limit produce degraded outputs).

Stage 5: Schema Validation

The final sanitised prompt is validated against a schema for the specific use case. Schema validation catches context injection attempts: if the application template expects {user_question} but the user has provided content that looks like an additional system instruction, schema validation detects the structural anomaly. This is a lightweight but effective complement to the Prompt Firewall's semantic injection detection.


5. Architecture Diagram

ARCHITECTURE DIAGRAM
flowchart TD subgraph Input["Prompt Input"] A[Assembled Prompt] B[Structural Parser] end subgraph Sanitisation["Sanitisation Pipeline"] C[Entity Detection] D[Redaction Engine] E{Token Budget Check} F[Schema Validation] end subgraph Output["Delivery + Audit"] G[Sanitised Prompt] H[Redaction Audit Log] I[Reject + Alert] end A --> B --> C C -->|PII found| D --> E C -->|clean| E E -->|over budget| D E -->|within budget| F F -->|valid| G F -->|invalid| I C --> H G --> J[AI Gateway] style A fill:#dbeafe,stroke:#3b82f6 style B fill:#f0fdf4,stroke:#22c55e style C fill:#f0fdf4,stroke:#22c55e style D fill:#f0fdf4,stroke:#22c55e style E fill:#f3e8ff,stroke:#a855f7 style F fill:#f0fdf4,stroke:#22c55e style G fill:#d1fae5,stroke:#10b981 style H fill:#fef9c3,stroke:#eab308 style I fill:#fee2e2,stroke:#ef4444 style J fill:#fef9c3,stroke:#eab308

6. Components

Component Type Responsibility Technology Options Criticality
Structural Parser Parsing Identifies prompt components (system/user/assistant/context) for targeted sanitisation Custom parser, OpenAI message format parser, Anthropic XML tag parser High
PII Detection Engine NLP Multi-technique PII and sensitive entity detection Microsoft Presidio, AWS Comprehend PII, Google DLP, spaCy + custom models Critical
Redaction Engine Transformation Applies configured redaction strategy per entity type; maintains pseudonym mapping Custom transformation layer, Presidio anonymiser, custom in-memory pseudonym store Critical
Token Counter Measurement Counts tokens in assembled prompt using model-specific tokeniser tiktoken (OpenAI), Anthropic tokeniser, HuggingFace tokenisers High
Truncation Engine Transformation Removes conversation history and RAG context in priority order to meet token budget Custom priority truncator (history by age, RAG by relevance score) High
Schema Validator Security Validates structural integrity of prompt against registered template schema JSON Schema validator, Pydantic, custom template validator Medium
Entity Type Registry Configuration Catalogue of detectable entity types with detection patterns and custom types YAML/JSON config, Presidio registry, custom configuration service Critical
Redaction Policy Store Configuration Maps entity types to redaction strategies per data classification level YAML/JSON config, OPA data document Critical
Pseudonym Store State Session-scoped mapping of original values to consistent pseudonyms Redis (session TTL), in-memory map Medium
Sanitisation Audit Log Compliance Records all redaction events for audit and investigation Kafka → immutable log, same pipeline as AI Gateway audit log Critical

7. Data Flow

Primary Flow

Step Actor Action Output
1 Application / Gateway Submits assembled prompt to sanitisation pipeline Full prompt text with all components
2 Structural Parser Identifies prompt components; extracts system message, user turns, RAG context blocks Tagged prompt structure
3 PII Detection Engine Runs pattern matching + NER across all components; identifies entity spans with type and confidence List of detected entities: (type, span, confidence, component)
4 Redaction Engine Applies redaction strategy per entity type; generates consistent pseudonyms if required Sanitised prompt with entities replaced; audit records of each redaction
5 Token Counter Counts tokens in sanitised prompt using model-appropriate tokeniser Token count
6 Truncation Engine If over budget: removes oldest history turns, then lowest-relevance RAG chunks; re-counts tokens Truncated prompt within budget
7 Schema Validator Validates final prompt structure against registered template schema VALID or INVALID (with violation detail)
8 Sanitisation Audit Logger Records: entity types detected, redaction strategies applied, value hashes, token count before/after, truncations Audit record
9 AI Gateway Receives sanitised, schema-valid, on-budget prompt Forwards to model provider

Error Flow

Error Handling Status Alert
Critical PII detected (SSN, passport) and redaction fails Reject request 400 Security: failed redaction of critical entity
Token budget exceeded after maximum truncation Reject request 400 Warning: prompt too large even after truncation
Schema validation failure (injection indicator) Reject request 400 Security: context injection detected
PII detection model unavailable Fail closed: block request if PII detection is required by policy; or fail-open with alert for non-regulated paths 503 / degraded Critical: PII detection unavailable

8. Security Considerations

Authentication & Authorisation

  • Sanitisation pipeline is an internal component; access controlled by AI Gateway (not exposed directly to applications).
  • Pseudonym mapping store access restricted to the sanitisation service; no application can retrieve original values from pseudonyms.

Secrets Management

  • Commercial PII detection API credentials (if used) managed per EAAPL-SEC008.
  • Pseudonym mapping keys encrypted at rest; scoped to session TTL.

Data Classification

  • Sanitisation policies are classification-aware: data at higher sensitivity levels triggers more aggressive redaction.
  • Prompt classification label is attached after sanitisation (indicating residual sensitivity after redaction).

Encryption

  • All pipeline communication in transit over TLS 1.3.
  • Sanitisation audit log encrypted at rest; includes entity type and value hash but not original sensitive values.
  • Pseudonym store contents encrypted at rest.

OWASP LLM Top 10 Coverage

OWASP LLM Risk Input Sanitisation Mitigation Coverage
LLM01: Prompt Injection Schema validation provides structural injection detection; complements SEC002 semantic detection Medium
LLM02: Insecure Output Handling Prevents PII from entering prompts, reducing PII leakage risk in outputs High (upstream)
LLM03: Training Data Poisoning Not applicable to inference-time pipeline None
LLM04: Model Denial of Service Token budget enforcement prevents resource-exhausting over-long prompts High
LLM05: Supply Chain Vulnerabilities Not directly applicable None
LLM06: Sensitive Information Disclosure Core purpose: remove PII before prompt reaches model provider Critical
LLM07: Insecure Plugin Design Not directly applicable None
LLM08: Excessive Agency Removing PII from context limits agent's ability to act on personal information Medium
LLM09: Overreliance Not applicable None
LLM10: Model Theft Pseudonymisation prevents training data extraction by removing identifying information Medium

9. Governance Considerations

Responsible AI

  • Privacy by design: PII is removed from AI inputs by default, not as an afterthought.
  • Redaction decisions must be reviewed for fairness: aggressive redaction that removes cultural names or non-Western name formats may degrade AI quality for certain users disproportionately.

Governance Artefacts

Artefact Owner Frequency Purpose
PII Detection Coverage Report Privacy Team Quarterly Documents which entity types are detected; coverage gaps
Redaction Audit Log Compliance Continuous; monthly review Evidence of PII sanitisation for Privacy Act compliance
Token Budget Review AI Platform Monthly Ensures budgets are appropriate; reviews truncation frequency
False-Negative Analysis Privacy + AI Platform Quarterly Samples of prompts to verify PII not slipping through detection
Entity Registry Update Log AI Platform With each update Records new entity types added; rationale

10. Operational Considerations

SLOs

SLO Target Measurement
Sanitisation pipeline latency p99 <50ms Pipeline entry → exit span
PII detection recall (known entity types) >98% Monthly test suite against labelled samples
Token budget enforcement accuracy 100% (no over-budget prompts reach model) Token count metric on outbound prompts
Redaction audit record durability 100% Dead-letter queue monitoring

Incident Management

  • PII detected in model provider response (indicating sanitisation miss) → P1: Privacy incident; investigate detection gap; notify privacy team.
  • Sanitisation pipeline degraded (PII detection unavailable) → P2 if fail-open; P1 if regulated data pathway.
  • Unusual spike in redaction volume → Investigate: may indicate a new data integration sending unexpected PII.

11. Cost Considerations

Cost Drivers

Cost Driver Description Relative Impact
NER model inference CPU/GPU compute for PII detection; dominates pipeline cost High
Token counting Trivial CPU cost Very Low
Pseudonym store Redis memory; modest at typical session volumes Low
Commercial PII API (if used) AWS Comprehend, Google DLP per-request pricing Medium

Indicative Cost Range

Scale Monthly Cost (USD) Notes
Small (< 1M requests/day) $300–$800 CPU inference (Presidio); local NER model
Medium (1M–20M requests/day) $2,000–$8,000 GPU inference cluster; Redis cluster
Large (> 20M requests/day) $10,000–$30,000 Multi-region GPU inference; custom NER fine-tuning

12. Trade-Off Analysis

Option Comparison

Option Description Pros Cons Best For
A: Pattern-only detection Regex-based PII detection (SSN, phone, email patterns) Very fast; deterministic; zero ML dependencies Misses unstructured PII (free-text names, addresses); high false-negative rate Non-regulated applications; fast PoC
B: NER-based detection (this pattern) spaCy/Presidio NER + patterns High recall for structured and unstructured PII; industry standard Requires ML model; language-dependent; some false positives Regulated applications; production privacy-by-design
C: Cloud-native DLP AWS Comprehend, Google DLP, Azure Purview Managed; continuously updated; low operational overhead Sends prompt content to cloud (data residency risk); per-request cost; limited customisation Cloud-committed organisations; non-sensitive baseline
D: LLM-based PII detection Use a smaller LLM to detect PII in the input prompt Flexible; handles complex context Adds significant latency (LLM call before LLM call); cost; introduces recursive risk Research; specialised high-accuracy requirements

Architectural Tensions

Tension Trade-Off
Recall vs Latency Higher-accuracy NER models (larger, slower) detect more PII but add more latency. Resolution: use distilled NER models (spaCy sm) for high-throughput paths; full models for sensitive data pathways.
Redaction vs Utility Aggressive redaction reduces PII risk but may reduce the model's ability to provide useful responses (e.g., replacing a customer's name makes personalisation impossible). Resolution: pseudonymisation preserves reasoning utility while removing identifying values.
Centralisation vs Application Context A shared sanitisation pipeline lacks knowledge of what PII is intentional vs accidental in a specific application's context. Resolution: per-application redaction profiles that can whitelist certain entity types for specific use cases.

13. Failure Modes

Failure Likelihood Impact Detection Recovery
NER model false negative (misses PII) Medium High (PII reaches model provider) Post-hoc audit of sampled prompts; model output PII detection Update entity detection patterns; retrain NER model
Pseudonym store full (Redis OOM) Low Medium (pseudonymisation falls back to replacement) Redis memory metrics Evict oldest sessions; scale Redis memory
Token budget too tight (excessive truncation) Medium Medium (degraded AI output quality) Truncation frequency metric; quality regression Review and increase token budgets; improve prompt efficiency
Pipeline latency spike (NER model overloaded) Medium High (AI Gateway SLO breach) Pipeline latency metric Autoscale NER inference; horizontal scaling
Schema false positive (blocks legitimate prompt) Low Medium (user-facing error) 400 error rate from schema validation Tune schema; add to schema allow list

14. Regulatory Considerations

Regulation Requirement Implementation
Australian Privacy Act 1988 — APP 11 Take reasonable steps to protect personal information Automated PII detection and redaction before third-party model provider submission
GDPR Art. 25 (Privacy by Design) Implement appropriate technical measures to implement data protection principles PII detection and pseudonymisation pipeline implements technical data protection by design
GDPR Art. 28 (Processor obligations) Data processor must implement appropriate security measures Model provider is a processor; sanitisation limits personal data shared with processor
EU AI Act Art. 10 (Data Governance) Training and input data must meet quality criteria; data governance practices Sanitisation pipeline implements input data governance
HIPAA Technical Safeguards Technical safeguards to protect PHI in electronic transmissions Automatic PHI detection and redaction before external model API calls
APRA CPS234 §21 Information security controls for third-party dependencies Sanitisation limits sensitive data exposure to model provider third parties

15. Reference Implementations

AWS

Component AWS Service / OSS
PII detection Amazon Comprehend PII + custom entity types, or Presidio on ECS
Redaction engine Custom Lambda + Presidio anonymiser
Token counting tiktoken (Lambda layer)
Pseudonym store ElastiCache Redis
Audit logging Kinesis Firehose → S3 (Object Lock)

Azure

Component Azure Service / OSS
PII detection Azure AI Language PII detection + Presidio
Redaction Custom Azure Function
Token counting Custom tiktoken deployment
Pseudonym store Azure Cache for Redis
Audit logging Event Hub → Immutable Blob Storage

On-Premises

Component Technology
PII detection Microsoft Presidio (self-hosted) + spaCy models
Redaction engine Presidio anonymiser with custom operators
Token counting tiktoken + HuggingFace tokenisers
Pseudonym store Redis Cluster
Audit logging Kafka → Elasticsearch

Pattern ID Relationship
AI Gateway EAAPL-SEC001 SEC005 is a pipeline stage within the gateway
Prompt Firewall EAAPL-SEC002 Complementary: SEC002 detects adversarial intent; SEC005 handles data governance
AI Output Filtering EAAPL-SEC006 Defence pair: SEC005 prevents PII entering; SEC006 detects PII leaking in outputs
AI Data Classification EAAPL-SEC009 Classification labels from SEC009 inform SEC005 redaction policy selection
Zero-Trust AI Pipeline EAAPL-SEC007 SEC005 implements the data-governance stage of the zero-trust pipeline

17. Maturity Assessment

Overall Maturity: Proven

Dimension Score (1–5) Rationale
Pattern definition clarity 5 Well-defined stages and clear privacy objective
Technology availability 5 Microsoft Presidio, AWS Comprehend, Google DLP are all production-ready
Industry adoption 4 Widely adopted in financial services and healthcare AI deployments
NER model quality 4 Strong for English; multilingual support requires additional configuration
Regulatory alignment 5 Directly addresses Privacy Act, GDPR, and HIPAA requirements
Operational tooling 4 Presidio provides strong operational foundation; custom entity types require engineering

18. Revision History

Version Date Author Changes
1.0 2024-02-20 Security Architecture Team Initial pattern definition
1.1 2024-06-10 Security Architecture Team Added pseudonymisation strategy; token budget enforcement detail
1.2 2024-12-01 Security Architecture Team Updated regulatory mapping; added Australian Privacy Act specific guidance; expanded failure modes
← Back to LibraryMore AI Security