EAAPL-SEC005Proven

LLM Input Sanitisation

🔐 AI SecurityAPRA CPS234EU AI Act🏭 Field-tested in AU

[EAAPL-SEC005] LLM Input Sanitisation

Category: Security / Data Protection Sub-category: Pre-Processing Pipeline Version: 1.2 Maturity: Proven Tags: pii-detection data-redaction input-validation token-budget schema-validation content-inspection privacy Regulatory Relevance: Australian Privacy Act 1988, GDPR Art. 25 (Privacy by Design), APRA CPS234, EU AI Act Art. 10, NIST AI RMF MAP 1.5

1. Executive Summary

LLM Input Sanitisation is a pre-processing pipeline that transforms raw application inputs into safe, compliant, and policy-conformant prompts before they reach a large language model. Where the Prompt Firewall (EAAPL-SEC002) focuses on adversarial intent detection, Input Sanitisation focuses on data governance: ensuring that sensitive information, PII, and confidential context does not reach model providers without appropriate controls.

For organisations in regulated industries, the business imperative is clear: sending personally identifiable information, protected health information, or financial account details to a commercial LLM API may constitute an unauthorised disclosure under privacy legislation. Input sanitisation enforces privacy by design at the AI layer — detecting and redacting sensitive data fields before they leave the organisation's security boundary.

Beyond privacy, this pattern provides: token budget enforcement (preventing prompt bloat that drives cost overruns), schema validation (ensuring prompts conform to expected structure, preventing context injection), and malicious content detection (detecting attempts to smuggle harmful content into the model's context through user-provided data). The pattern is deployed as a pipeline stage within the AI Gateway and operates on assembled prompts, after application-level prompt construction but before model provider submission.

2. Problem Statement

Business Problem

Organisations building AI features routinely construct prompts that include user data: customer names, email addresses, account numbers, medical histories, financial transactions. In many implementations, this data flows directly into model API calls without sanitisation. The consequences:

Privacy breaches when PII is included in prompts sent to commercial model providers whose data handling agreements may not cover the specific use.
Regulatory violations (Privacy Act, GDPR) if personal information is disclosed to third parties without appropriate consent or contractual protection.
Confidential business information in prompts potentially available to model provider staff during safety review processes.
Token waste from verbose, unsanitised prompts inflating costs.

Technical Problem

Application developers building AI features rarely have deep expertise in PII detection or data classification. They construct prompts using string templates that include whatever fields are available in the data object — often including fields that should not be sent to the model. Without a centralised sanitisation layer, each application team must independently implement PII detection, which leads to inconsistent coverage and inevitable gaps.

Additionally, prompts can grow unboundedly through: accumulated conversation history, large document chunks from RAG systems, verbose user inputs, and multiple context injections. Without token budget enforcement, prompts exceed model context windows (causing errors) or consume excessive tokens (driving cost overruns).

Symptoms

Customer PII appearing in model provider usage logs (discovered during vendor audit).
Prompts regularly exceeding model context limits, causing application errors.
Different applications sending different categories of data to models with no central policy.
No mechanism to audit what data has been sent to model providers.
Token costs unexpectedly high due to verbose, unsanitised prompts.

Cost of Inaction

Dimension	Impact
Regulatory	Privacy Act / GDPR breach from PII disclosure; potential notification obligation and regulatory fine
Reputational	Customer trust erosion if PII disclosure becomes public
Financial	Token cost overruns from unsanitised verbose prompts; regulatory fines
Security	Confidential business logic, credentials, or trade secrets embedded in prompts
Operational	Application errors from context window overflow; no visibility into what data reaches models

3. Context

When to Apply

Any AI application that constructs prompts including user-provided data, database records, or document content.
Applications sending prompts to external (commercial) model provider APIs.
RAG systems where document chunks are injected into prompts.
Conversational AI systems accumulating multi-turn context.
Regulated industries where data handling obligations apply to AI pipelines.

When NOT to Apply

Fully offline inference where the model is deployed within the organisation's own security boundary and no data leaves.
AI applications processing only non-sensitive, fully public data with no user context.
Development/sandbox environments processing synthetic data only.

Prerequisites

Prerequisite	Detail
AI Gateway (EAAPL-SEC001)	Sanitisation pipeline is a stage within the gateway
PII Detection Library	Microsoft Presidio, AWS Comprehend PII, or equivalent
Data Classification Schema	Organisation's data classification policy codified into detectable entity types
Token Counter	Tokeniser for each supported model family (tiktoken for OpenAI, Anthropic tokeniser, etc.)

Industry Applicability

Industry	Applicability	Key Driver
Financial Services	Critical	Account numbers, transaction data, financial advice context
Healthcare	Critical	PHI (names, DOB, diagnoses, medications, insurance) — HIPAA/Privacy Act
Legal / Professional Services	High	Privileged information; client confidentiality
Government	High	Citizen data; classified information controls
Retail / E-commerce	High	Customer PII; payment card data
HR / Talent Management	High	Employee data; performance reviews; compensation

4. Architecture Overview

The LLM Input Sanitisation pipeline operates on the fully assembled prompt — after application code has constructed it but before it is forwarded to the model provider. This placement is intentional: sanitisation must occur at a point where the complete prompt context is available (system message + conversation history + user input + RAG context), not in the application before assembly, because PII can appear in any component.

Stage 1: Structural Analysis

The pipeline begins by parsing the prompt structure: identifying system message, user turns, assistant turns, and injected context blocks. This structural awareness is critical because different prompt components warrant different sanitisation policies — system messages may intentionally contain structured data while user inputs should be more aggressively sanitised.

Stage 2: Entity Detection and Classification

The PII detection engine runs structured entity recognition across all prompt components. Detection uses multiple techniques in combination:

Pattern-based detection: Regular expressions for highly structured PII (credit card numbers, Australian Tax File Numbers, Medicare numbers, phone numbers, email addresses, IP addresses).
NER-based detection: Named Entity Recognition models (spaCy, Presidio) for names, organisations, addresses, dates.
Context-aware detection: Recognising that "My name is [X]" followed by a word in a name format context is likely a personal name.
Custom entity types: Organisation-specific entity types (internal account numbers, employee IDs, proprietary codes) registered in the detection configuration.

Detected entities are classified by type and sensitivity. Not all PII is treated identically — a first name in a customer service context may be acceptable while a full name combined with account number and DOB constitutes high-sensitivity data.

Stage 3: Redaction Strategy

Detected sensitive entities are processed according to the configured redaction strategy per entity type:

Replacement: Entity replaced with a type label [PERSON_NAME], [ACCOUNT_NUMBER], [EMAIL_ADDRESS]. The model can still understand the structure of the prompt without the sensitive value.
Pseudonymisation: Entity replaced with a consistent pseudonym (same entity gets the same pseudonym within a session), allowing the model to reason about relationships without knowing the actual value. Pseudonym mapping stored server-side, not in the prompt.
Hashing: Entity replaced with a short hash for entity correlation without disclosure.
Removal: Entity removed entirely (for very high-sensitivity fields that add no reasoning value).

Redaction decisions are logged for audit: which entities were detected, which redaction strategy was applied, and a hash of the original value for post-hoc investigation if needed.

Stage 4: Token Budget Enforcement

After PII detection, the sanitised prompt is tokenised and measured against the configured token budget for the request type. If the prompt exceeds the budget:

Conversation history is truncated (oldest turns removed first) while preserving the system message and current user turn.
RAG context blocks are truncated (lowest-relevance chunks removed first, if relevance scores are available).
If the prompt still exceeds budget after truncation, the request is rejected with a clear error to the calling application.

Token budget enforcement protects both cost (over-budget requests waste tokens) and model performance (prompts near the context limit produce degraded outputs).

Stage 5: Schema Validation

The final sanitised prompt is validated against a schema for the specific use case. Schema validation catches context injection attempts: if the application template expects {user_question} but the user has provided content that looks like an additional system instruction, schema validation detects the structural anomaly. This is a lightweight but effective complement to the Prompt Firewall's semantic injection detection.

5. Architecture Diagram

ARCHITECTURE DIAGRAM

flowchart TD subgraph Input["Prompt Input"] A[Assembled Prompt] B[Structural Parser] end subgraph Sanitisation["Sanitisation Pipeline"] C[Entity Detection] D[Redaction Engine] E{Token Budget Check} F[Schema Validation] end subgraph Output["Delivery + Audit"] G[Sanitised Prompt] H[Redaction Audit Log] I[Reject + Alert] end A --> B --> C C -->|PII found| D --> E C -->|clean| E E -->|over budget| D E -->|within budget| F F -->|valid| G F -->|invalid| I C --> H G --> J[AI Gateway] style A fill:#dbeafe,stroke:#3b82f6 style B fill:#f0fdf4,stroke:#22c55e style C fill:#f0fdf4,stroke:#22c55e style D fill:#f0fdf4,stroke:#22c55e style E fill:#f3e8ff,stroke:#a855f7 style F fill:#f0fdf4,stroke:#22c55e style G fill:#d1fae5,stroke:#10b981 style H fill:#fef9c3,stroke:#eab308 style I fill:#fee2e2,stroke:#ef4444 style J fill:#fef9c3,stroke:#eab308

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Structural Parser	Parsing	Identifies prompt components (system/user/assistant/context) for targeted sanitisation	Custom parser, OpenAI message format parser, Anthropic XML tag parser	High
PII Detection Engine	NLP	Multi-technique PII and sensitive entity detection	Microsoft Presidio, AWS Comprehend PII, Google DLP, spaCy + custom models	Critical
Redaction Engine	Transformation	Applies configured redaction strategy per entity type; maintains pseudonym mapping	Custom transformation layer, Presidio anonymiser, custom in-memory pseudonym store	Critical
Token Counter	Measurement	Counts tokens in assembled prompt using model-specific tokeniser	tiktoken (OpenAI), Anthropic tokeniser, HuggingFace tokenisers	High
Truncation Engine	Transformation	Removes conversation history and RAG context in priority order to meet token budget	Custom priority truncator (history by age, RAG by relevance score)	High
Schema Validator	Security	Validates structural integrity of prompt against registered template schema	JSON Schema validator, Pydantic, custom template validator	Medium
Entity Type Registry	Configuration	Catalogue of detectable entity types with detection patterns and custom types	YAML/JSON config, Presidio registry, custom configuration service	Critical
Redaction Policy Store	Configuration	Maps entity types to redaction strategies per data classification level	YAML/JSON config, OPA data document	Critical
Pseudonym Store	State	Session-scoped mapping of original values to consistent pseudonyms	Redis (session TTL), in-memory map	Medium
Sanitisation Audit Log	Compliance	Records all redaction events for audit and investigation	Kafka → immutable log, same pipeline as AI Gateway audit log	Critical

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	Application / Gateway	Submits assembled prompt to sanitisation pipeline	Full prompt text with all components
2	Structural Parser	Identifies prompt components; extracts system message, user turns, RAG context blocks	Tagged prompt structure
3	PII Detection Engine	Runs pattern matching + NER across all components; identifies entity spans with type and confidence	List of detected entities: (type, span, confidence, component)
4	Redaction Engine	Applies redaction strategy per entity type; generates consistent pseudonyms if required	Sanitised prompt with entities replaced; audit records of each redaction
5	Token Counter	Counts tokens in sanitised prompt using model-appropriate tokeniser	Token count
6	Truncation Engine	If over budget: removes oldest history turns, then lowest-relevance RAG chunks; re-counts tokens	Truncated prompt within budget
7	Schema Validator	Validates final prompt structure against registered template schema	VALID or INVALID (with violation detail)
8	Sanitisation Audit Logger	Records: entity types detected, redaction strategies applied, value hashes, token count before/after, truncations	Audit record
9	AI Gateway	Receives sanitised, schema-valid, on-budget prompt	Forwards to model provider

Error Flow

Error	Handling	Status	Alert
Critical PII detected (SSN, passport) and redaction fails	Reject request	400	Security: failed redaction of critical entity
Token budget exceeded after maximum truncation	Reject request	400	Warning: prompt too large even after truncation
Schema validation failure (injection indicator)	Reject request	400	Security: context injection detected
PII detection model unavailable	Fail closed: block request if PII detection is required by policy; or fail-open with alert for non-regulated paths	503 / degraded	Critical: PII detection unavailable

8. Security Considerations

Authentication & Authorisation

Sanitisation pipeline is an internal component; access controlled by AI Gateway (not exposed directly to applications).
Pseudonym mapping store access restricted to the sanitisation service; no application can retrieve original values from pseudonyms.

Secrets Management

Commercial PII detection API credentials (if used) managed per EAAPL-SEC008.
Pseudonym mapping keys encrypted at rest; scoped to session TTL.

Data Classification

Sanitisation policies are classification-aware: data at higher sensitivity levels triggers more aggressive redaction.
Prompt classification label is attached after sanitisation (indicating residual sensitivity after redaction).

Encryption

All pipeline communication in transit over TLS 1.3.
Sanitisation audit log encrypted at rest; includes entity type and value hash but not original sensitive values.
Pseudonym store contents encrypted at rest.

OWASP LLM Top 10 Coverage

OWASP LLM Risk	Input Sanitisation Mitigation	Coverage
LLM01: Prompt Injection	Schema validation provides structural injection detection; complements SEC002 semantic detection	Medium
LLM02: Insecure Output Handling	Prevents PII from entering prompts, reducing PII leakage risk in outputs	High (upstream)
LLM03: Training Data Poisoning	Not applicable to inference-time pipeline	None
LLM04: Model Denial of Service	Token budget enforcement prevents resource-exhausting over-long prompts	High
LLM05: Supply Chain Vulnerabilities	Not directly applicable	None
LLM06: Sensitive Information Disclosure	Core purpose: remove PII before prompt reaches model provider	Critical
LLM07: Insecure Plugin Design	Not directly applicable	None
LLM08: Excessive Agency	Removing PII from context limits agent's ability to act on personal information	Medium
LLM09: Overreliance	Not applicable	None
LLM10: Model Theft	Pseudonymisation prevents training data extraction by removing identifying information	Medium

9. Governance Considerations

Responsible AI

Privacy by design: PII is removed from AI inputs by default, not as an afterthought.
Redaction decisions must be reviewed for fairness: aggressive redaction that removes cultural names or non-Western name formats may degrade AI quality for certain users disproportionately.

Governance Artefacts

Artefact	Owner	Frequency	Purpose
PII Detection Coverage Report	Privacy Team	Quarterly	Documents which entity types are detected; coverage gaps
Redaction Audit Log	Compliance	Continuous; monthly review	Evidence of PII sanitisation for Privacy Act compliance
Token Budget Review	AI Platform	Monthly	Ensures budgets are appropriate; reviews truncation frequency
False-Negative Analysis	Privacy + AI Platform	Quarterly	Samples of prompts to verify PII not slipping through detection
Entity Registry Update Log	AI Platform	With each update	Records new entity types added; rationale

10. Operational Considerations

SLOs

SLO	Target	Measurement
Sanitisation pipeline latency p99	<50ms	Pipeline entry → exit span
PII detection recall (known entity types)	>98%	Monthly test suite against labelled samples
Token budget enforcement accuracy	100% (no over-budget prompts reach model)	Token count metric on outbound prompts
Redaction audit record durability	100%	Dead-letter queue monitoring

Incident Management

PII detected in model provider response (indicating sanitisation miss) → P1: Privacy incident; investigate detection gap; notify privacy team.
Sanitisation pipeline degraded (PII detection unavailable) → P2 if fail-open; P1 if regulated data pathway.
Unusual spike in redaction volume → Investigate: may indicate a new data integration sending unexpected PII.

11. Cost Considerations

Cost Drivers

Cost Driver	Description	Relative Impact
NER model inference	CPU/GPU compute for PII detection; dominates pipeline cost	High
Token counting	Trivial CPU cost	Very Low
Pseudonym store	Redis memory; modest at typical session volumes	Low
Commercial PII API (if used)	AWS Comprehend, Google DLP per-request pricing	Medium

Indicative Cost Range

Scale	Monthly Cost (USD)	Notes
Small (< 1M requests/day)	$300–$800	CPU inference (Presidio); local NER model
Medium (1M–20M requests/day)	$2,000–$8,000	GPU inference cluster; Redis cluster
Large (> 20M requests/day)	$10,000–$30,000	Multi-region GPU inference; custom NER fine-tuning

12. Trade-Off Analysis

Option Comparison

Option	Description	Pros	Cons	Best For
A: Pattern-only detection	Regex-based PII detection (SSN, phone, email patterns)	Very fast; deterministic; zero ML dependencies	Misses unstructured PII (free-text names, addresses); high false-negative rate	Non-regulated applications; fast PoC
B: NER-based detection (this pattern)	spaCy/Presidio NER + patterns	High recall for structured and unstructured PII; industry standard	Requires ML model; language-dependent; some false positives	Regulated applications; production privacy-by-design
C: Cloud-native DLP	AWS Comprehend, Google DLP, Azure Purview	Managed; continuously updated; low operational overhead	Sends prompt content to cloud (data residency risk); per-request cost; limited customisation	Cloud-committed organisations; non-sensitive baseline
D: LLM-based PII detection	Use a smaller LLM to detect PII in the input prompt	Flexible; handles complex context	Adds significant latency (LLM call before LLM call); cost; introduces recursive risk	Research; specialised high-accuracy requirements

Architectural Tensions

Tension	Trade-Off
Recall vs Latency	Higher-accuracy NER models (larger, slower) detect more PII but add more latency. Resolution: use distilled NER models (spaCy sm) for high-throughput paths; full models for sensitive data pathways.
Redaction vs Utility	Aggressive redaction reduces PII risk but may reduce the model's ability to provide useful responses (e.g., replacing a customer's name makes personalisation impossible). Resolution: pseudonymisation preserves reasoning utility while removing identifying values.
Centralisation vs Application Context	A shared sanitisation pipeline lacks knowledge of what PII is intentional vs accidental in a specific application's context. Resolution: per-application redaction profiles that can whitelist certain entity types for specific use cases.

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
NER model false negative (misses PII)	Medium	High (PII reaches model provider)	Post-hoc audit of sampled prompts; model output PII detection	Update entity detection patterns; retrain NER model
Pseudonym store full (Redis OOM)	Low	Medium (pseudonymisation falls back to replacement)	Redis memory metrics	Evict oldest sessions; scale Redis memory
Token budget too tight (excessive truncation)	Medium	Medium (degraded AI output quality)	Truncation frequency metric; quality regression	Review and increase token budgets; improve prompt efficiency
Pipeline latency spike (NER model overloaded)	Medium	High (AI Gateway SLO breach)	Pipeline latency metric	Autoscale NER inference; horizontal scaling
Schema false positive (blocks legitimate prompt)	Low	Medium (user-facing error)	400 error rate from schema validation	Tune schema; add to schema allow list

14. Regulatory Considerations

Regulation	Requirement	Implementation
Australian Privacy Act 1988 — APP 11	Take reasonable steps to protect personal information	Automated PII detection and redaction before third-party model provider submission
GDPR Art. 25 (Privacy by Design)	Implement appropriate technical measures to implement data protection principles	PII detection and pseudonymisation pipeline implements technical data protection by design
GDPR Art. 28 (Processor obligations)	Data processor must implement appropriate security measures	Model provider is a processor; sanitisation limits personal data shared with processor
EU AI Act Art. 10 (Data Governance)	Training and input data must meet quality criteria; data governance practices	Sanitisation pipeline implements input data governance
HIPAA Technical Safeguards	Technical safeguards to protect PHI in electronic transmissions	Automatic PHI detection and redaction before external model API calls
APRA CPS234 §21	Information security controls for third-party dependencies	Sanitisation limits sensitive data exposure to model provider third parties

15. Reference Implementations

AWS

Component	AWS Service / OSS
PII detection	Amazon Comprehend PII + custom entity types, or Presidio on ECS
Redaction engine	Custom Lambda + Presidio anonymiser
Token counting	tiktoken (Lambda layer)
Pseudonym store	ElastiCache Redis
Audit logging	Kinesis Firehose → S3 (Object Lock)

Azure

Component	Azure Service / OSS
PII detection	Azure AI Language PII detection + Presidio
Redaction	Custom Azure Function
Token counting	Custom tiktoken deployment
Pseudonym store	Azure Cache for Redis
Audit logging	Event Hub → Immutable Blob Storage

On-Premises

Component	Technology
PII detection	Microsoft Presidio (self-hosted) + spaCy models
Redaction engine	Presidio anonymiser with custom operators
Token counting	tiktoken + HuggingFace tokenisers
Pseudonym store	Redis Cluster
Audit logging	Kafka → Elasticsearch

Pattern	ID	Relationship
AI Gateway	EAAPL-SEC001	SEC005 is a pipeline stage within the gateway
Prompt Firewall	EAAPL-SEC002	Complementary: SEC002 detects adversarial intent; SEC005 handles data governance
AI Output Filtering	EAAPL-SEC006	Defence pair: SEC005 prevents PII entering; SEC006 detects PII leaking in outputs
AI Data Classification	EAAPL-SEC009	Classification labels from SEC009 inform SEC005 redaction policy selection
Zero-Trust AI Pipeline	EAAPL-SEC007	SEC005 implements the data-governance stage of the zero-trust pipeline

17. Maturity Assessment

Overall Maturity: Proven

Dimension	Score (1–5)	Rationale
Pattern definition clarity	5	Well-defined stages and clear privacy objective
Technology availability	5	Microsoft Presidio, AWS Comprehend, Google DLP are all production-ready
Industry adoption	4	Widely adopted in financial services and healthcare AI deployments
NER model quality	4	Strong for English; multilingual support requires additional configuration
Regulatory alignment	5	Directly addresses Privacy Act, GDPR, and HIPAA requirements
Operational tooling	4	Presidio provides strong operational foundation; custom entity types require engineering

18. Revision History

Version	Date	Author	Changes
1.0	2024-02-20	Security Architecture Team	Initial pattern definition
1.1	2024-06-10	Security Architecture Team	Added pseudonymisation strategy; token budget enforcement detail
1.2	2024-12-01	Security Architecture Team	Updated regulatory mapping; added Australian Privacy Act specific guidance; expanded failure modes

← Back to Library More AI Security →