
EAAPL-OBS002 · Prompt Monitoring

📊 Observability & Monitoring · 🏭 Field-tested in AU


Pattern ID: EAAPL-OBS002
Status: Proven
Complexity: Medium
Tags: observability, prompt-engineering, alerting, pii-handling, medium-complexity
Version: 1.0.0
Last Reviewed: 2026-06-12


1. Executive Summary

Prompts sent to large language models in production are the primary control surface for AI system behaviour, yet most organisations have no systematic visibility into what prompts are actually being sent, how they change over time, and whether they carry sensitive data. Prompt engineering changes are often deployed without telemetry, creating a class of silent regressions that appear as customer complaints rather than metric alerts.

This pattern defines continuous monitoring of prompts sent to LLMs in production environments. It covers: drift detection to identify unintended prompt distribution changes; anomaly alerting for PII exposure, injection attempts, jailbreak signatures, and abnormal prompt lengths; cost anomaly detection tied to prompt length trends; sensitive data protection through pre-log PII scanning; prompt version tracking with traffic distribution and performance comparison; and prompt performance analytics correlating prompt versions to success rates and user satisfaction. Together, these capabilities give engineering teams the same observability over their prompts that they expect over their code.

Target Audience: CIO, CTO, AI Engineering Lead, Platform Engineering Lead
Time to Implement: 4–8 weeks


2. Problem Statement

Business Problem

Prompt engineering is the fastest-changing layer in most AI systems, yet it has no equivalent of git blame, deployment monitoring, or regression alerting. When a prompt change degrades user experience, the signal comes from user complaints or NPS drops — not from a dashboard alert within minutes of deployment. Organisations cannot demonstrate to regulators which prompt version was active when a disputed AI output was generated.

Technical Problem

Prompts are assembled dynamically from templates, retrieved context, and user inputs. The result is that no two prompts are identical, making traditional change detection (file diffing) inapplicable. Statistical monitoring is required to detect when the distribution of prompts has shifted beyond normal variance. Additionally, user inputs injected into prompts can carry PII or adversarial content that bypasses application-layer controls and reaches the model API — where it may be logged by the provider in violation of data agreements.

Symptoms

  • Prompt template changes deployed to production with no performance comparison
  • Prompt injection attacks detected only through customer complaints or model output review, not automated detection
  • Average prompt length increasing 30% over 3 months, driving cost increases, with no alert triggered
  • Data breach inquiry reveals customer PII was included in prompts sent to third-party model API
  • Different API gateway replicas running different prompt versions simultaneously with no visibility

Cost of Inaction

  • Silent prompt regressions persist for days to weeks, degrading user experience for all affected requests
  • PII in prompts sent to third-party providers can constitute an unauthorised disclosure of personal information, which APP 11 of the Privacy Act requires organisations to take reasonable steps to prevent
  • Prompt injection attacks succeed silently, potentially exfiltrating context window data
  • Inability to demonstrate version-controlled AI behaviour to regulators can result in a CPS 234 audit finding

3. Context

When to Apply

  • Any production system using dynamic prompt templates with variable context injection
  • Systems where multiple prompt versions may be active simultaneously (A/B testing, staged rollouts)
  • Any system sending user-provided content as part of prompts to external model APIs
  • Organisations subject to APRA, Privacy Act, EU AI Act, or internal AI governance requirements
  • Prerequisites: EAAPL-OBS001 (AI Telemetry Architecture) must be in place for log ingestion

When NOT to Apply

  • Systems using only static, fixed prompts with no variable content (extremely rare in practice)
  • Internal developer tools where all users are trusted and PII exposure risk is accepted
  • Proof-of-concept systems with < 30-day planned lifespan

Prerequisites

| Prerequisite | Required | Notes |
| --- | --- | --- |
| EAAPL-OBS001 AI Telemetry Infrastructure | Required | Log ingestion pipeline and structured log schema required |
| Prompt template versioning system | Required | Templates must be version-tagged before monitoring is meaningful |
| Statistical analysis runtime (Python/Spark) | Required | Drift detection requires statistical compute |
| PII detection library | Required | Presidio, Amazon Comprehend, or equivalent |
| Secrets management | Required | Keys must not appear in prompt logs |

Industry Applicability

| Industry | Applicability | Primary Driver |
| --- | --- | --- |
| Financial Services | Critical | Regulatory audit, PII in prompts, version control for disputes |
| Healthcare | Critical | PHI in prompts is a potential HIPAA/Privacy Act violation |
| Legal Services | High | Privilege leak in prompts, version accountability |
| Government | High | FOI obligations, prompt injection as attack vector |
| Retail / E-Commerce | Medium | Cost anomaly detection, personalisation prompt quality |
| Technology / SaaS | High | Multi-tenant PII separation, A/B prompt testing |

4. Architecture Overview

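The Prompt Monitoring Architecture operates as an analytical overlay on the AI telemetry stream established by EAAPL-OBS001. Drift detection, anomaly detection, cost analysis, and performance comparison do not sit in the critical path of AI request processing; they run asynchronously on telemetry data to avoid adding latency to inference calls. The two exceptions are the PII scanner and the injection pattern matcher, which run synchronously in the AI Client Wrapper before a prompt leaves the application boundary.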

Prompt Sanitisation and Metadata Capture

At the instrumentation layer, the AI Client Wrapper (from EAAPL-OBS001) captures prompt metadata before the prompt is sent to the model. The wrapper computes a SHA-256 hash of the prompt template (without variable content) to identify which template version generated the prompt. It records the template identifier, template version, and the token count of the assembled prompt. Raw prompt content is NOT logged by default. If prompt content logging is approved (for regulated audit purposes), the PII scanner runs synchronously before logging, replacing detected PII with category tokens (e.g., [PERSON_NAME], [CREDIT_CARD]).
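
The metadata capture step can be sketched as follows. This is a minimal illustration, not the EAAPL-OBS001 wrapper itself: the `PromptMetadata` class, the `capture_metadata` function, and the whitespace-based token counter are hypothetical stand-ins, and a real wrapper would use the model's own tokenizer. The field names follow the log record described in this pattern.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class PromptMetadata:
    # Hypothetical record type; field names mirror the pattern's log record.
    templateId: str
    templateVersion: str
    promptHash: str   # SHA-256 of the template text only
    inputTokens: int  # token count of the assembled prompt
    timestamp: str


def whitespace_token_count(text: str) -> int:
    # Stand-in only: a real wrapper would use the model's own tokenizer.
    return len(text.split())


def capture_metadata(
    template_id: str,
    template_version: str,
    template_text: str,
    assembled_prompt: str,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> PromptMetadata:
    # Hash the template with placeholders unfilled, so every request built
    # from the same template version produces the same hash.
    prompt_hash = hashlib.sha256(template_text.encode("utf-8")).hexdigest()
    return PromptMetadata(
        templateId=template_id,
        templateVersion=template_version,
        promptHash=prompt_hash,
        inputTokens=count_tokens(assembled_prompt),
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
```

Because only the template is hashed and only a token count is taken from the assembled prompt, the record carries no user-provided content.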

Prompt Version Tracking

Every prompt request record includes a promptTemplateId and promptTemplateVersion, enabling the system to track which template versions are active in production at any point. A prompt version registry service maintains the authoritative mapping of templateId+version to the actual template text (stored securely, not in the telemetry stream). The registry exposes APIs used by dashboards to show: current active versions by environment, traffic distribution across versions in A/B tests, and deployment history.
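
As a rough sketch of the traffic-distribution view, the function below computes each version's share of requests for one template from a batch of prompt records. The record shape (plain dictionaries keyed by `promptTemplateId` and `promptTemplateVersion`) and the function name are assumptions made for illustration; a real dashboard would more likely run an equivalent aggregation in the log backend.

```python
from collections import Counter


def version_traffic_share(records, template_id):
    """Return {version: fraction of requests} for one template."""
    counts = Counter(
        r["promptTemplateVersion"]
        for r in records
        if r["promptTemplateId"] == template_id
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {version: n / total for version, n in counts.items()}


records = [
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.0.0"},
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.0.0"},
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.0.0"},
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.1.0"},
    {"promptTemplateId": "faq-answer", "promptTemplateVersion": "2.0.0"},
]
shares = version_traffic_share(records, "support-summary")
```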

Drift Detection Engine

The drift detection engine runs as a scheduled batch job (every 15 minutes for high-volume systems, hourly for lower volume). For each prompt template, it computes statistical features over the rolling window of prompt instances: mean and standard deviation of input token counts, distribution of context length, vocabulary distribution of injected user content (if content logging approved), and template version mix. These features are compared to a reference baseline established from a rolling 7-day window prior to the analysis period. The Jensen-Shannon divergence between current and baseline distributions is computed for each feature. A divergence score exceeding configurable thresholds triggers a drift alert with the affected template ID, the diverging feature, and the magnitude of divergence.
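
The divergence comparison for a single feature (input token counts) can be sketched with the standard library alone. The bin edges, the 0.1 threshold, and the function names are illustrative assumptions; the pattern only states that thresholds are configurable. Base-2 logarithms are used so the score falls between 0 (identical distributions) and 1 (no overlap).

```python
import bisect
import math


def to_distribution(values, bin_edges):
    """Bucket raw values into bins and normalise to probabilities."""
    if not values:
        raise ValueError("need at least one observation")
    counts = [0] * (len(bin_edges) + 1)
    for v in values:
        counts[bisect.bisect_right(bin_edges, v)] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def token_count_drift(baseline, current, bin_edges, threshold=0.1):
    """Return (score, drifted) for one template's token counts."""
    score = js_divergence(
        to_distribution(baseline, bin_edges),
        to_distribution(current, bin_edges),
    )
    return score, score > threshold
```

In a production job the same comparison would run per feature and per template, with SciPy or PySpark replacing the hand-written functions at higher volumes.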

Anomaly Detection Engine

The anomaly detection engine processes the prompt metadata stream in near-real-time (1-minute micro-batches). It applies four detection rules:

  • Unusually long prompts: if the assembled prompt token count exceeds 3 standard deviations above the rolling mean for that template, the request is flagged.
  • PII detection: a synchronous PII scanner checks assembled prompts (or prompt hashes plus input-field metadata if full content logging is disabled) for PII patterns before they leave the application perimeter.
  • Prompt injection signatures: a pattern matcher scans for known injection phrases ("ignore previous instructions", "you are now", "act as", "disregard your system prompt", etc.) and for instruction-boundary overrides.
  • Suspicious structural patterns: prompts with unusual ratios of special characters, base64-encoded content, or role-alternation patterns that do not match the expected template structure.
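
Three of the four rules (length, injection signatures, structure) can be sketched with the standard library; PII detection is left to the scanner. The regular expressions, the 0.3 special-character ratio, and the 40-character base64 run length are illustrative assumptions rather than values from this pattern, and short phrases such as "act as" will produce false positives without the tiered review described under Trade-Off Analysis.

```python
import re
import statistics

INJECTION_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore\s+(?:all\s+)?previous\s+instructions",
        r"disregard\s+your\s+system\s+prompt",
        r"\byou\s+are\s+now\b",
        r"\bact\s+as\b",
    )
]
# A long unbroken run of base64 alphabet characters (assumed heuristic).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def is_length_outlier(token_count, recent_counts, sigmas=3.0):
    """True if token_count is more than `sigmas` std devs above the mean."""
    mean = statistics.fmean(recent_counts)
    stdev = statistics.pstdev(recent_counts)
    return stdev > 0 and token_count > mean + sigmas * stdev


def special_char_ratio(text):
    """Fraction of characters that are neither alphanumeric nor whitespace."""
    if not text:
        return 0.0
    special = sum(1 for ch in text if not ch.isalnum() and not ch.isspace())
    return special / len(text)


def evaluate(text, token_count, recent_counts, max_special_ratio=0.3):
    """Return the list of anomaly flags raised for one prompt."""
    flags = []
    if is_length_outlier(token_count, recent_counts):
        flags.append("long_prompt")
    if any(p.search(text) for p in INJECTION_PATTERNS):
        flags.append("injection_signature")
    if special_char_ratio(text) > max_special_ratio or BASE64_RUN.search(text):
        flags.append("suspicious_structure")
    return flags
```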

Cost Anomaly Detection

Prompt token counts are correlated with cost data from the cost telemetry stream. The cost anomaly engine computes a rolling 7-day baseline for average prompt token count per template. If the 1-hour rolling average for any template exceeds that baseline by more than 50%, a cost anomaly alert is triggered. This catches scenarios where a prompt template change or data pipeline malfunction causes prompts to grow unexpectedly, a common cause of sudden 2–5x cost spikes.
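
The threshold check itself is simple arithmetic. In the sketch below the function name and return shape are assumptions; the 50% limit is the one stated above.

```python
def cost_anomaly(baseline_avg_tokens, last_hour_avg_tokens, max_increase=0.50):
    """Return (is_anomaly, relative_increase) for one template."""
    if baseline_avg_tokens <= 0:
        raise ValueError("baseline must be positive")
    increase = (last_hour_avg_tokens - baseline_avg_tokens) / baseline_avg_tokens
    return increase > max_increase, increase


# 7-day baseline of 800 tokens; the last hour averaged 1,300 tokens.
flagged, increase = cost_anomaly(800, 1300)
```

With these numbers the relative increase is (1300 − 800) / 800 = 0.625, which exceeds the 0.50 limit and raises an alert; an hourly average of 1,100 tokens (0.375) would not.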

Prompt Performance Analytics

Quality metrics are tracked per prompt template version: success rate (non-error completion), user satisfaction signal (thumbs up/down, task completion if measurable), hallucination rate from EAAPL-OBS003, and latency. When a new template version is deployed, a statistical comparison is automatically initiated between the outgoing and incoming versions, using a Mann-Whitney U test for latency and a two-proportion z-test for success rate. If the incoming version is significantly worse on any metric (p < 0.05), a deployment gate recommendation is raised.
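
The success-rate half of the comparison can be sketched with the standard library. The test is written as one-sided (is the incoming version worse?), which is an interpretation of "significantly worse" rather than something this pattern specifies, and the function name and return shape are hypothetical. The latency comparison would typically use `scipy.stats.mannwhitneyu` and is not shown.

```python
import math
from statistics import NormalDist


def success_rate_regression(old_success, old_total, new_success, new_total, alpha=0.05):
    """One-sided two-proportion z-test: is the new version's success rate lower?"""
    p_old = old_success / old_total
    p_new = new_success / new_total
    pooled = (old_success + new_success) / (old_total + new_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / old_total + 1 / new_total))
    if se == 0:
        # Both versions succeeded (or failed) on every request.
        return {"z": 0.0, "p_value": 1.0, "gate": False}
    z = (p_new - p_old) / se
    p_value = NormalDist().cdf(z)  # probability of a z this low under H0
    return {"z": z, "p_value": p_value, "gate": p_value < alpha}


# Outgoing version: 950/1000 successes; incoming version: 900/1000.
result = success_rate_regression(950, 1000, 900, 1000)
```

A drop from 95% to 90% on 1,000 requests each gives z ≈ −4.2 and raises the gate; a drop from 95.0% to 94.8% on the same volumes does not.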


5. Architecture Diagram

flowchart TD
    subgraph Capture["Prompt Capture"]
        A[Assembled Prompt]
        B{PII Scanner}
        C[Prompt Metadata Logger]
    end
    subgraph Analysis["Analysis Layer"]
        D[(Log Backend)]
        E[Drift Detection Engine]
        F[Injection Pattern Matcher]
    end
    subgraph Governance["Governance"]
        G[Prompt Version Registry]
        H[Performance Comparator]
    end
    A --> B
    B -->|PII found| I[Redact + Alert]
    B -->|clean| C
    I --> C
    C --> F
    C --> D
    D --> E
    D --> G
    G --> H
    E --> J[Drift Alert]
    F --> J
    H --> K[Version Dashboard]
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f3e8ff,stroke:#a855f7
    style C fill:#f0fdf4,stroke:#22c55e
    style D fill:#fef9c3,stroke:#eab308
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#f0fdf4,stroke:#22c55e
    style G fill:#fef9c3,stroke:#eab308
    style H fill:#f0fdf4,stroke:#22c55e
    style I fill:#fee2e2,stroke:#ef4444
    style J fill:#fee2e2,stroke:#ef4444
    style K fill:#d1fae5,stroke:#10b981

6. Components

| Component | Type | Responsibility | Technology Options | Criticality |
| --- | --- | --- | --- | --- |
| PII Scanner | SDK Library | Scan prompt content for PII before logging; redact detected entities | Microsoft Presidio, Amazon Comprehend (DetectPiiEntities), Google DLP, spaCy NER | Critical |
| Injection Pattern Matcher | SDK Library | Detect prompt injection signatures in real-time | Rule-based regex + embeddings similarity scorer; custom model fine-tuned on injection examples | Critical |
| Prompt Metadata Logger | SDK Library | Capture templateId, version, token counts, hash; emit to OTel pipeline | Custom wrapper on AI Client Wrapper from EAAPL-OBS001 | Critical |
| Prompt Template Registry | Service | Authoritative version-to-template mapping; deployment history | Git-backed service with API; Backstage plugin; custom service on PostgreSQL | High |
| Drift Detection Engine | Batch Job | Statistical comparison of current vs baseline prompt distributions | Python (scipy, numpy); PySpark for high volume; scheduled on Airflow/Prefect | High |
| Anomaly Detection Engine | Stream Processor | Near-real-time token length and pattern anomaly detection | Flink, Spark Streaming, AWS Kinesis Analytics | High |
| Cost Anomaly Engine | Stream Processor | Correlate prompt token counts with cost; detect cost spikes | Joins prompt metadata with cost telemetry; threshold-based alerting | Medium |
| Performance Comparator | Batch Job | Statistical A/B comparison of prompt versions on quality/latency metrics | Python (scipy stats); automated on every version deployment | High |
| Prompt Analytics Dashboard | UI | Traffic by version, quality trends, anomaly history | Grafana, Datadog, custom React dashboard | Medium |
| Alert Router | Integration | Route alerts to on-call and governance channels | PagerDuty, OpsGenie, Slack, Microsoft Teams | High |

7. Data Flow

Primary Flow

| Step | Actor | Action | Output |
| --- | --- | --- | --- |
| 1 | AI Client Wrapper | Assembles prompt from template + context + user input | Assembled prompt, templateId, templateVersion |
| 2 | PII Scanner | Scans assembled prompt synchronously for PII entities | Clean prompt (PII replaced with category tokens) or PII alert + redacted prompt |
| 3 | Injection Pattern Matcher | Scans assembled prompt for injection signatures | Clean signal or injection alert with matched pattern |
| 4 | Prompt Metadata Logger | Records templateId, templateVersion, inputTokens, promptHash, timestamp to log record | Structured log record with prompt metadata (no raw content unless approved) |
| 5 | OTel Collector | Receives log record; applies attribute enrichment; forwards to log backend | Enriched log record in storage |
| 6 | Drift Detection Engine | Runs batch analysis on 15-minute window; computes JS divergence vs baseline | Drift score per template; alert if threshold exceeded |
| 7 | Anomaly Detection Engine | Processes micro-batch; evaluates token length and structural anomalies | Anomaly flags on flagged requests; counters incremented |
| 8 | Performance Comparator | On new version deployment: runs statistical comparison vs previous version | A/B test result with p-value and recommendation |
| 9 | Alert Router | Receives alert events; routes to appropriate channel by severity and type | Notifications to PagerDuty, Slack, governance channels |

Error Flow

| Error Scenario | Detection | Action | Recovery |
| --- | --- | --- | --- |
| PII scanner unavailable | Health check failure; scanner timeout | Block prompt from being sent to model API; raise P1 alert | Fail closed: no prompt processed without PII scan; restore scanner service |
| Injection pattern DB out of date | Pattern match rate drops to zero for known test patterns | Alert to security team | Update pattern library; hotfix deployment |
| Drift detection job fails | Job completion metric absent; Airflow failure alert | Alert to platform engineering; previous baseline retained | Investigate job logs; re-run manually |
| Template version registry unavailable | API timeout from prompt metadata logger | Log requests with templateVersion=UNKNOWN; continue processing | Registry restoration; backfill missing version attribution |
| Cost anomaly false positive spike | Alert volume exceeds 20/hour | Suppress and escalate to AI engineering for threshold review | Adjust thresholds; add per-template baseline recalibration |
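
The fail-closed behaviour for an unavailable PII scanner can be sketched as a thin gate in front of the model call. Everything named here is hypothetical: `scan` is assumed to return a redacted prompt plus a list of detected categories, `send` stands in for the model API call, and `alert` for the P1 alert hook.

```python
class PromptBlocked(Exception):
    """Raised when a prompt must not be sent to the model API."""


def send_with_pii_gate(prompt, scan, send, alert=lambda message: None):
    """Scan, redact, then send; refuse to send if the scan cannot run."""
    try:
        redacted_prompt, categories = scan(prompt)
    except Exception as exc:
        # Fail closed: any scanner failure or timeout blocks the request.
        alert("P1: PII scanner unavailable")
        raise PromptBlocked("PII scan did not complete") from exc
    if categories:
        alert(f"PII redacted before send: {sorted(categories)}")
    return send(redacted_prompt)
```

The design choice is that the only code path reaching `send` passes through a completed scan, so a scanner outage degrades availability rather than privacy.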

8. Security Considerations

Authentication: PII scanner and injection matcher services authenticate to the AI Client Wrapper via service-to-service mTLS. Prompt template registry access requires API key + role claim.

Authorisation: Access to prompt content logs (if enabled) requires data governance approval and is restricted to a named set of individuals. Bulk export requires CISO approval. Prompt analytics dashboards showing only aggregated metadata are accessible to AI engineering and product teams.

Secrets Management: Any model API keys or scanner API keys are stored in secrets manager, rotated quarterly. Scanner services running in-process with the AI Client Wrapper inherit the application's secret access; no additional secret scopes required.

Data Classification: Raw prompt content is classified as Confidential if it contains user-provided data. Prompt template text is classified as Internal. PII detected in prompts is classified as Sensitive — alert records are retained but the PII value is never stored, only the entity category and position.

Encryption: Prompt analytics data encrypted at rest (AES-256) and in transit (TLS 1.3). PII alert records stored in a high-security log store with additional access controls beyond the standard telemetry store.

Auditability: Every access to prompt content logs is itself audited. PII detection events are immutable and retained for the full regulatory retention period. Injection attempt logs are retained as security event records.

OWASP LLM Top 10 Coverage

| OWASP LLM Risk | Prompt Monitoring Control | Implementation |
| --- | --- | --- |
| LLM01 Prompt Injection | Injection pattern matcher; structural anomaly detection | Alert on injection signatures within 60 seconds of detection |
| LLM02 Insecure Output Handling | Output monitoring feeds back to prompt analysis | Correlate injection detection with unusual output patterns |
| LLM03 Training Data Poisoning | Input distribution drift monitoring | Detect when prompt inputs shift toward adversarial patterns |
| LLM04 Model Denial of Service | Abnormally long prompt detection | Alert on prompts exceeding 3 sigma token count; rate limit enforcement |
| LLM05 Supply Chain Vulnerabilities | Prompt template version tracking | Detect unexpected template changes not matching deployment records |
| LLM06 Sensitive Information Disclosure | PII scanner before prompt leaves application boundary | Block or redact PII in prompts before reaching third-party model API |
| LLM07 Insecure Plugin Design | Tool call context in prompt metadata | Monitor tool-call instructions injected via prompts |
| LLM08 Excessive Agency | Detect prompts attempting to expand model scope | Alert on role-override patterns; monitor for capability escalation instructions |
| LLM09 Overreliance | Prompt quality analytics; version regression detection | Surface quality regressions before they cause downstream overreliance |
| LLM10 Model Theft | Monitor for prompt patterns designed to extract system prompts | Alert on meta-prompt patterns ("tell me your instructions", "repeat after me") |

9. Governance Considerations

Responsible AI: Prompt monitoring provides the evidence base for responsible AI review processes. Governance teams can audit which prompt versions were active during a specific period, what PII exposure events occurred, and whether injection attempts were detected and blocked.

Model Risk Management: Material prompt changes constitute model risk events. The prompt version registry and performance comparator provide the documentation and evidence required for model risk sign-off on prompt deployments.

Human Approval: Deployment of new prompt template versions to production requires approval from AI engineering lead for changes affecting > 10% of traffic. Changes to system prompts require AI governance committee approval.

Policy: Prompt content logging policy must be documented, approved by legal and privacy, and reviewed annually. The default is no prompt content logging; any deviation requires explicit approval with defined retention limits and access controls.

Traceability: Every PII detection event is traceable from the alert record to the prompt request (via requestId), to the user session (via hashed userId), to the data source that introduced the PII into the prompt context. This chain supports Privacy Act investigation obligations.

Governance Artefacts

| Artefact | Owner | Frequency | Format |
| --- | --- | --- | --- |
| Prompt Version Registry | AI Engineering | Continuous (per deployment) | Version-controlled database with API |
| PII Exposure Incident Log | Privacy / Data Governance | Per incident | Immutable event store record |
| Injection Attempt Report | Security | Weekly | Automated report: count, patterns, severity |
| Prompt A/B Performance Report | AI Engineering | Per version deployment | Automated statistical comparison document |
| Drift Alert History | AI Platform | Monthly review | Dashboard export + trend analysis |
| Prompt Content Logging Authorisation | Legal / Privacy | Annual | Signed policy document |

10. Operational Considerations

Monitoring: The PII scanner and injection matcher are in the critical inference path (synchronous). Their latency and availability must be monitored as first-class SLOs. If the PII scanner fails, the system must fail closed (not continue without scanning).

Logging: Monitoring system operational logs are stored separately from the AI audit logs they monitor, to prevent circular dependencies and to allow independent access control.

Incident Response: PII-in-prompt incidents are treated as data breach candidates and immediately escalate to the privacy officer. Injection attack incidents escalate to the security operations centre. Drift alerts escalate to AI engineering.

Disaster Recovery: The PII scanner can run in degraded mode (regex-only, without the NER model) during an outage of the NER model service. This reduces detection accuracy but maintains baseline protection. The injection pattern matcher may fail open for availability (with an alert) only if the system is behind a WAF with injection rules.

Capacity Planning: The PII scanner adds synchronous latency to every request, so benchmarking is required to establish acceptable throughput. At 1,000 requests/second, the PII scanner must complete in < 10ms to avoid adding perceptible latency. Presidio with the spaCy small model achieves 2–5ms for typical prompt lengths.
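
A benchmarking harness for the latency budget might look like the sketch below. The function names are hypothetical, the percentile uses the nearest-rank method, and `scan` is whatever scanner callable is under test; results depend on hardware and prompt length, so no expected timings are given.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def benchmark_ms(scan, prompts):
    """Time one scan per prompt and return the latencies in milliseconds."""
    timings = []
    for prompt in prompts:
        start = time.perf_counter()
        scan(prompt)
        timings.append((time.perf_counter() - start) * 1000)
    return timings


def within_budget(scan, prompts, budget_ms=10.0):
    """True if the p99 scan latency is under the budget."""
    return percentile(benchmark_ms(scan, prompts), 99) < budget_ms
```

Running this against a representative sample of production-length prompts gives a p99 figure that can be compared directly with the scanner latency SLO.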

SLO Table

| SLO | Target | Measurement | Alert Threshold |
| --- | --- | --- | --- |
| PII scanner latency | < 10ms p99 | Instrumented scanner response time | > 20ms for 5 minutes |
| PII scanner availability | > 99.9% | Health check pass rate | < 99.5% for 5 minutes |
| Injection detection latency | < 5ms p99 | Pattern matcher response time | > 15ms for 5 minutes |
| Drift detection freshness | Runs within 20 minutes of schedule | Job completion timestamp | > 30 minutes behind schedule |
| Alert delivery (detection to notification) | < 5 minutes | Alert timestamp vs. detection timestamp | > 10 minutes |

Disaster Recovery Table

| Component | RTO | RPO | Recovery Approach |
| --- | --- | --- | --- |
| PII Scanner | 2 minutes (fail closed) | N/A (stateless) | Auto-restart; fallback to regex-only mode |
| Injection Matcher | 5 minutes (fail open with alert) | N/A (stateless) | Auto-restart; WAF rules as backup |
| Drift Detection Engine | 60 minutes | Last batch | Restart job; run catch-up analysis |
| Prompt Registry | 15 minutes | 1 hour | Database restore; requests continue with UNKNOWN version tag |
| Alert Router | 5 minutes | Near-zero | Active-active; SMS fallback if primary down |

11. Cost Considerations

Cost Drivers

| Driver | Description | Relative Cost |
| --- | --- | --- |
| PII scanner compute (synchronous) | NER model inference per request; scales linearly with request volume | High at large scale |
| Injection pattern matching | Regex fast; embedding similarity slower; regex is recommended for production | Low (regex) to High (embeddings) |
| Drift detection compute | Batch statistical computation; cost scales with data volume and feature count | Medium |
| Prompt analytics storage | Aggregated metadata (no content); much smaller than full log storage | Low |
| A/B comparison compute | Statistical tests on deployment events; infrequent | Low |

Scaling Risks: At very high request volumes (> 10K requests/second), synchronous PII scanning becomes a bottleneck. Mitigation: use a streaming architecture in which each prompt is held in a short (50ms) buffer while PII scanning runs asynchronously, and is forwarded to the model API only once its scan completes; fail closed if the scan has not completed when the buffer window expires.

Optimisations:

  • Use regex-first PII scanning (fast) with NER model as fallback for regex-unconfident cases
  • Cache injection pattern compilation (patterns are static; no runtime recompilation)
  • Aggregate drift detection metrics at collector before storage; store distribution summaries not raw token counts
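
The regex-first optimisation can be sketched as below. The two regular expressions, the `POSSIBLE_NAME` heuristic used to decide when regex alone is not confident, and the `ner_fallback` callable (assumed to return `(start, end, category)` spans, for example from a Presidio or spaCy wrapper) are all illustrative assumptions. The sketch also assumes spans do not overlap.

```python
import re

# Compiled once at import time; patterns are simplified for illustration.
REGEX_RULES = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CREDIT_CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}
# Two adjacent capitalised words: a cheap hint that a name may be present.
POSSIBLE_NAME = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


def regex_spans(text):
    """Return (start, end, category) for every regex match."""
    return [
        (m.start(), m.end(), category)
        for category, pattern in REGEX_RULES.items()
        for m in pattern.finditer(text)
    ]


def redact(text, spans):
    """Replace each span with its category token, working right to left."""
    for start, end, category in sorted(spans, reverse=True):
        text = text[:start] + f"[{category}]" + text[end:]
    return text


def scan(text, ner_fallback=None):
    """Regex pass first; call the slower NER fallback only when hinted."""
    spans = regex_spans(text)
    if ner_fallback is not None and POSSIBLE_NAME.search(text):
        spans += ner_fallback(text)
    return redact(text, spans), sorted({category for _, _, category in spans})
```

Calling `scan` without a fallback gives the regex-only degraded mode; passing a fallback restores NER coverage for the inputs that need it.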

Indicative Cost Range

| Scale | Requests/Day | Estimated Prompt Monitoring Cost/Month |
| --- | --- | --- |
| Small | 10,000 | $100–$300 |
| Medium | 500,000 | $800–$2,000 |
| Large | 5,000,000 | $3,000–$8,000 |
| Enterprise | 50,000,000+ | $15,000–$40,000 (with batched PII scanning) |

12. Trade-Off Analysis

Approach Comparison

| Approach | Pros | Cons | Best For |
| --- | --- | --- | --- |
| Synchronous PII scan + injection match in critical path | Fail-closed; guaranteed pre-delivery protection; no data escapes without scan | Adds latency (2–10ms); availability dependency | Regulated industries; customer-facing AI; any external model API |
| Asynchronous post-delivery analysis only | Zero latency impact; simpler architecture | PII already sent to model provider before detection; too late for injection blocking | Internal tools only; no external model API; low-risk use cases |
| Provider-side content filtering (e.g., Azure Content Safety, Bedrock Guardrails) | Managed service; no infrastructure overhead | Vendor lock-in; limited customisation; PII still traverses provider network; limited telemetry | Organisations without platform engineering capacity; greenfield deployments |

Architectural Tensions

| Tension | Description | Resolution |
| --- | --- | --- |
| Safety vs. Latency | Synchronous scanning adds latency to every request | Use fast regex-first scanning; NER only for regex-uncertain cases; < 10ms budget enforced by SLO |
| Privacy vs. Debuggability | Full prompt logging enables root-cause debugging but risks PII storage | Log prompt metadata only by default; content logging requires governance approval + PII scrubbing |
| Sensitivity vs. False Positives | Aggressive injection detection triggers false positives on legitimate complex prompts | Tiered detection: regex-flagged prompts reviewed by NLP classifier before alerting |
| Completeness vs. Cost | Monitoring every prompt provides full coverage but scales cost | Run lightweight anomaly detection on 100% of traffic; run full analysis only on the flagged subset |

13. Failure Modes

| Failure | Likelihood | Impact | Detection | Recovery |
| --- | --- | --- | --- | --- |
| PII scanner false negative (misses PII) | Medium | Critical (data breach) | Regular audit with labelled PII test set | Improve scanner; notify privacy officer of exposure risk |
| Injection attack evades pattern matcher | Medium | High (prompt manipulation) | Output monitoring; user reports | Update pattern library; add embedding-based detection |
| Drift detection baseline staleness | Medium | Medium (false drift alerts) | Alert volume spike; all templates flagged simultaneously | Recalibrate baseline; add seasonal adjustment |
| Template registry unavailable at deployment | Low | Medium (version attribution lost) | Deployment pipeline health check | Queue version registration; backfill when registry recovers |
| PII scanner causes 50ms+ latency spikes | Medium | High (user experience degradation) | p99 latency alert; scanner latency SLO breach | Switch to regex-only mode; alert platform engineering |
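
The labelled-test-set audit for scanner false negatives can be sketched as a recall calculation. The case format (text paired with the set of categories it contains) and the assumption that `scan` returns the detected categories for a text are hypothetical; the test set itself must be synthetic so that the audit does not store real PII.

```python
def scanner_recall(labelled_cases, scan):
    """Return (recall, misses) for a scanner over labelled test cases."""
    expected_total = 0
    found_total = 0
    misses = []
    for text, expected in labelled_cases:
        detected = set(scan(text))
        expected_total += len(expected)
        found_total += len(expected & detected)
        missing = expected - detected
        if missing:
            misses.append((text, sorted(missing)))
    recall = found_total / expected_total if expected_total else 1.0
    return recall, misses


# Toy scanner that only recognises email addresses.
def toy_scan(text):
    return ["EMAIL_ADDRESS"] if "@" in text else []


cases = [
    ("a@b.com", {"EMAIL_ADDRESS"}),
    ("Jane Citizen", {"PERSON_NAME"}),
]
recall, misses = scanner_recall(cases, toy_scan)
```

Tracking recall per category over time, and alerting when it drops below an agreed floor, turns the periodic audit into a regression check.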

Cascading Scenarios

  • Scenario 1: PII scanner disabled for maintenance → PII reaches external model API → Provider logs PII → Privacy Act breach notification required. Mitigation: no maintenance window without fail-closed alternative; scanner redundancy mandatory.
  • Scenario 2: Injection attack evades detection → System prompt exfiltrated → Attacker crafts targeted follow-up attacks → Escalating security incident. Mitigation: monitor output for system prompt content; WAF rules as secondary control.

14. Regulatory Considerations

| Regulation | Clause | Requirement | Prompt Monitoring Implementation |
| --- | --- | --- | --- |
| Privacy Act 1988 (AU) | APP 11.1 (Security) | Reasonable steps must be taken to protect personal information from misuse, interference, loss, and unauthorised access, modification, or disclosure | PII scanner prevents PII reaching external model APIs; detection events logged |
| Privacy Act 1988 (AU) | APP 11.2 (Destruction) | Personal information that is no longer needed must be destroyed or de-identified | Prompt metadata retained without PII content; destruction schedule enforced |
| APRA CPS 234 | Paras 23–26 (Incident management); para 35 (APRA notification) | Security incidents (injection attacks) detected and reported within defined timeframes | Injection alerts within 60s; escalation to SOC per incident management runbook |
| EU AI Act | Article 12 (Record-keeping) | High-risk AI systems must support automatic recording of events (logs) so that their operation is traceable | promptTemplateId + templateVersion + requestId provides traceable record |
| EU AI Act | Article 9(2)(a) (Risk Management) | Identify and analyse known and reasonably foreseeable risks of high-risk AI systems | Prompt injection classified as known risk; detection and response procedure documented |
| ISO/IEC 42001 | Clause 6.1.2 (AI Risk Assessment) | Risks from AI inputs must be assessed and treated | Prompt injection and PII risk documented; controls (scanner, matcher) implemented |
| NIST AI RMF | GOVERN 4.2, MAP 1.5 | Document and monitor AI-specific risks including adversarial inputs | Prompt monitoring directly addresses adversarial input risk mapping requirement |

15. Reference Implementations

AWS

  • PII Scanner: Amazon Comprehend DetectPiiEntities API (synchronous call; as a remote API its round-trip latency should be benchmarked against the 10ms scanner budget)
  • Injection Matcher: Custom Lambda with regex + Amazon Bedrock Guardrails prompt attack detection
  • Drift Detection: AWS Glue job with PySpark; scheduled via EventBridge
  • Prompt Registry: DynamoDB table with version history; API Gateway + Lambda
  • Analytics: CloudWatch Logs Insights; Amazon QuickSight dashboards
  • Alerts: CloudWatch Alarms → SNS → PagerDuty

Azure

  • PII Scanner: Azure AI Language PII Detection (Synchronous REST call)
  • Injection Matcher: Azure AI Content Safety Prompt Shields (detects direct and indirect injection)
  • Drift Detection: Azure Databricks job; scheduled via Azure Data Factory
  • Prompt Registry: Azure Cosmos DB with change feed; Azure API Management
  • Analytics: Azure Monitor Logs; Power BI dashboards
  • Alerts: Azure Monitor Alerts → Action Groups → Teams / PagerDuty

GCP

  • PII Scanner: Google Cloud DLP (Data Loss Prevention API) with synchronous content inspection
  • Injection Matcher: Custom Cloud Function with Vertex AI Safety filters
  • Drift Detection: BigQuery scheduled queries; Dataflow streaming job
  • Prompt Registry: Firestore with version history; Cloud Endpoints
  • Analytics: Looker dashboards; BigQuery for ad-hoc analysis
  • Alerts: Cloud Monitoring Alerting → PagerDuty / Cloud Pub/Sub

On-Premises

  • PII Scanner: Microsoft Presidio (open source, Python); deploy as sidecar service
  • Injection Matcher: Custom rule engine with OWASP injection signature library
  • Drift Detection: Apache Spark on Hadoop/Kubernetes; Airflow scheduling
  • Prompt Registry: PostgreSQL with versioning schema; FastAPI service
  • Analytics: Grafana dashboards against ClickHouse analytics store
  • Alerts: Alertmanager → PagerDuty / Opsgenie / Email

16. Related Patterns

| Pattern ID | Pattern Name | Relationship | Notes |
| --- | --- | --- | --- |
| EAAPL-OBS001 | AI Telemetry Architecture | Foundation | Provides log ingestion pipeline; structured log schema required |
| EAAPL-OBS003 | Hallucination Detection | Sibling | Both are quality monitoring layers; hallucination detection uses output; this uses input |
| EAAPL-OBS004 | AI Incident Management | Depends On | Injection attacks and PII events feed into incident management lifecycle |
| EAAPL-OBS006 | AI Cost Observability | Sibling | Cost anomaly detection here (prompt token spikes); broader cost attribution in OBS006 |
| EAAPL-OBS007 | Distributed AI Tracing | Extends | Trace context from OBS007 enables linking prompt anomalies to full request traces |

17. Maturity Assessment

Overall Maturity: Proven

| Dimension | Score (1–5) | Rationale |
| --- | --- | --- |
| Adoption Breadth | 3 | Adopted by security-conscious and regulated organisations; emerging in general market |
| Tooling Ecosystem | 4 | Presidio, Amazon Comprehend, Azure Content Safety are mature; injection detection tooling improving rapidly |
| Operational Runbook Coverage | 3 | PII incident runbooks well-defined; injection attack runbooks organisation-specific |
| Regulatory Evidence | 4 | Privacy Act and APRA audit findings confirm necessity; EU AI Act requirements emerging |
| Cost Predictability | 4 | Cost scales predictably with request volume; PII scanner cost is well-characterised |
| Team Skill Availability | 3 | NLP/NER skills required for custom scanner tuning; regex-only implementations accessible to all teams |

18. Revision History

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0.0 | 2026-06-12 | EAAPL Working Group | Initial publication |