EAAPL-OBS002 · Prompt Monitoring

📊 Observability & Monitoring · 🏭 Field-tested in AU

Pattern ID: EAAPL-OBS002
Status: Proven
Complexity: Medium
Tags: observability prompt-engineering alerting pii-handling medium-complexity
Version: 1.0.0
Last Reviewed: 2026-06-12

1. Executive Summary

Prompts sent to large language models in production are the primary control surface for AI system behaviour, yet most organisations have no systematic visibility into what prompts are actually being sent, how they change over time, and whether they carry sensitive data. Prompt engineering changes are often deployed without telemetry, creating a class of silent regressions that appear as customer complaints rather than metric alerts.

This pattern defines continuous monitoring of prompts sent to LLMs in production environments. It covers: drift detection to identify unintended prompt distribution changes; anomaly alerting for PII exposure, injection attempts, jailbreak signatures, and abnormal prompt lengths; cost anomaly detection tied to prompt length trends; sensitive data protection through pre-log PII scanning; prompt version tracking with traffic distribution and performance comparison; and prompt performance analytics correlating prompt versions to success rates and user satisfaction. Together, these capabilities give engineering teams the same observability over their prompts that they expect over their code.

Target Audience: CIO, CTO, AI Engineering Lead, Platform Engineering Lead
Time to Implement: 4–8 weeks

2. Problem Statement

Business Problem

Prompt engineering is the fastest-changing layer in most AI systems, yet it has no equivalent of git blame, deployment monitoring, or regression alerting. When a prompt change degrades user experience, the signal comes from user complaints or NPS drops — not from a dashboard alert within minutes of deployment. Organisations cannot demonstrate to regulators which prompt version was active when a disputed AI output was generated.

Technical Problem

Prompts are assembled dynamically from templates, retrieved context, and user inputs. The result is that no two prompts are identical, making traditional change detection (file diffing) inapplicable. Statistical monitoring is required to detect when the distribution of prompts has shifted beyond normal variance. Additionally, user inputs injected into prompts can carry PII or adversarial content that bypasses application-layer controls and reaches the model API — where it may be logged by the provider in violation of data agreements.

Symptoms

Prompt template changes deployed to production with no performance comparison
Prompt injection attacks detected only through customer complaints or model output review, not automated detection
Average prompt length increasing 30% over 3 months, driving cost increases, with no alert triggered
Data breach inquiry reveals customer PII was included in prompts sent to third-party model API
Different API gateway replicas running different prompt versions simultaneously with no visibility

Cost of Inaction

Silent prompt regressions persist for days to weeks, degrading user experience for all affected requests
PII in prompts sent to third-party providers may amount to unauthorised disclosure under the Privacy Act 1988 (APP 11) and trigger data breach obligations
Prompt injection attacks succeed silently, potentially exfiltrating context window data
Inability to demonstrate version-controlled AI behaviour to regulators may result in a CPS 234 audit finding

3. Context

When to Apply

Any production system using dynamic prompt templates with variable context injection
Systems where multiple prompt versions may be active simultaneously (A/B testing, staged rollouts)
Any system sending user-provided content as part of prompts to external model APIs
Organisations subject to APRA, Privacy Act, EU AI Act, or internal AI governance requirements
Prerequisites: EAAPL-OBS001 (AI Telemetry Architecture) must be in place for log ingestion

When NOT to Apply

Systems using only static, fixed prompts with no variable content (extremely rare in practice)
Internal developer tools where all users are trusted and PII exposure risk is accepted
Proof-of-concept systems with < 30-day planned lifespan

Prerequisites

Prerequisite	Required	Notes
EAAPL-OBS001 AI Telemetry Architecture	Required	Log ingestion pipeline and structured log schema required
Prompt template versioning system	Required	Templates must be version-tagged before monitoring is meaningful
Statistical analysis runtime (Python/Spark)	Required	Drift detection requires statistical compute
PII detection library	Required	Presidio, AWS Comprehend, or equivalent
Secrets management	Required	Keys must not appear in prompt logs

Industry Applicability

Industry	Applicability	Primary Driver
Financial Services	Critical	Regulatory audit, PII in prompts, version control for disputes
Healthcare	Critical	PHI in prompts sent to external providers risks a HIPAA or Privacy Act violation
Legal Services	High	Privilege leak in prompts, version accountability
Government	High	FOI obligations, prompt injection as attack vector
Retail / E-Commerce	Medium	Cost anomaly detection, personalisation prompt quality
Technology / SaaS	High	Multi-tenant PII separation, A/B prompt testing

4. Architecture Overview

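The Prompt Monitoring Architecture operates as an analytical overlay on the AI telemetry stream established by EAAPL-OBS001. Drift detection, anomaly detection, cost analysis, and performance analytics run asynchronously on telemetry data and add no latency to inference calls. The only components in the critical path of AI request processing are the PII scanner and the injection pattern matcher, which run synchronously before the prompt leaves the application.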

Prompt Sanitisation and Metadata Capture

At the instrumentation layer, the AI Client Wrapper (from EAAPL-OBS001) captures prompt metadata before the prompt is sent to the model. The wrapper computes a SHA-256 hash of the prompt template (without variable content) to identify which template version generated the prompt. It records the template identifier, template version, and the token count of the assembled prompt. Raw prompt content is NOT logged by default. If prompt content logging is approved (for regulated audit purposes), the PII scanner runs synchronously before logging, replacing detected PII with category tokens (e.g., [PERSON_NAME], [CREDIT_CARD]).
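The capture step described above can be sketched as follows. This is a minimal illustration, not the EAAPL-OBS001 wrapper itself: `PromptMetadata`, `hash_template`, `count_tokens`, and `capture_metadata` are hypothetical names, the template uses Python `str.format` placeholders as an assumption, and the whitespace token count stands in for the model's real tokenizer.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptMetadata:
    template_id: str
    template_version: str
    template_hash: str  # SHA-256 of the template text only, never of user content
    input_tokens: int   # token count of the assembled prompt


def hash_template(template_text: str) -> str:
    """Hash the template with its placeholders intact (no variable content)."""
    return hashlib.sha256(template_text.encode("utf-8")).hexdigest()


def count_tokens(text: str) -> int:
    """Crude whitespace count; a real wrapper would use the model's tokenizer."""
    return len(text.split())


def capture_metadata(template_id, template_version, template_text, variables):
    """Assemble the prompt, then record metadata only. Raw content is not kept."""
    assembled = template_text.format(**variables)
    return PromptMetadata(
        template_id=template_id,
        template_version=template_version,
        template_hash=hash_template(template_text),
        input_tokens=count_tokens(assembled),
    )
```

Because the hash covers only the template text, two requests with different user content map to the same template hash, which is what lets the telemetry stream group prompts by template version without storing their content.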

Prompt Version Tracking

Every prompt request record includes a promptTemplateId and promptTemplateVersion, enabling the system to track which template versions are active in production at any point. A prompt version registry service maintains the authoritative mapping of templateId+version to the actual template text (stored securely, not in the telemetry stream). The registry exposes APIs used by dashboards to show: current active versions by environment, traffic distribution across versions in A/B tests, and deployment history.
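As a rough illustration of the traffic-distribution view, the sketch below computes the share of requests served by each version of a template from a batch of log records. The record shape (plain dictionaries keyed by `promptTemplateId` and `promptTemplateVersion`) and the function name `traffic_distribution` are assumptions for the example, not the registry's actual API.

```python
from collections import Counter


def traffic_distribution(records):
    """Return {(templateId, version): share of that template's traffic}."""
    per_version = Counter(
        (r["promptTemplateId"], r["promptTemplateVersion"]) for r in records
    )
    per_template = Counter()
    for (template_id, _version), count in per_version.items():
        per_template[template_id] += count
    return {
        key: count / per_template[key[0]]
        for key, count in per_version.items()
    }


records = [
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.2.0"},
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.2.0"},
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.2.0"},
    {"promptTemplateId": "support-summary", "promptTemplateVersion": "1.3.0"},
]
shares = traffic_distribution(records)
```

A version that appears in this output but not in the deployment history is the kind of discrepancy the registry is meant to surface.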

Drift Detection Engine

The drift detection engine runs as a scheduled batch job (every 15 minutes for high-volume systems, hourly for lower volume). For each prompt template, it computes statistical features over the rolling window of prompt instances: mean and standard deviation of input token counts, distribution of context length, vocabulary distribution of injected user content (if content logging approved), and template version mix. These features are compared to a reference baseline established from a rolling 7-day window prior to the analysis period. The Jensen-Shannon divergence between current and baseline distributions is computed for each feature. A divergence score exceeding configurable thresholds triggers a drift alert with the affected template ID, the diverging feature, and the magnitude of divergence.
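The divergence computation can be sketched with the standard library alone. The example below bins token counts into a histogram and computes the Jensen-Shannon divergence (base-2 logarithm, so the score lies between 0 and 1) between the baseline and current windows. The bin edges and the function names are illustrative assumptions; a production job would more likely use numpy or scipy, as the Components table suggests.

```python
import bisect
import math


def histogram(values, bin_edges):
    """Normalised histogram; bin i covers [bin_edges[i], bin_edges[i+1]), last bin open-ended."""
    if not values:
        raise ValueError("cannot build a histogram from an empty window")
    counts = [0] * len(bin_edges)
    for v in values:
        idx = max(bisect.bisect_right(bin_edges, v) - 1, 0)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (log base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_score(baseline_values, current_values, bin_edges):
    """Divergence of the current window's token-count distribution from the baseline."""
    return js_divergence(
        histogram(baseline_values, bin_edges),
        histogram(current_values, bin_edges),
    )
```

A score of 0 means the two windows have identical binned distributions and 1 means they share no bins at all. The alert threshold is configurable per the pattern; any specific value should be calibrated against historical windows.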

Anomaly Detection Engine

The anomaly detection engine processes the prompt metadata stream in near-real-time (1-minute micro-batches). It applies four detection rules. First, unusually long prompts: if the assembled prompt token count exceeds 3 standard deviations above the rolling mean for that template, the request is flagged. Second, PII detection: the engine consumes the detections raised by the synchronous PII scanner, which checks assembled prompts before they leave the application perimeter; if full content logging is disabled, only the entity category, input-field metadata, and prompt hash are recorded. Third, prompt injection signatures: a pattern matcher scans for known injection phrases ("ignore previous instructions", "you are now", "act as", "disregard your system prompt", and similar) and for instruction-boundary overrides. Fourth, suspicious structural patterns: prompts with unusual ratios of special characters, base64-encoded content, or role-alternation patterns that do not match the expected template structure.
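Three of these rules (length, injection signatures, structural patterns) are simple enough to sketch directly. The pattern list, the 40-character base64 run length, and the 0.3 special-character ratio below are illustrative assumptions, not values from this pattern; a real deployment would maintain a much larger signature library and tune the thresholds per template.

```python
import re
import statistics

# Illustrative subset of injection phrases; a real library would be far larger.
INJECTION_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all )?previous instructions",
        r"disregard your system prompt",
        r"\byou are now\b",
    )
]
# Long unbroken run of base64-alphabet characters (assumed minimum: 40 chars).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def is_length_anomaly(token_count, history, sigmas=3.0):
    """Flag a prompt whose token count exceeds mean + sigmas * stdev of the rolling window."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    return stdev > 0 and token_count > mean + sigmas * stdev


def injection_matches(text):
    """Return the patterns that matched, so the alert can name the signature."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]


def special_char_ratio(text):
    if not text:
        return 0.0
    special = sum(1 for ch in text if not ch.isalnum() and not ch.isspace())
    return special / len(text)


def structural_flags(text, max_special_ratio=0.3):
    """Flag base64-like runs and unusually symbol-heavy content."""
    flags = []
    if BASE64_RUN.search(text):
        flags.append("base64_run")
    if special_char_ratio(text) > max_special_ratio:
        flags.append("special_char_ratio")
    return flags
```

Regex signatures like these produce false positives on legitimate text (for example, a user quoting an injection phrase), which is why the pattern pairs them with tiered review before alerting.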

Cost Anomaly Detection

Prompt token counts are correlated with cost data from the cost telemetry stream. The cost anomaly engine computes a rolling 7-day baseline for average prompt token count per template. If the 1-hour rolling average for any template increases by more than 50% above baseline, a cost anomaly alert is triggered. This catches scenarios where a prompt template change or data pipeline malfunction causes prompts to grow unexpectedly — a common cause of sudden 2–5x cost spikes.
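The threshold rule above reduces to a short comparison. In this sketch the function name and the list-based windows are assumptions; the 50% default mirrors the figure stated in the text.

```python
import statistics


def cost_anomaly(baseline_token_counts, last_hour_token_counts, max_increase=0.50):
    """True when the 1-hour average exceeds the 7-day baseline average by more than max_increase."""
    baseline_avg = statistics.fmean(baseline_token_counts)
    if baseline_avg <= 0:
        return False  # no meaningful baseline yet
    hourly_avg = statistics.fmean(last_hour_token_counts)
    relative_increase = (hourly_avg - baseline_avg) / baseline_avg
    return relative_increase > max_increase


baseline = [800, 820, 780, 800]   # average 800 tokens over the baseline window
spike = [1300, 1350, 1250, 1300]  # average 1300 tokens in the last hour
alert = cost_anomaly(baseline, spike)
```

Here the hourly average is 62.5% above baseline, so the rule fires. An hourly average of exactly 1,200 tokens (a 50% increase) would not, because the comparison is strict.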

Prompt Performance Analytics

Quality metrics are tracked per prompt template version: success rate (non-error completion), user satisfaction signal (thumbs up/down, task completion if measurable), hallucination rate from EAAPL-OBS003, and latency. When a new template version is deployed, a statistical comparison is automatically initiated between the outgoing and incoming versions using Mann-Whitney U test for latency and proportion z-test for success rate. If the incoming version is statistically significantly worse on any metric at p < 0.05, a deployment gate recommendation is raised.
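For the success-rate comparison, a pooled two-proportion z-test can be written with the standard library. The sketch below is one way to do it, under the assumption that the gate uses a one-sided test ("is the incoming version worse?"); the source does not specify one-sided or two-sided. The function name is hypothetical. The latency comparison would typically use `scipy.stats.mannwhitneyu`, which is not shown here.

```python
import math


def success_rate_regression_test(success_old, n_old, success_new, n_new):
    """Pooled two-proportion z-test.

    Returns (z, p_value), where p_value is one-sided for the alternative
    "the new version's success rate is lower than the old version's".
    """
    p_old = success_old / n_old
    p_new = success_new / n_new
    pooled = (success_old + success_new) / (n_old + n_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    if se == 0:
        return 0.0, 1.0  # both versions all-success or all-failure: no evidence
    z = (p_new - p_old) / se
    # Standard normal CDF at z: Phi(z) = 0.5 * erfc(-z / sqrt(2))
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_value


# Outgoing version: 950 of 1000 succeeded; incoming version: 900 of 1000.
z, p = success_rate_regression_test(950, 1000, 900, 1000)
raise_gate = p < 0.05
```

Because several metrics are tested per deployment, teams may want to adjust the significance level for multiple comparisons before treating any single p < 0.05 as a regression.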

5. Architecture Diagram

flowchart TD
    subgraph Capture["Prompt Capture"]
        A[Assembled Prompt]
        B{PII Scanner}
        C[Prompt Metadata Logger]
    end
    subgraph Analysis["Analysis Layer"]
        D[(Log Backend)]
        E[Drift Detection Engine]
        F[Injection Pattern Matcher]
    end
    subgraph Governance["Governance"]
        G[Prompt Version Registry]
        H[Performance Comparator]
    end
    A --> B
    B -->|PII found| I[Redact + Alert]
    B -->|clean| C
    I --> C
    C --> F
    C --> D
    D --> E
    D --> G
    G --> H
    E --> J[Drift Alert]
    F --> J
    H --> K[Version Dashboard]
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f3e8ff,stroke:#a855f7
    style C fill:#f0fdf4,stroke:#22c55e
    style D fill:#fef9c3,stroke:#eab308
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#f0fdf4,stroke:#22c55e
    style G fill:#fef9c3,stroke:#eab308
    style H fill:#f0fdf4,stroke:#22c55e
    style I fill:#fee2e2,stroke:#ef4444
    style J fill:#fee2e2,stroke:#ef4444
    style K fill:#d1fae5,stroke:#10b981

6. Components

Component	Type	Responsibility	Technology Options	Criticality
PII Scanner	SDK Library	Scan prompt content for PII before logging; redact detected entities	Microsoft Presidio, AWS Comprehend (DetectPiiEntities), Google DLP, spaCy NER	Critical
Injection Pattern Matcher	SDK Library	Detect prompt injection signatures in real-time	Rule-based regex + embeddings similarity scorer; custom model fine-tuned on injection examples	Critical
Prompt Metadata Logger	SDK Library	Capture templateId, version, token counts, hash; emit to OTel pipeline	Custom wrapper on AI Client Wrapper from EAAPL-OBS001	Critical
Prompt Template Registry	Service	Authoritative version-to-template mapping; deployment history	Git-backed service with API; Backstage plugin; custom service on PostgreSQL	High
Drift Detection Engine	Batch Job	Statistical comparison of current vs baseline prompt distributions	Python (scipy, numpy); PySpark for high volume; scheduled on Airflow/Prefect	High
Anomaly Detection Engine	Stream Processor	Near-real-time token length and pattern anomaly detection	Flink, Spark Streaming, AWS Kinesis Analytics	High
Cost Anomaly Engine	Stream Processor	Correlate prompt token counts with cost; detect cost spikes	Joins prompt metadata with cost telemetry; threshold-based alerting	Medium
Performance Comparator	Batch Job	Statistical A/B comparison of prompt versions on quality/latency metrics	Python (scipy stats); automated on every version deployment	High
Prompt Analytics Dashboard	UI	Traffic by version, quality trends, anomaly history	Grafana, Datadog, custom React dashboard	Medium
Alert Router	Integration	Route alerts to on-call and governance channels	PagerDuty, OpsGenie, Slack, Microsoft Teams	High

7. Data Flow

Primary Flow

Step	Actor	Action	Output
1	AI Client Wrapper	Assembles prompt from template + context + user input	Assembled prompt, templateId, templateVersion
2	PII Scanner	Scans assembled prompt synchronously for PII entities	Clean prompt (PII replaced with category tokens) or PII alert + redacted prompt
3	Injection Pattern Matcher	Scans assembled prompt for injection signatures	Clean signal or injection alert with matched pattern
4	Prompt Metadata Logger	Records templateId, templateVersion, inputTokens, promptHash, timestamp to log record	Structured log record with prompt metadata (no raw content unless approved)
5	OTel Collector	Receives log record; applies attribute enrichment; forwards to log backend	Enriched log record in storage
6	Drift Detection Engine	Runs batch analysis on 15-minute window; computes JS divergence vs baseline	Drift score per template; alert if threshold exceeded
7	Anomaly Detection Engine	Processes micro-batch; evaluates token length and structural anomalies	Anomaly flags on flagged requests; counters incremented
8	Performance Comparator	On new version deployment: runs statistical comparison vs previous version	A/B test result with p-value and recommendation
9	Alert Router	Receives alert events; routes to appropriate channel by severity and type	Notifications to PagerDuty, Slack, governance channels

Error Flow

Error Scenario	Detection	Action	Recovery
PII scanner unavailable	Health check failure; scanner timeout	Block prompt from being sent to model API; raise P1 alert	Fail closed: no prompt processed without PII scan; restore scanner service
Injection pattern DB out of date	Pattern match rate drops to zero for known test patterns	Alert to security team	Update pattern library; hotfix deployment
Drift detection job fails	Job completion metric absent; Airflow failure alert	Alert to platform engineering; previous baseline retained	Investigate job logs; re-run manually
Template version registry unavailable	API timeout from prompt metadata logger	Log requests with templateId=UNKNOWN; continue processing	Registry restoration; backfill missing version attribution
Cost anomaly false positive spike	Alert volume exceeds 20/hour	Suppress and escalate to AI engineering for threshold review	Adjust thresholds; add per-template baseline recalibration
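The fail-closed behaviour in the first row can be sketched as a thin wrapper around the send path. All names here (`ScannerUnavailable`, `PromptBlocked`, `send_with_fail_closed_scan`, and the injected `scan`, `send`, and `alert` callables) are hypothetical; the point is only that a scanner failure raises an alert and stops the request rather than letting the prompt through unscanned.

```python
class ScannerUnavailable(Exception):
    """Raised by the scanner client on timeout or failed health check."""


class PromptBlocked(Exception):
    """Raised to the caller when a prompt is withheld from the model API."""


def send_with_fail_closed_scan(prompt, scan, send, alert):
    """Scan, then send. If the scan cannot complete, alert and block (fail closed)."""
    try:
        redacted = scan(prompt)
    except ScannerUnavailable as exc:
        alert("P1", f"PII scanner unavailable: {exc}")
        raise PromptBlocked("prompt not sent: PII scan did not complete") from exc
    # Only the scanner's output is ever forwarded to the model API.
    return send(redacted)
```

The wrapper forwards the scanner's return value, never the original prompt, so a redaction cannot be bypassed by accident further down the call chain.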

8. Security Considerations

Authentication: PII scanner and injection matcher services authenticate to the AI Client Wrapper via service-to-service mTLS. Prompt template registry access requires API key + role claim.

Authorisation: Access to prompt content logs (if enabled) requires data governance approval and is restricted to a named set of individuals. Bulk export requires CISO approval. Prompt analytics dashboards showing only aggregated metadata are accessible to AI engineering and product teams.

Secrets Management: Any model API keys or scanner API keys are stored in secrets manager, rotated quarterly. Scanner services running in-process with the AI Client Wrapper inherit the application's secret access; no additional secret scopes required.

Data Classification: Raw prompt content is classified as Confidential if it contains user-provided data. Prompt template text is classified as Internal. PII detected in prompts is classified as Sensitive — alert records are retained but the PII value is never stored, only the entity category and position.

Encryption: Prompt analytics data encrypted at rest (AES-256) and in transit (TLS 1.3). PII alert records stored in a high-security log store with additional access controls beyond the standard telemetry store.

Auditability: Every access to prompt content logs is itself audited. PII detection events are immutable and retained for the full regulatory retention period. Injection attempt logs are retained as security event records.

OWASP LLM Top 10 Coverage (2023 list numbering)

OWASP LLM Risk	Prompt Monitoring Control	Implementation
LLM01 Prompt Injection	Injection pattern matcher; structural anomaly detection	Alert on injection signatures within 60 seconds of detection
LLM02 Insecure Output Handling	Output monitoring feeds back to prompt analysis	Correlate injection detection with unusual output patterns
LLM03 Training Data Poisoning	Input distribution drift monitoring	Detect when prompt inputs shift toward adversarial patterns
LLM04 Model Denial of Service	Abnormally long prompt detection	Alert on prompts exceeding 3 sigma token count; rate limit enforcement
LLM05 Supply Chain Vulnerabilities	Prompt template version tracking	Detect unexpected template changes not matching deployment records
LLM06 Sensitive Information Disclosure	PII scanner before prompt leaves application boundary	Block or redact PII in prompts before reaching third-party model API
LLM07 Insecure Plugin Design	Tool call context in prompt metadata	Monitor tool-call instructions injected via prompts
LLM08 Excessive Agency	Detect prompts attempting to expand model scope	Alert on role-override patterns; monitor for capability escalation instructions
LLM09 Overreliance	Prompt quality analytics; version regression detection	Surface quality regressions before they cause downstream overreliance
LLM10 Model Theft	Monitor for prompt patterns designed to extract system prompts	Alert on meta-prompt patterns (tell me your instructions, repeat after me)

9. Governance Considerations

Responsible AI: Prompt monitoring provides the evidence base for responsible AI review processes. Governance teams can audit which prompt versions were active during a specific period, what PII exposure events occurred, and whether injection attempts were detected and blocked.

Model Risk Management: Material prompt changes constitute model risk events. The prompt version registry and performance comparator provide the documentation and evidence required for model risk sign-off on prompt deployments.

Human Approval: Deployment of new prompt template versions to production requires approval from AI engineering lead for changes affecting > 10% of traffic. Changes to system prompts require AI governance committee approval.

Policy: Prompt content logging policy must be documented, approved by legal and privacy, and reviewed annually. The default is no prompt content logging; any deviation requires explicit approval with defined retention limits and access controls.

Traceability: Every PII detection event is traceable from the alert record to the prompt request (via requestId), to the user session (via hashed userId), to the data source that introduced the PII into the prompt context. This chain supports Privacy Act investigation obligations.

Governance Artefacts

Artefact	Owner	Frequency	Format
Prompt Version Registry	AI Engineering	Continuous (per deployment)	Version-controlled database with API
PII Exposure Incident Log	Privacy / Data Governance	Per incident	Immutable event store record
Injection Attempt Report	Security	Weekly	Automated report: count, patterns, severity
Prompt A/B Performance Report	AI Engineering	Per version deployment	Automated statistical comparison document
Drift Alert History	AI Platform	Monthly review	Dashboard export + trend analysis
Prompt Content Logging Authorisation	Legal / Privacy	Annual	Signed policy document

10. Operational Considerations

Monitoring: The PII scanner and injection matcher are in the critical inference path (synchronous). Their latency and availability must be monitored as first-class SLOs. If the PII scanner fails, the system must fail closed (not continue without scanning).

Logging: Monitoring system operational logs are stored separately from the AI audit logs they monitor, to prevent circular dependencies and to allow independent access control.

Incident Response: PII-in-prompt incidents are treated as data breach candidates and immediately escalate to the privacy officer. Injection attack incidents escalate to the security operations centre. Drift alerts escalate to AI engineering.

Disaster Recovery: PII scanner can run in degraded mode (regex-only, without NER model) during model service outage. This reduces detection accuracy but maintains baseline protection. Injection pattern matcher can fail open for availability (with alert) only if system is behind a WAF with injection rules.

Capacity Planning: The PII scanner adds synchronous latency, so benchmarking is required to establish acceptable throughput. At 1,000 requests/second, the PII scanner must complete in under 10ms to avoid adding perceptible latency. The working figure for Presidio with a small spaCy model is 2–5ms for typical prompt lengths; confirm it against your own hardware and prompt sizes.

SLO Table

SLO	Target	Measurement	Alert Threshold
PII scanner latency	< 10ms p99	Instrumented scanner response time	> 20ms for 5 minutes
PII scanner availability	> 99.9%	Health check pass rate	< 99.5% for 5 minutes
Injection detection latency	< 5ms p99	Pattern matcher response time	> 15ms for 5 minutes
Drift detection freshness	Runs within 20 minutes of schedule	Job completion timestamp	> 30 minutes behind schedule
Alert delivery from detection to notification	< 5 minutes	Alert timestamp vs. detection timestamp	> 10 minutes

Disaster Recovery Table

Component	RTO	RPO	Recovery Approach
PII Scanner	2 minutes (fail closed)	N/A (stateless)	Auto-restart; fallback to regex-only mode
Injection Matcher	5 minutes (fail open with alert)	N/A (stateless)	Auto-restart; WAF rules as backup
Drift Detection Engine	60 minutes	Last batch	Restart job; run catch-up analysis
Prompt Registry	15 minutes	1 hour	Database restore; requests continue with UNKNOWN version tag
Alert Router	5 minutes	Near-zero	Active-active; SMS fallback if primary down

11. Cost Considerations

Cost Drivers

Driver	Description	Relative Cost
PII scanner compute (synchronous)	NER model inference per request; scales linearly with request volume	High at large scale
Injection pattern matching	Regex fast; embedding similarity slower; regex is recommended for production	Low (regex) to High (embeddings)
Drift detection compute	Batch statistical computation; cost scales with data volume and feature count	Medium
Prompt analytics storage	Aggregated metadata (no content); much smaller than full log storage	Low
A/B comparison compute	Statistical tests on deployment events; infrequent	Low

Scaling Risks: At very high request volumes (> 10K requests/second), synchronous PII scanning becomes a bottleneck. Mitigation: move PII scanning into a streaming stage that holds each prompt in a short (50ms) buffer before forwarding it to the model API, and fail closed if the scan has not completed when the buffer expires.

Optimisations:

Use regex-first PII scanning (fast) with NER model as fallback for regex-unconfident cases
Cache injection pattern compilation (patterns are static; no runtime recompilation)
Aggregate drift detection metrics at collector before storage; store distribution summaries not raw token counts
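The regex-first optimisation can be sketched as follows. The two patterns, the `[EMAIL_ADDRESS]` token, and the use of a failed Luhn check as the "regex-unconfident" signal are assumptions for illustration; `ner_fallback` is a hypothetical callable standing in for a NER-based scanner such as Presidio.

```python
import re

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_regex_first(text, ner_fallback=None):
    """Redact confident regex hits; call the slower NER pass only for uncertain ones."""
    uncertain = False

    def replace_card(match):
        nonlocal uncertain
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return "[CREDIT_CARD]"
        uncertain = True  # card-shaped but fails checksum: let NER decide
        return match.group()

    text = EMAIL.sub("[EMAIL_ADDRESS]", text)
    text = CARD_LIKE.sub(replace_card, text)
    if uncertain and ner_fallback is not None:
        text = ner_fallback(text)
    return text
```

With this shape, most prompts pay only the regex cost, and the NER model runs on the minority of prompts where the regex pass found something it could not classify.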

Indicative Cost Range

Scale	Requests/Day	Estimated Prompt Monitoring Cost/Month
Small	10,000	$100–$300
Medium	500,000	$800–$2,000
Large	5,000,000	$3,000–$8,000
Enterprise	50,000,000+	$15,000–$40,000 (with batched PII scanning)

12. Trade-Off Analysis

Approach Comparison

Approach	Pros	Cons	Best For
Synchronous PII scan + injection match in critical path	Fail-closed; guaranteed pre-delivery protection; no data escapes without scan	Adds latency (2–10ms); availability dependency	Regulated industries; customer-facing AI; any external model API
Asynchronous post-delivery analysis only	Zero latency impact; simpler architecture	PII already sent to model provider before detection; too late for injection blocking	Internal tools only; no external model API; low-risk use cases
Provider-side content filtering (e.g., Azure Content Safety, Bedrock Guardrails)	Managed service; no infrastructure overhead	Vendor lock-in; limited customisation; PII still traverses provider network; limited telemetry	Organisations without platform engineering capacity; greenfield deployments

Architectural Tensions

Tension	Description	Resolution
Safety vs. Latency	Synchronous scanning adds latency to every request	Use fast regex-first scanning; NER only for regex-uncertain cases; <10ms budget enforced by SLO
Privacy vs. Debuggability	Full prompt logging enables root-cause debugging but risks PII storage	Log prompt metadata only by default; content logging requires governance approval + PII scrubbing
Sensitivity vs. False Positives	Aggressive injection detection triggers false positives on legitimate complex prompts	Tiered detection: regex-flagged prompts reviewed by NLP classifier before alerting
Completeness vs. Cost	Monitoring every prompt provides full coverage but scales cost	Run lightweight anomaly detection on 100% of prompts; run full analysis only on the flagged subset

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
PII scanner false negative (misses PII)	Medium	Critical (data breach)	Regular audit with labeled PII test set	Improve scanner; notify privacy officer of exposure risk
Injection attack evades pattern matcher	Medium	High (prompt manipulation)	Output monitoring; user reports	Update pattern library; add embedding-based detection
Drift detection baseline staleness	Medium	Medium (false drift alerts)	Alert volume spike; all templates flagged simultaneously	Recalibrate baseline; add seasonal adjustment
Template registry unavailable at deployment	Low	Medium (version attribution lost)	Deployment pipeline health check	Queue version registration; backfill when registry recovers
PII scanner causes 50ms+ latency spikes	Medium	High (user experience degradation)	p99 latency alert; scanner latency SLO breach	Switch to regex-only mode; alert platform engineering

Cascading Scenarios

Scenario 1: PII scanner disabled for maintenance → PII reaches external model API → Provider logs PII → Privacy Act breach notification required. Mitigation: no maintenance window without fail-closed alternative; scanner redundancy mandatory.
Scenario 2: Injection attack evades detection → System prompt exfiltrated → Attacker crafts targeted follow-up attacks → Escalating security incident. Mitigation: monitor output for system prompt content; WAF rules as secondary control.

14. Regulatory Considerations

Regulation	Clause	Requirement	Prompt Monitoring Implementation
Privacy Act 1988 (AU)	APP 11.1 (Security)	Reasonable steps must be taken to protect personal information from misuse, interference, loss, and unauthorised access, modification, or disclosure	PII scanner prevents PII reaching external model APIs; detection events logged
Privacy Act 1988 (AU)	APP 11.2 (Destruction)	PII no longer needed must be destroyed	Prompt metadata retained without PII content; destruction schedule enforced
APRA CPS 234	Paras 23–26 (Incident management); Para 35 (APRA notification)	Information security incidents (including injection attacks) must be detected and responded to in a timely manner, and notified to APRA within defined timeframes	Injection alerts within 60s; escalation to SOC per incident management runbook
EU AI Act	Article 12 (Record-keeping)	High-risk AI systems must automatically record events (logs) so that their operation is traceable	promptTemplateId + templateVersion + requestId provides traceable record
EU AI Act	Article 9(2)(a) (Risk Management)	Identify and analyse known and reasonably foreseeable risks of high-risk AI systems	Prompt injection classified as known risk; detection and response procedure documented
ISO/IEC 42001	Clause 6.1.2 (AI Risk Assessment)	Risks from AI inputs must be assessed and treated	Prompt injection and PII risk documented; controls (scanner, matcher) implemented
NIST AI RMF	GOVERN 4.2, MAP 1.5	Document and monitor AI-specific risks including adversarial inputs	Prompt monitoring directly addresses adversarial input risk mapping requirement

15. Reference Implementations

AWS

PII Scanner: Amazon Comprehend DetectPiiEntities API (synchronous call; round-trip latency includes a network hop, so benchmark it against the 10ms scanner budget)
Injection Matcher: Custom Lambda with regex + Amazon Bedrock Guardrails prompt attack detection
Drift Detection: AWS Glue job with PySpark; scheduled via EventBridge
Prompt Registry: DynamoDB table with version history; API Gateway + Lambda
Analytics: CloudWatch Logs Insights; Amazon QuickSight dashboards
Alerts: CloudWatch Alarms → SNS → PagerDuty

Azure

PII Scanner: Azure AI Language PII Detection (Synchronous REST call)
Injection Matcher: Azure Content Safety Prompt Shield (detects direct and indirect injection)
Drift Detection: Azure Databricks job; scheduled via Azure Data Factory
Prompt Registry: Azure Cosmos DB with change feed; Azure API Management
Analytics: Azure Monitor Logs; Power BI dashboards
Alerts: Azure Monitor Alerts → Action Groups → Teams / PagerDuty

GCP

PII Scanner: Google Cloud DLP (Data Loss Prevention API) with synchronous content inspection
Injection Matcher: Custom Cloud Function with Vertex AI Safety filters
Drift Detection: BigQuery scheduled queries; Dataflow streaming job
Prompt Registry: Firestore with version history; Cloud Endpoints
Analytics: Looker dashboards; BigQuery for ad-hoc analysis
Alerts: Cloud Monitoring Alerting → PagerDuty / Cloud Pub/Sub

On-Premises

PII Scanner: Microsoft Presidio (open source, Python); deploy as sidecar service
Injection Matcher: Custom rule engine with OWASP injection signature library
Drift Detection: Apache Spark on Hadoop/Kubernetes; Airflow scheduling
Prompt Registry: PostgreSQL with versioning schema; FastAPI service
Analytics: Grafana dashboards against ClickHouse analytics store
Alerts: Alertmanager → PagerDuty / Opsgenie / Email

16. Related Patterns

Pattern ID	Pattern Name	Relationship	Notes
EAAPL-OBS001	AI Telemetry Architecture	Foundation	Provides log ingestion pipeline; structured log schema required
EAAPL-OBS003	Hallucination Detection	Sibling	Both are quality monitoring layers; hallucination detection uses output; this uses input
EAAPL-OBS004	AI Incident Management	Depends On	Injection attacks and PII events feed into incident management lifecycle
EAAPL-OBS006	AI Cost Observability	Sibling	Cost anomaly detection here (prompt token spikes); broader cost attribution in OBS006
EAAPL-OBS007	Distributed AI Tracing	Extends	Trace context from OBS007 enables linking prompt anomalies to full request traces

17. Maturity Assessment

Overall Maturity: Proven

Dimension	Score (1–5)	Rationale
Adoption Breadth	3	Adopted by security-conscious and regulated organisations; emerging in general market
Tooling Ecosystem	4	Presidio, AWS Comprehend, Azure Content Safety are mature; injection detection tooling improving rapidly
Operational Runbook Coverage	3	PII incident runbooks well-defined; injection attack runbooks organisation-specific
Regulatory Evidence	4	Privacy Act and APRA audit findings confirm necessity; EU AI Act requirements emerging
Cost Predictability	4	Cost scales predictably with request volume; PII scanner cost is well-characterised
Team Skill Availability	3	NLP/NER skills required for custom scanner tuning; regex-only implementations accessible to all teams

18. Revision History

Version	Date	Author	Changes
1.0.0	2026-06-12	EAAPL Working Group	Initial publication
