[EAAPL-AGT002] Stateful Agent Memory

Category: Agentic AI
Sub-category: Memory Architecture
Version: 2.0
Maturity: Proven
Tags: memory, episodic-memory, semantic-memory, vector-store, memory-consolidation, privacy, context-window
Regulatory Relevance: EU AI Act (Art. 10, 12), Privacy Act 1988, GDPR Art. 17, ISO 42001 §8.4, APRA CPS 234

1. Executive Summary

The Stateful Agent Memory Pattern defines the multi-tier memory architecture that allows AI agents to retain, recall, and apply knowledge across task invocations. Without persistent memory, agents are amnesia machines — each task starts from zero, ignoring valuable context accumulated from prior interactions and producing generic, contextually shallow outputs.

For CIO/CTO audiences: this pattern is the difference between an AI agent that improves over time and one that stays static. It enables agents to remember customer preferences, learn from past task outcomes, and build an evolving understanding of your organisation's domain — delivering compounding value rather than flat productivity gains. This is the architectural foundation for genuinely intelligent automation.

The pattern addresses four distinct memory types: in-context (what the agent knows right now), episodic (what happened in past sessions), semantic (domain knowledge crystallised from experience), and procedural (learned skills for specific task patterns). It governs how memories are created, consolidated, retrieved, and deleted — including the privacy-critical capability to purge individual memories on subject access requests, a hard requirement under GDPR and the Australian Privacy Act. Correct implementation delivers up to 35% improvement in task quality scores as agents build domain-specific knowledge while maintaining compliance with data residency and retention obligations.

2. Problem Statement

Business Problem

Organisations deploy AI agents that handle customer interactions, process documents, or manage workflows, but each invocation is stateless. An agent cannot remember that a specific customer dislikes a particular communication format, that a specific document type contains a recurring error pattern, or that a prior task failed because a tool returned stale data. This statelessness produces generic outputs and forces humans to repeat context on every interaction.

Technical Problem

LLM context windows are bounded (typically 8K–200K tokens, with some models offering more) and ephemeral. An agent cannot hold the full history of all past interactions in its context for each new task. Foundation models have no native mechanism for persistent, structured, queryable memory. Ad-hoc solutions (appending all history to every prompt) fail at scale: context becomes too large, costs explode, and the model's attention degrades over long contexts.

Symptoms of Absence

Agents repeat the same mistakes across tasks without correcting them
Users must re-explain context on every new interaction
Agent outputs show no improvement over time despite large interaction volumes
Memory-related PII is retained indefinitely with no purge mechanism, creating GDPR/Privacy Act exposure
High token costs because full conversation history is injected indiscriminately

Cost of Inaction

Quality: Task quality remains static; no compounding improvement
Cost: Full-history injection at scale consumes 3–10× more tokens than selective retrieval
Risk: Uncontrolled memory retention creates undiscovered PII liability; right-to-erasure requests cannot be honoured
Competitive: Agents without memory cannot personalise; competitors with memory deliver measurably better outcomes

3. Context

When to Apply

Agents serving repeat users or recurring task types where prior context improves quality
Agents that benefit from domain-specific learning accumulated over many task executions
Production environments where context window costs at scale require optimisation
Any agent handling personal data where right-to-erasure compliance is required
Agents where task quality must demonstrably improve over time (subject to monitoring)

When NOT to Apply

Single-use or one-shot agents with no recurring interaction pattern
Tasks where using prior context creates unfair bias (e.g., re-using a prior unfavourable risk assessment on a new application)
Environments where all data must be stateless by design (some secure/classified contexts)
Agent tasks where every invocation must be fully independent for audit reasons

Prerequisites

EAAPL-AGT001 (Single Agent Pattern) baseline implemented
Vector store infrastructure provisioned and accessible from agent runtime
PII detection service available (for memory ingestion filtering)
Data classification and retention policy defined for memory content
Subject access request (SAR) and right-to-erasure (RTE) operational process defined

Industry Applicability

Industry	Use Case	Memory Type Priority	Risk Level
Financial Services	Relationship manager AI — recalls client preferences, prior advice context	Episodic + Semantic	High
Healthcare	Clinical AI — recalls patient-specific notes, medication history	Episodic + Semantic	Very High
Retail / E-commerce	Customer service AI — recalls purchase history, preferences	Episodic	Medium
Legal Services	Matter AI — recalls case-specific precedents, client instructions	Episodic + Procedural	High
HR / People Analytics	HR AI — recalls employee lifecycle context	Episodic	Very High
Software Engineering	Code review / generation AI — recalls codebase conventions, past review feedback	Semantic + Procedural	Medium

4. Architecture Overview

The Stateful Agent Memory Pattern organises agent memory into four tiers, each with different temporal scope, retrieval semantics, and storage technology. The tiers are layered: in-context memory is fastest but smallest; procedural memory is slowest to change but most reusable.

Why four tiers rather than a single store? Human cognitive science informs this design. Different types of knowledge have different update frequencies, retrieval patterns, and lifetime requirements. Forcing all memory into a single vector store creates retrieval conflicts (recent episodic memories crowd out stable semantic knowledge), cost inefficiency (re-embedding unchanged procedural knowledge), and privacy complexity (mixing PII-bearing episodic memories with PII-free domain knowledge complicates erasure). Using four distinct tiers, each with its own schema, retention policy, and retrieval strategy, is not architecture astronautics; it is an engineering response to real retrieval quality problems observed at scale.

Tier 1: In-Context Memory (Working Memory)

The current context window is the agent's working memory: everything it knows for this iteration of this task. The Context Window Manager is responsible for actively managing what occupies this finite space. It implements a sliding window for conversation history (retaining the N most recent exchanges), importance-weighted retention (important tool results are kept longer than routine ones), and a hard token budget. The in-context memory is ephemeral; it does not persist beyond the current task invocation.
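The three mechanisms named for the Context Window Manager (sliding window, importance-weighted retention, hard token budget) can be sketched as follows. This is a minimal illustration, not the pattern's reference implementation: `ContextItem`, `select_context`, the 0–1 importance scale, and the `recent_turns` parameter are hypothetical names, and token counts are assumed to come from whatever tokenizer the agent runtime uses.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    text: str
    tokens: int        # token count, assumed precomputed by the runtime's tokenizer
    importance: float  # hypothetical scale: 0.0 (routine) to 1.0 (critical)
    turn: int          # position in the conversation; higher means more recent


def select_context(items, token_budget, recent_turns=2):
    """Pick items for the context window without exceeding token_budget."""
    if not items:
        return []
    newest = max(i.turn for i in items)
    recent = [i for i in items if i.turn > newest - recent_turns]
    older = [i for i in items if i.turn <= newest - recent_turns]
    # Sliding window: the most recent turns are considered first, newest first.
    # Importance-weighted retention: older items compete on importance.
    candidates = (sorted(recent, key=lambda i: -i.turn)
                  + sorted(older, key=lambda i: -i.importance))
    kept, used = [], 0
    for item in candidates:
        if used + item.tokens <= token_budget:  # hard token budget
            kept.append(item)
            used += item.tokens
    return sorted(kept, key=lambda i: i.turn)  # restore chronological order
```

In this sketch an old but important tool result can outlive a more recent routine exchange once the budget is tight, which is the behaviour the tier description calls for.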

Tier 2: Episodic Memory (Autobiographical Store)

Episodic memory stores the record of what happened in past task invocations: a structured log of conversations, decisions made, outcomes observed, and feedback received. Each episodic record is timestamped, tagged with the task type, associated with an entity identifier (user ID, document ID, account ID), and stored as both a structured record (for exact lookups) and an embedding (for semantic similarity retrieval). Retrieval is via semantic search against the current task's context: the top-K most similar past episodes are retrieved and injected into the context.

Episodic memory is the most privacy-sensitive tier. It can contain PII, personal preferences, and sensitive interaction history. It must be linked to the data subject identifier and must support targeted purge operations. Retention policies must be defined per data sensitivity class.
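A minimal in-memory sketch of the episodic tier may help make the record shape and the two key operations (entity-scoped top-K retrieval and per-subject purge) concrete. `EpisodicRecord`, `EpisodicStore`, and their method names are hypothetical; a production deployment would back this with one of the stores listed under Components, and the embeddings here are toy two-dimensional vectors rather than real model output.

```python
import math
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EpisodicRecord:
    record_id: str
    entity_id: str      # data subject or other entity the episode is linked to
    task_type: str
    summary: str
    embedding: list     # embedding of the record summary
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EpisodicStore:
    def __init__(self):
        self._records = {}

    def write(self, record):
        self._records[record.record_id] = record

    def top_k(self, query_embedding, entity_id, k=3):
        """Semantic search restricted to one entity's episodes."""
        scoped = [r for r in self._records.values() if r.entity_id == entity_id]
        scoped.sort(key=lambda r: cosine(query_embedding, r.embedding), reverse=True)
        return scoped[:k]

    def purge_subject(self, entity_id):
        """Targeted purge: delete every record linked to the subject."""
        doomed = [rid for rid, r in self._records.items() if r.entity_id == entity_id]
        for rid in doomed:
            del self._records[rid]
        return len(doomed)
```

Filtering by `entity_id` before ranking, rather than after, keeps another subject's episodes out of the candidate set entirely.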

Tier 3: Semantic Memory (Knowledge Store)

Semantic memory stores domain knowledge that has been crystallised from episodic experience. Unlike episodic memories (specific events), semantic memories are generalisations: "this customer segment prefers formal communication," "this document type has a recurring pattern in Section 3.2," "this API call fails under these conditions." Semantic memories are created by the Memory Consolidation Engine, which periodically processes episodic memories and extracts stable patterns.

Semantic memory is typically PII-free (personal identifiers are stripped during consolidation), enabling it to be shared across agents and retained longer. Retrieval is via dense vector search, with results ranked by relevance score and recency weight.

Tier 4: Procedural Memory (Skill Library)

Procedural memory stores learned task execution strategies: sequences of tool calls and decision logic that have been validated as effective for specific task types. These are analogous to stored procedures: when the agent encounters a task that matches a known procedural template, it can retrieve the template and adapt it rather than planning from scratch. Procedural memories are versioned, have explicit success rate metadata (how often this procedure achieved the task goal), and are retired when their success rate drops below a threshold.
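The versioning, success-rate metadata, and retirement rule described for procedural memory could look roughly like this. The `Procedure` class, the `record_outcome` and `lookup` helpers, and the default thresholds (`min_rate=0.6`, `min_attempts=5`) are illustrative assumptions; the pattern itself does not prescribe values.

```python
from dataclasses import dataclass


@dataclass
class Procedure:
    task_type: str
    version: int
    steps: list          # ordered tool calls or decision steps
    successes: int = 0
    attempts: int = 0
    retired: bool = False

    @property
    def success_rate(self):
        return self.successes / self.attempts if self.attempts else 0.0


def record_outcome(proc, succeeded, min_rate=0.6, min_attempts=5):
    """Update success metadata and retire the procedure if it underperforms."""
    proc.attempts += 1
    proc.successes += int(succeeded)
    # Require a minimum sample so one early failure does not retire a new template.
    if proc.attempts >= min_attempts and proc.success_rate < min_rate:
        proc.retired = True
    return proc


def lookup(library, task_type):
    """Return the newest non-retired procedure for a task type, or None."""
    live = [p for p in library if p.task_type == task_type and not p.retired]
    return max(live, key=lambda p: p.version, default=None)
```

When `lookup` returns `None`, the agent falls back to planning from scratch, which is the default behaviour without procedural memory.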

Memory Consolidation

The Memory Consolidation Engine runs asynchronously after task completion. It processes the episodic record of the just-completed task, extracts semantic learnings, and updates the semantic and procedural stores. Consolidation is governed by a quality threshold: only memories that meet a minimum relevance and novelty score are written. Duplicate detection prevents redundant knowledge accumulation. The consolidation pipeline includes PII stripping before writing to semantic memory.
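The quality threshold and duplicate detection in consolidation can be illustrated with a simple gate. This sketch assumes novelty is defined as one minus the highest cosine similarity to any existing semantic memory; that definition, the function names, and the threshold defaults are assumptions for illustration. PII stripping and the LLM-based pattern extraction that produces the candidates are out of scope here.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def novelty(candidate_embedding, existing_embeddings):
    """Assumed definition: 1 minus the best similarity to anything already stored."""
    if not existing_embeddings:
        return 1.0
    return 1.0 - max(cosine(candidate_embedding, e) for e in existing_embeddings)


def consolidate(candidates, store, relevance_min=0.5, novelty_min=0.2):
    """Write candidates that clear both thresholds.

    candidates: list of (text, embedding, relevance) tuples
    store: list of (text, embedding) tuples, mutated in place
    """
    written = []
    for text, embedding, relevance in candidates:
        if relevance < relevance_min:
            continue  # below the quality threshold
        if novelty(embedding, [e for _, e in store]) < novelty_min:
            continue  # near-duplicate of existing knowledge
        store.append((text, embedding))
        written.append(text)
    return written
```

Because each accepted candidate is appended before the next one is checked, two near-identical candidates in the same batch also collapse to one write.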

Retrieval Orchestration

The Memory Retrieval Orchestrator executes retrieval across all relevant tiers in parallel at the start of each task. It ranks results from each tier, applies a cross-tier deduplication pass, and assembles the final memory injection within the context window budget. The orchestrator tracks retrieval quality metrics (retrieval precision proxied by subsequent task quality) to continuously tune retrieval hyperparameters (K, similarity threshold, recency weight).
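A rough sketch of the orchestrator's three steps (parallel retrieval, cross-tier deduplication, budgeted assembly) using `asyncio.gather`. The tier interface (an async callable returning `(text, score)` pairs), the text-normalisation dedupe key, and the whitespace token counter are all simplifying assumptions; a real deployment would likely dedupe on embeddings and count tokens with the model's tokenizer.

```python
import asyncio


async def retrieve_all(tiers, query, k=3):
    """Query every tier concurrently and merge the results.

    tiers: dict mapping tier name -> async callable(query, k) -> list of (text, score)
    """
    names = list(tiers)
    results = await asyncio.gather(
        *(tiers[name](query, k) for name in names), return_exceptions=True
    )
    merged = {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            continue  # a failed tier degrades the result instead of failing the task
        for text, score in result:
            key = text.strip().lower()  # naive cross-tier dedupe key (assumption)
            if key not in merged or score > merged[key][1]:
                merged[key] = (text, score, name)
    return sorted(merged.values(), key=lambda m: m[1], reverse=True)


def assemble(ranked, token_budget, count_tokens=lambda t: len(t.split())):
    """Greedily take the highest-ranked memories that fit the token budget."""
    chosen, used = [], 0
    for text, _score, _tier in ranked:
        cost = count_tokens(text)
        if used + cost <= token_budget:
            chosen.append(text)
            used += cost
    return chosen
```

Usage: `ranked = asyncio.run(retrieve_all({"episodic": episodic_search, "semantic": semantic_search}, query))`, where the two search callables are whatever async clients wrap the underlying stores.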

5. Architecture Diagram

flowchart TD
    subgraph Input["Task Input"]
        A[Task Request]
        B[Context Window Manager]
    end
    subgraph MemoryTiers["Memory Tiers"]
        C[(Episodic Store)]
        D[(Semantic Store)]
        E[(Procedural Library)]
    end
    subgraph Processing["Consolidation and Privacy"]
        F[Retrieval Orchestrator]
        G[Consolidation Engine]
        H[PII Detector]
    end
    A --> B
    B --> F
    F -->|semantic search| C
    F -->|vector search| D
    F -->|pattern lookup| E
    F -->|ranked context| B
    B -->|bounded context| A
    A -->|task result| G
    G --> H
    H -->|episodic write| C
    H -->|PII-free write| D
    G -->|skill pattern| E
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f0fdf4,stroke:#22c55e
    style C fill:#fef9c3,stroke:#eab308
    style D fill:#fef9c3,stroke:#eab308
    style E fill:#fef9c3,stroke:#eab308
    style F fill:#f0fdf4,stroke:#22c55e
    style G fill:#f0fdf4,stroke:#22c55e
    style H fill:#f3e8ff,stroke:#a855f7

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Context Window Manager	Orchestration	Manages token budget; implements sliding window and importance-weighted retention	Custom, LangChain memory manager, Mem0	Critical
Memory Retrieval Orchestrator	Orchestration	Parallel retrieval across tiers; ranking; deduplication; context assembly	Custom Python/TS, LlamaIndex retrieval pipeline	Critical
Episodic Store	Persistence (Structured + Vector)	Stores full interaction records with embeddings; supports exact and semantic retrieval; per-subject purge	PostgreSQL + pgvector, MongoDB Atlas Vector Search, Supabase	High
Semantic Store	Vector Store	Stores domain knowledge embeddings; dense vector search; long retention	Pinecone, Weaviate, Azure AI Search, Qdrant	High
Procedural Skill Library	Knowledge Base	Stores versioned task execution templates with success metadata	Redis JSON, PostgreSQL, object store	Medium
Memory Consolidation Engine	Background Processor	Extracts semantic learnings from episodic records; strips PII; writes to semantic store	Custom async worker, Celery, Azure Durable Functions	High
PII Detector / Masker	Privacy	Detects and masks PII before memory writes	AWS Comprehend, Azure AI Language, Microsoft Presidio	Critical
Retrieval Ranker	ML Component	Scores and ranks retrieved memories by relevance + recency; deduplicates	Custom scoring function, LLM-as-reranker, dedicated reranking model (e.g., Cohere Rerank)	High
Data Classification Enforcer	Security/Governance	Enforces classification-based access controls on memory read/write	OPA, custom policy middleware	High
Subject Access Request Handler	Privacy Compliance	Generates complete record of all memory held for a given data subject	Custom API, integrated with identity store	Critical
Right-to-Erasure Engine	Privacy Compliance	Purges all episodic records for a subject; cascades to derived semantic memories; logs erasure	Custom async process with cascading delete logic	Critical
Retention Policy Scheduler	Operations	Enforces time-based retention policies per memory classification	APScheduler, AWS EventBridge, Azure Logic Apps	High
Memory Audit Log	Compliance	Immutable log of all memory reads, writes, purges, and consolidations	Append-only store (CloudTrail, Kafka, S3 WORM)	Critical

7. Data Flow

Primary Write Flow (Memory Creation)

Step	Actor	Action	Output
1	Agent (Post-Task)	Submits completed task record: task_id, user_id, conversation_turns, tool_calls, outcome, feedback	Raw task record
2	PII Detector	Scans raw record; detects PII entities (names, IDs, financial data); applies masking rules per classification policy	PII-annotated record; masked episodic record
3	Episodic Store	Writes masked episodic record with full metadata; generates embedding of record summary	Episodic record ID; embedding written to vector index
4	Memory Audit Log	Logs write event: record_id, agent_id, user_id (hashed), classification, timestamp	Audit entry
5	Memory Consolidation Engine	Asynchronously processes episodic record; identifies stable patterns and novel knowledge	Candidate semantic memories; candidate procedural patterns
6	PII Detector (second pass)	Strips any residual PII from candidate semantic memories	PII-free semantic candidates
7	Duplicate Detector	Compares candidates against existing semantic store; calculates novelty score	Deduped candidates with novelty scores
8	Semantic Store	Writes new/updated semantic memories above novelty threshold	Semantic record IDs; updated vector index
9	Procedural Library	Writes new procedural templates with initial success rate metadata	Skill record with version

Primary Read Flow (Memory Retrieval)

Step	Actor	Action	Output
1	Memory Retrieval Orchestrator	Receives task context embedding + entity identifiers	Retrieval request
2	Episodic Store	Semantic search for top-K episodes by similarity to task context, filtered by entity ID	K episodic records
3	Semantic Store	Dense vector search for top-K domain knowledge chunks by similarity	K semantic records
4	Procedural Library	Pattern lookup by task type classification	Matching skill templates
5	Retrieval Ranker	Scores all retrieved items: relevance × recency × quality; deduplicates	Ranked, deduplicated memory list
6	Context Window Manager	Selects top items within token budget; formats for injection	Memory-enriched context block
7	Memory Audit Log	Logs retrieval event: items retrieved, agent_id, task_id, scores	Audit entry
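Step 5 of the read flow scores items as relevance × recency × quality. One way to express that is shown below, under the assumption that recency is an exponential decay with a configurable half-life; the source does not specify the recency function, so `recency_weight`, `half_life_days`, and the dictionary keys are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def recency_weight(created_at, now, half_life_days=30.0):
    """Exponential decay: the weight halves every half_life_days (assumed form)."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rank(memories, now, half_life_days=30.0):
    """Score each memory as relevance * recency * quality, highest first.

    memories: list of dicts with 'text', 'relevance', 'quality', 'created_at'
    """
    scored = []
    for memory in memories:
        score = (memory["relevance"]
                 * recency_weight(memory["created_at"], now, half_life_days)
                 * memory["quality"])
        scored.append({**memory, "score": score})
    return sorted(scored, key=lambda m: m["score"], reverse=True)
```

With a 30-day half-life, a 60-day-old memory with relevance 0.9 scores 0.225 and is outranked by a fresh memory with relevance 0.6, which is the intended effect of the recency term.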

Error Flow

Error Condition	Detection	Recovery
Vector store unavailable	Health check on retrieval orchestrator startup	Degrade to in-context memory only; alert; log degraded mode
PII detection service unavailable	Service health check before write	Block memory write; queue for retry; do not write unscanned memory
Erasure request fails partially	RTE engine transaction log	Retry failed deletes; flag record for manual review; report status to SAR handler
Consolidation writes duplicate	Duplicate detector	Idempotent upsert with cosine similarity gate; no duplicate written
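Two of the recovery behaviours in the error flow are worth showing in code because they pull in opposite directions: writes fail closed when PII detection is down, while reads degrade gracefully when the vector store is down. The sketch below uses hypothetical names throughout (`MemoryWriter`, `PIIServiceUnavailable`, `retrieve_or_degrade`), with a plain list standing in for the episodic store and `ConnectionError` standing in for whatever error the store client raises.

```python
from collections import deque


class PIIServiceUnavailable(Exception):
    """Raised by the scanner callable when the PII service cannot be reached."""


class MemoryWriter:
    def __init__(self, scan_pii, store):
        self.scan_pii = scan_pii   # callable(text) -> masked text
        self.store = store         # list standing in for the episodic store
        self.retry_queue = deque()

    def write(self, text):
        try:
            masked = self.scan_pii(text)
        except PIIServiceUnavailable:
            self.retry_queue.append(text)  # fail closed: never write unscanned memory
            return False
        self.store.append(masked)
        return True

    def drain(self):
        """Retry queued writes once; anything that fails again is re-queued."""
        for _ in range(len(self.retry_queue)):
            self.write(self.retry_queue.popleft())


def retrieve_or_degrade(search, query):
    """Return (memories, degraded). An unavailable store yields in-context-only mode."""
    try:
        return search(query), False
    except ConnectionError:
        return [], True
```

The caller is expected to alert and log when `degraded` is true, as the error flow table requires.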

8. Security Considerations

Data Classification

Episodic memories are classified at ingestion based on content scan; classification label governs retention period, access scope, and erasure priority
Semantic memories inherit the classification of their source episodes, downgraded if PII has been confirmed stripped
Cross-tenant memory isolation: episodic and semantic stores are partitioned by tenant_id, and every query must be filtered by tenant scope so that cross-tenant retrieval is prevented by design; retrieval audit and anomaly detection catch any failure of that filter

Encryption

Episodic store encrypted at rest with CMK; CMK is per-tenant for strong isolation
Semantic store encrypted at rest; embeddings are not human-readable, but they can leak information about their source text through proximity queries and inversion attacks, so store-level encryption is the minimum; consider privacy-preserving retrieval (for example, homomorphic encryption) for highest-sensitivity deployments
All memory tier connections use TLS 1.3; mTLS within service mesh

Auditability

Every memory read and write event is logged in the immutable audit log with: agent_id, task_id, user_id (hashed), record_id, operation, classification, timestamp
Erasure events produce a certificate: subject_id, erasure timestamp, records deleted count, derived semantic records affected, executor identity
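The audit fields listed above can be captured in an append-only log. The sketch below adds hash chaining so that tampering with an earlier entry is detectable; hash chaining is an illustrative technique chosen here, not something the pattern mandates (it lists managed append-only stores instead). The keyed hash for `user_id`, the `AuditLog` class, and the secret handling are likewise assumptions.

```python
import hashlib
import hmac
import json


def hash_user_id(user_id, secret):
    """Keyed hash so the raw identifier never appears in the log."""
    return hmac.new(secret, user_id.encode(), hashlib.sha256).hexdigest()


class AuditLog:
    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, event):
        body = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + body).encode()).hexdigest()

    def append(self, event):
        """Append an event dict; each entry commits to the one before it."""
        prev = self._entries[-1]["entry_hash"] if self._entries else self.GENESIS
        entry_hash = self._digest(prev, event)
        self._entries.append({"event": event, "prev_hash": prev, "entry_hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = self.GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev:
                return False
            if self._digest(prev, entry["event"]) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True
```

An erasure certificate can be appended as an ordinary event, so the proof of erasure sits in the same tamper-evident sequence as the reads and writes it relates to.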

OWASP LLM Top 10 — Memory-Specific

OWASP LLM Risk	Memory-Specific Applicability	Mitigation
LLM01 Prompt Injection	Memory content could carry injected instructions from a malicious prior interaction	Content validation on all retrieved memory before injection into context; anomaly detection on semantic shifts in retrieved memory
LLM06 Sensitive Information Disclosure	Episodic memories contain PII that could leak to a different user via cross-tenant retrieval or incorrect retrieval	Strict tenant partitioning; PII masking; retrieval audit with anomaly detection; output filtering
LLM08 Excessive Agency	Procedural memories could encode overly broad action patterns from prior high-permission tasks	Procedural library requires explicit approval for patterns involving irreversible actions; success rate decay mechanism retires stale patterns
LLM09 Overreliance	Agent may over-weight retrieved memories without considering staleness	Recency scoring; explicit staleness flag on retrieved memories older than configurable threshold; agent prompted to question stale memories
LLM10 Model Theft	Semantic memory store contains valuable domain knowledge extracted from customer interactions	Store behind private networking; strict IAM; export controls on vector store API

9. Governance Considerations

Responsible AI — Memory-Specific

Agents must not use episodic memory to make decisions about individuals without disclosure (transparency principle)
Memory content must not encode or amplify bias; the consolidation engine must include bias audit on extracted semantic memories
Memory provenance must be traceable: every semantic memory records which episodic records contributed to it

GDPR / Privacy Act Compliance

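Data Subject Access Requests: The SAR Handler must produce a complete, human-readable export of all episodic memory records associated with a subject identifier within the statutory deadline: one month under GDPR (Art. 12(3), extendable by up to two further months for complex or numerous requests) and 30 days under the Australian Privacy Act for agencies (APP 12; organisations must respond within a reasonable period)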
Right to Erasure: The RTE Engine must delete all episodic records for a subject AND identify and purge or anonymise any semantic memories derived from those episodes; cascading delete logic must be tested in the erasure runbook
Data Minimisation: Memory consolidation must only retain what is necessary for future task quality; retention policies must enforce deletion of episodic records beyond the retention window
Purpose Limitation: Memory retrieved for one task type must not be used for unrelated task types without explicit permission scoping
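The cascading erasure requirement depends on the provenance link between semantic memories and their source episodes. A minimal sketch follows, assuming each semantic record carries a `sources` set of episodic record IDs. The policy shown (purge a semantic memory when all of its sources are erased, otherwise drop the provenance link and flag it for review) is one possible reading of "purge or anonymise", not the pattern's mandated rule; all names are hypothetical.

```python
from datetime import datetime, timezone


def erase_subject(subject_id, episodic, semantic):
    """Delete a subject's episodes and cascade to derived semantic memories.

    episodic: dict record_id -> {"subject_id": ...}
    semantic: dict semantic_id -> {"sources": set of episodic record_ids}
    Both dicts are mutated in place. Returns an erasure certificate dict.
    """
    doomed = {rid for rid, rec in episodic.items() if rec["subject_id"] == subject_id}
    for rid in doomed:
        del episodic[rid]

    purged, flagged = [], []
    for sid in list(semantic):
        sources = semantic[sid]["sources"]
        if not sources & doomed:
            continue  # not derived from this subject's episodes
        remaining = sources - doomed
        if remaining:
            # Still supported by other episodes: unlink and queue for review.
            semantic[sid]["sources"] = remaining
            flagged.append(sid)
        else:
            del semantic[sid]
            purged.append(sid)

    return {
        "subject_id": subject_id,
        "erased_at": datetime.now(timezone.utc).isoformat(),
        "records_deleted": len(doomed),
        "semantic_purged": sorted(purged),
        "semantic_flagged": sorted(flagged),
    }
```

The returned dictionary mirrors the erasure certificate fields described under Auditability, apart from executor identity, which the caller would add.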

Governance Artefacts

Artefact	Owner	Frequency	Purpose
Memory Classification Register	Data Governance	On each new task type	Documents classification of memory content per task type
Retention Policy Schedule	Legal + Data Governance	Annually	Defines retention periods per classification class
SAR/RTE Runbook	Privacy Officer	On process change	Step-by-step erasure and export procedure with test evidence
Memory Quality Report	ML Engineering	Monthly	Retrieval precision metrics; memory freshness scores; consolidation quality
Erasure Certificate Archive	Privacy Officer	Per erasure	Immutable record of completed erasure operations
Bias Audit Report	AI Ethics Team	Quarterly	Audit of semantic memories for demographic bias, unfair generalisation

10. Operational Considerations

Monitoring

Retrieval quality metric: relevance feedback score (agent explicitly rates retrieved memories as useful/not useful on task completion)
Memory store sizes and growth rate per tier — alert when semantic store exceeds defined size ceiling (triggers consolidation review)
PII detection false positive/negative rate, monitored on a sample; false negatives mean PII is entering the semantic store

SLOs

SLO	Target	Window	Alert
Episodic retrieval latency	≤ 150ms p95	1-hour rolling	> 300ms triggers P2
Semantic retrieval latency	≤ 200ms p95	1-hour rolling	> 400ms triggers P2
PII detection throughput	≥ 100 records/sec	Per batch	Backpressure alert at < 50/sec
Erasure completion time	≤ 24 hours from request	Per request	> 48 hours triggers P1 escalation
Memory audit log availability	99.99%	Monthly	Any gap triggers P0
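The latency SLOs above are expressed as p95 targets with a separate alert threshold. A small sketch of how a rolling window of samples could be evaluated, using the nearest-rank percentile method; the method choice, the function names, and the intermediate "warn" state between target and alert threshold are assumptions for illustration.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("no samples in window")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def evaluate_latency_slo(samples_ms, target_ms, alert_ms):
    """Classify one rolling window against an SLO target and alert threshold."""
    value = p95(samples_ms)
    if value > alert_ms:
        return "P2"    # exceeds the alert threshold in the SLO table
    if value > target_ms:
        return "warn"  # over target but under the alert threshold (assumed state)
    return "ok"
```

For the episodic retrieval SLO this would be called as `evaluate_latency_slo(window, target_ms=150, alert_ms=300)` on each one-hour rolling window.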

Incident Response

Incident Type	Detection	Response	Escalation
PII found in semantic store	Automated scan or audit report	Immediate quarantine of affected records; retrace consolidation pipeline; purge confirmed PII	Privacy Officer + CISO within 4 hours; regulator notification if material
Cross-tenant retrieval detected	Retrieval audit anomaly detection	Immediate suspension of affected agent; investigation of query filter logic	P0 incident; CISO + Legal
SAR/RTE not fulfilled within deadline	SAR tracking system	Escalate to Privacy Officer; manual intervention in erasure pipeline	Privacy Officer; potential regulator notification

Capacity

Vector store index size grows as O(n × d) where n = records and d = embedding dimensions; plan for 5× current index size in provisioned capacity to support query performance
Consolidation engine is CPU/memory intensive during batch processing; schedule during off-peak hours
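The O(n × d) growth and the 5× headroom guidance translate into a quick sizing calculation. This sketch counts raw vector storage only, assuming 4-byte float32 components; it ignores ANN index structures, metadata, and replication, which vary by vector store. The 1536-dimension figure in the usage note is just an example value.

```python
def index_bytes(n_records, dims, bytes_per_value=4):
    """Raw vector storage: n records x d dimensions x bytes per component."""
    return n_records * dims * bytes_per_value


def provisioned_gib(n_records, dims, headroom=5, bytes_per_value=4):
    """Capacity to provision, applying the headroom multiple, in GiB."""
    return headroom * index_bytes(n_records, dims, bytes_per_value) / 2**30
```

For example, one million records at 1536 dimensions occupy about 6.1 GB of raw vectors, so the 5× guidance suggests provisioning roughly 28.6 GiB before index overhead.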

11. Cost Considerations

Cost Drivers

Cost Driver	Description	Control Lever
Embedding API	Embedding calls for each memory write and retrieval query	Embedding cache for repeated queries; batch writes; self-hosted embedding model
Vector store operations	Index writes and ANN query costs (cloud-managed vector DBs charge per operation)	Consolidated batch writes; query result caching for common patterns
PII detection API	Per-record PII scanning on all writes	Self-hosted Presidio; sampling for low-risk task types
Memory consolidation compute	Async LLM calls for pattern extraction	Smaller model for consolidation; batching; scheduled off-peak runs
Episodic store storage	Long-term storage grows with interaction volume	Retention policies; tiered storage (hot/warm/cold); compression

Indicative Cost Range (USD, per 1M memory operations unless stated per record)

Operation	Cloud-Managed Vector DB	Self-Hosted Qdrant/Weaviate	Delta
Write (embed + index)	~$3–8	~$0.50–1.50	4–8× cloud premium
Read (query + retrieve)	~$1–4	~$0.20–0.80	3–6× cloud premium
PII detection (per record)	~$0.001–0.005	~$0.0001 (Presidio)	10–50× cloud premium

12. Trade-Off Analysis

Memory Architecture Options

Option	Description	Pros	Cons	Best For
A: Four-Tier (Recommended)	Separate in-context, episodic, semantic, and procedural stores	Best retrieval quality; clear retention policies; privacy-compliant erasure	Higher operational complexity; 4 stores to manage and monitor	Production agents with recurring users and compliance requirements
B: Single Vector Store	All memory in one vector DB with metadata tags for tier differentiation	Simpler infrastructure; single query interface	PII and non-PII mixed; erasure is complex; retrieval quality degrades as store grows	Proof-of-concept or simple single-user agents
C: In-Context Only (No Persistence)	No persistent memory; rely solely on context window	Zero privacy risk; maximum simplicity; lowest cost	No learning; no personalisation; full history injection is expensive	Stateless, compliance-sensitive, single-use agents
D: External Memory Service (e.g., Mem0)	Managed memory service handles all tiers	Fastest time-to-value; reduces engineering burden	Vendor lock-in; data sovereignty concerns; limited customisation of consolidation logic	Startups or teams with limited ML/infra expertise

Architectural Tensions

Tension	Left Pole	Right Pole	Balance
Memory richness vs. Privacy	Maximum memory retention for best agent quality	Minimum retention to reduce privacy risk	Tiered retention with explicit policies; PII-stripped semantic memories retained longer than PII-bearing episodic
Retrieval recall vs. Precision	Retrieve broad context (high recall)	Retrieve only highly relevant context (high precision)	Tuned K and similarity threshold per task type; reranker to boost precision
Real-time consolidation vs. Task latency	Consolidate synchronously; memory up-to-date immediately	Async consolidation; task completes faster	Async consolidation as the standard; sync consolidation only for high-priority learning scenarios

13. Failure Modes

Failure Mode	Likelihood	Impact	Detection	Recovery
Memory injection attack (poisoned episodic record)	Medium	High — agent acts on attacker-injected false memory	Content validation on retrieval; anomaly scoring on retrieved memory vs. task context	Quarantine suspicious memory; re-run task without memory; alert security
Stale memory degrades quality	High	Medium — agent uses outdated patterns	Recency scoring; explicit staleness alerts; retrieval quality feedback loop	Retire memories below freshness threshold; force re-consolidation
PII leakage to semantic store	Medium	Critical — privacy violation	Automated PII scan on semantic store; periodic audit	Immediate quarantine; purge; Privacy Officer notification
Memory store unavailability	Low	High — agents lose all memory context	Health check monitoring; circuit breaker	Degrade to in-context only; restore from backup; alert
Erasure failure for SAR	Low	Critical — regulatory non-compliance	SAR tracking system deadline monitoring	Manual erasure procedure; Privacy Officer + Legal involvement
Runaway memory growth	Medium	Medium — cost overrun; performance degradation	Memory store size monitoring; growth rate alerts	Retention policy enforcement; emergency pruning

14. Regulatory Considerations

GDPR / Privacy Act

Episodic store is a personal data store; all GDPR/Privacy Act obligations apply: purpose limitation, data minimisation, retention limits, security, right of access, right to erasure
Right to erasure is technically complex for vector stores: embeddings cannot be selectively deleted in some managed services — select a vector store that supports metadata-filtered deletion or record-level deletion
Cross-border transfers of episodic memories containing EU/Australian personal data must comply with transfer mechanisms (SCCs, IDTA, adequacy decisions)

EU AI Act

Art. 10 (Data Governance): the four-tier memory architecture with PII filtering and classification supports the data governance requirements for high-risk AI systems
Art. 12 (Record Keeping): the memory audit log and task execution traces support the record-keeping requirement
If the memory system supports a high-risk AI use case, the episodic data processed by the consolidation pipeline (which populates the semantic store) may be treated as training data and may therefore require bias assessment

ISO 42001

§8.4 (AI System Lifecycle): memory consolidation is part of the learning lifecycle and must be governed under the AI system lifecycle procedures
§6.1 (Risk Assessment): memory-specific risks (injection, leakage, staleness) must be identified and treated in the AI risk assessment

APRA CPS 234

Memory stores containing customer data are information assets under CPS 234; information security capability requirements apply
CMK encryption and strict access controls on the episodic store support the information asset protection requirement

15. Reference Implementations

AWS

Component	AWS Service
Episodic Store	Amazon Aurora PostgreSQL + pgvector; or Amazon OpenSearch k-NN
Semantic Store	Amazon OpenSearch Service (vector engine); or Amazon Bedrock Knowledge Bases
Procedural Library	Amazon DynamoDB (JSON document store)
PII Detection	Amazon Comprehend (entity detection + PII classification)
Consolidation Engine	AWS Lambda + Amazon SQS (async trigger on task completion)
Memory Audit Log	Amazon S3 (WORM) + AWS CloudTrail
SAR/RTE Handler	Custom Lambda with Aurora queries + OpenSearch delete-by-query

Azure

Component	Azure Service
Episodic Store	Azure Cosmos DB for PostgreSQL + pgvector
Semantic Store	Azure AI Search (vector + hybrid search)
PII Detection	Azure AI Language (PII detection)
Consolidation Engine	Azure Functions (event-triggered) + Azure Service Bus
Memory Audit Log	Azure Immutable Blob Storage + Azure Monitor

GCP

Component	GCP Service
Episodic Store	Cloud Spanner (OLTP) + Vertex AI Vector Search (formerly Matching Engine)
Semantic Store	Vertex AI Vector Search
PII Detection	Sensitive Data Protection (formerly Cloud DLP)
Consolidation Engine	Cloud Run (triggered by Pub/Sub)

On-Premises

Component	Technology
Episodic + Semantic Store	Weaviate or Qdrant on Kubernetes
PII Detection	Microsoft Presidio (open-source)
Consolidation Engine	Celery + Redis on Kubernetes
Audit Log	Apache Kafka + MinIO (S3-compatible WORM)

16. Related Patterns

Pattern	ID	Relationship Type	Notes
Single Agent Pattern	EAAPL-AGT001	Extended By	AGT001 defines the agent loop; this pattern provides the memory architecture it depends on
Agent Tool Registry	EAAPL-AGT003	Peer	Procedural memory stores tool call patterns; interacts with registry for tool resolution
Agent Checkpoint and Recovery	EAAPL-AGT005	Peer	Checkpoint state includes references to memory records; recovery restores memory context
Reflexive Agent	EAAPL-AGT006	Extends	Reflection outputs may be written to episodic and semantic memory as quality learnings
Long-Running Agent	EAAPL-AGT007	Depends On	Long-running agents rely on episodic memory to maintain context across async task segments
Human-in-the-Loop Agent	EAAPL-MAG003	Peer	Human feedback events are written to episodic memory to improve future similar tasks

17. Maturity Assessment

Overall Maturity: Proven

Dimension	Score (1–5)	Evidence
Adoption Breadth	4	Deployed in production by leading enterprises; managed services (Bedrock KB, Azure AI Search) widely adopted
Tooling Ecosystem	4	Mature vector stores; Mem0, LangChain memory; LlamaIndex retrieval; consolidation tooling still maturing
Privacy Compliance Tooling	3	SAR/RTE tooling requires custom implementation; vector-store-native selective delete inconsistently supported
Security Hardening	3	Memory injection attacks still emerging threat class; mitigations developing
Retrieval Quality	4	Reranking techniques mature; multi-tier retrieval well-understood; long-tail edge cases remain
Regulatory Clarity	3	GDPR/Privacy Act obligations clear in principle; technical implementation guidance still evolving

18. Revision History

Version	Date	Author	Changes
1.0	2024-04-01	Architecture Board	Initial publication
1.1	2024-07-15	Privacy Engineering	Added GDPR right-to-erasure vector store guidance; PII cascade purge logic
2.0	2025-01-20	Architecture Board	Major revision: four-tier model replacing two-tier; bias audit added to consolidation pipeline; EU AI Act Art. 10 mapping; full OWASP LLM table
