EAAPL: Enterprise AI Architecture Pattern Library

[EAAPL-AGT002] Stateful Agent Memory

Category: Agentic AI
Sub-category: Memory Architecture
Version: 2.0
Maturity: Proven
Tags: memory, episodic-memory, semantic-memory, vector-store, memory-consolidation, privacy, context-window
Regulatory Relevance: EU AI Act (Art. 10, 12), Privacy Act 1988, GDPR Art. 17, ISO 42001 §8.4, APRA CPS 234


1. Executive Summary

The Stateful Agent Memory Pattern defines the multi-tier memory architecture that allows AI agents to retain, recall, and apply knowledge across task invocations. Without persistent memory, agents are amnesia machines — each task starts from zero, ignoring valuable context accumulated from prior interactions and producing generic, contextually shallow outputs.

For CIO/CTO audiences: this pattern is the difference between an AI agent that improves over time and one that stays static. It enables agents to remember customer preferences, learn from past task outcomes, and build an evolving understanding of your organisation's domain — delivering compounding value rather than flat productivity gains. This is the architectural foundation for genuinely intelligent automation.

The pattern addresses four distinct memory types: in-context (what the agent knows right now), episodic (what happened in past sessions), semantic (domain knowledge crystallised from experience), and procedural (learned skills for specific task patterns). It governs how memories are created, consolidated, retrieved, and deleted, including the privacy-critical capabilities to export an individual's memories in response to subject access requests and to purge them in response to right-to-erasure requests, obligations that arise under GDPR and the Australian Privacy Act. Implemented correctly, the pattern can deliver up to a 35% improvement in task quality scores as agents build domain-specific knowledge, while maintaining compliance with data residency and retention obligations.


2. Problem Statement

Business Problem

Organisations deploy AI agents that handle customer interactions, process documents, or manage workflows, but each invocation is stateless. An agent cannot remember that a specific customer dislikes a particular communication format, that a specific document type contains a recurring error pattern, or that a prior task failed because a tool returned stale data. This statelessness produces generic outputs and forces humans to repeat context on every interaction.

Technical Problem

LLM context windows are bounded (typically 8K–200K tokens, with some models offering 1M or more) and ephemeral. An agent cannot hold the full history of all past interactions in its context for each new task. Foundation models have no native mechanism for persistent, structured, queryable memory. Ad-hoc solutions (appending all history to every prompt) fail at scale: context becomes too large, costs explode, and the model's attention degrades over long contexts.

Symptoms of Absence

  • Agents repeat the same mistakes across tasks without correcting them
  • Users must re-explain context on every new interaction
  • Agent outputs show no improvement over time despite large interaction volumes
  • Memory-related PII is retained indefinitely with no purge mechanism, creating GDPR/Privacy Act exposure
  • High token costs because full conversation history is injected indiscriminately

Cost of Inaction

  • Quality: Task quality remains static; no compounding improvement
  • Cost: Full-history injection at scale consumes 3–10× more tokens than selective retrieval
  • Risk: Uncontrolled memory retention creates undiscovered PII liability; right-to-erasure requests cannot be honoured
  • Competitive: Agents without memory cannot personalise; competitors with memory deliver measurably better outcomes

3. Context

When to Apply

  • Agents serving repeat users or recurring task types where prior context improves quality
  • Agents that benefit from domain-specific learning accumulated over many task executions
  • Production environments where context window costs at scale require optimisation
  • Any agent handling personal data where right-to-erasure compliance is required
  • Agents where task quality must demonstrably improve over time (subject to monitoring)

When NOT to Apply

  • Single-use or one-shot agents with no recurring interaction pattern
  • Tasks where using prior context creates unfair bias (e.g., re-using a prior unfavourable risk assessment on a new application)
  • Environments where all data must be stateless by design (some secure/classified contexts)
  • Agent tasks where every invocation must be fully independent for audit reasons

Prerequisites

  • EAAPL-AGT001 (Single Agent Pattern) baseline implemented
  • Vector store infrastructure provisioned and accessible from agent runtime
  • PII detection service available (for memory ingestion filtering)
  • Data classification and retention policy defined for memory content
  • Subject access request (SAR) and right-to-erasure (RTE) operational process defined

Industry Applicability

| Industry | Use Case | Memory Type Priority | Risk Level |
| --- | --- | --- | --- |
| Financial Services | Relationship manager AI: recalls client preferences, prior advice context | Episodic + Semantic | High |
| Healthcare | Clinical AI: recalls patient-specific notes, medication history | Episodic + Semantic | Very High |
| Retail / E-commerce | Customer service AI: recalls purchase history, preferences | Episodic | Medium |
| Legal Services | Matter AI: recalls case-specific precedents, client instructions | Episodic + Procedural | High |
| HR / People Analytics | HR AI: recalls employee lifecycle context | Episodic | Very High |
| Software Engineering | Code review / generation AI: recalls codebase conventions, past review feedback | Semantic + Procedural | Medium |

4. Architecture Overview

The Stateful Agent Memory Pattern organises agent memory into four tiers, each with different temporal scope, retrieval semantics, and storage technology. The tiers are layered: in-context memory is fastest but smallest; procedural memory is slowest to change but most reusable.

Why four tiers rather than a single store? Human cognitive science informs this design. Different types of knowledge have different update frequencies, retrieval patterns, and lifetime requirements. Forcing all memory into a single vector store creates retrieval conflicts (recent episodic memories crowd out stable semantic knowledge), cost inefficiency (re-embedding unchanged procedural knowledge), and privacy complexity (mixing PII-bearing episodic memories with PII-free domain knowledge complicates erasure). Separating the tiers, each with its own schema, retention policy, and retrieval strategy, is not architecture astronautics; it is an engineering response to real retrieval quality problems observed at scale.

Tier 1: In-Context Memory (Working Memory)

The current context window is the agent's working memory: everything it knows for this iteration of this task. The Context Window Manager is responsible for actively managing what occupies this finite space. It implements a sliding window for conversation history (retaining the N most recent exchanges), importance-weighted retention (important tool results are kept longer than routine ones), and a hard token budget. The in-context memory is ephemeral; it does not persist beyond the current task invocation.
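The three mechanisms described for the Context Window Manager can be combined in a few lines. The following is a minimal sketch, not the pattern's reference implementation: `ContextItem`, `build_context`, and the `importance` score are hypothetical names, and a whitespace word count stands in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    text: str
    importance: float  # hypothetical 0..1 score assigned by the agent runtime
    turn: int          # position in the conversation (higher = more recent)


def count_tokens(text: str) -> int:
    # Assumption: whitespace word count stands in for a real tokenizer.
    return len(text.split())


def build_context(items: list[ContextItem], budget: int, window: int) -> list[ContextItem]:
    """Keep the `window` most recent items first, then fill what is left of
    the token budget with older items in descending importance order."""
    by_recency = sorted(items, key=lambda i: i.turn, reverse=True)
    recent, older = by_recency[:window], by_recency[window:]
    older.sort(key=lambda i: i.importance, reverse=True)

    kept, used = [], 0
    for item in recent + older:
        cost = count_tokens(item.text)
        if used + cost <= budget:  # hard token budget: never exceeded
            kept.append(item)
            used += cost
    # Restore chronological order for injection into the prompt.
    return sorted(kept, key=lambda i: i.turn)
```

In this sketch an important older tool result survives ahead of a routine one when the budget is tight, while the most recent exchanges are always considered first.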

Tier 2: Episodic Memory (Autobiographical Store)

Episodic memory stores the record of what happened in past task invocations: a structured log of conversations, decisions made, outcomes observed, and feedback received. Each episodic record is timestamped, tagged with the task type, associated with an entity identifier (user ID, document ID, account ID), and stored as both a structured record (for exact lookups) and an embedding (for semantic similarity retrieval). Retrieval is via semantic search against the current task's context: the top-K most similar past episodes are retrieved and injected into the context.

Episodic memory is the most privacy-sensitive tier. It can contain PII, personal preferences, and sensitive interaction history. It must be linked to the data subject identifier and must support targeted purge operations. Retention policies must be defined per data sensitivity class.
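To make the record shape and the two operations concrete (entity-filtered top-K retrieval and targeted purge), here is a small in-memory sketch. The `Episode` fields mirror the attributes named above; the function names and the list-backed store are illustrative assumptions, and a production deployment would delegate similarity search to the vector index.

```python
import math
from dataclasses import dataclass


@dataclass
class Episode:
    record_id: str
    entity_id: str          # user, document, or account identifier
    task_type: str
    timestamp: float        # seconds since epoch
    summary: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_episodes(store, query_embedding, entity_id, k=3):
    """Filter by entity first (exact lookup), then rank by similarity."""
    candidates = [e for e in store if e.entity_id == entity_id]
    candidates.sort(key=lambda e: cosine(e.embedding, query_embedding), reverse=True)
    return candidates[:k]


def purge_subject(store, entity_id):
    """Targeted purge: drop every episode linked to one data subject.
    Returns the remaining records and the number removed."""
    remaining = [e for e in store if e.entity_id != entity_id]
    return remaining, len(store) - len(remaining)
```

Keeping the entity identifier as a first-class field is what makes both the retrieval filter and the purge a simple predicate rather than a content search.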

Tier 3: Semantic Memory (Knowledge Store)

Semantic memory stores domain knowledge that has been crystallised from episodic experience. Unlike episodic memories (specific events), semantic memories are generalisations: "this customer segment prefers formal communication," "this document type has a recurring pattern in Section 3.2," "this API call fails under these conditions." Semantic memories are created by the Memory Consolidation Engine, which periodically processes episodic memories and extracts stable patterns.

Semantic memory is typically PII-free (personal identifiers are stripped during consolidation), enabling it to be shared across agents and retained longer. Retrieval is via dense vector search, with results ranked by relevance score and recency weight.

Tier 4: Procedural Memory (Skill Library)

Procedural memory stores learned task execution strategies: sequences of tool calls and decision logic that have been validated as effective for specific task types. These are analogous to stored procedures: when the agent encounters a task that matches a known procedural template, it can retrieve the template and adapt it rather than planning from scratch. Procedural memories are versioned, have explicit success rate metadata (how often this procedure achieved the task goal), and are retired when their success rate drops below a threshold.
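The versioning, success-rate metadata, and retirement rule can be sketched as follows. The threshold of 0.6 and the `min_runs` guard are illustrative assumptions rather than values given by the pattern, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Procedure:
    task_type: str
    version: int
    steps: list[str]       # ordered tool-call names (illustrative)
    successes: int = 0
    runs: int = 0
    retired: bool = False

    @property
    def success_rate(self) -> float:
        return self.successes / self.runs if self.runs else 0.0


def record_outcome(proc, succeeded, threshold=0.6, min_runs=5):
    """Update success metadata; retire once the rate falls below threshold.
    `min_runs` avoids retiring a template on too little evidence."""
    proc.runs += 1
    proc.successes += int(succeeded)
    if proc.runs >= min_runs and proc.success_rate < threshold:
        proc.retired = True
    return proc


def lookup(library, task_type):
    """Return the newest non-retired template for a task type, if any."""
    live = [p for p in library if p.task_type == task_type and not p.retired]
    return max(live, key=lambda p: p.version, default=None)
```

When the newest version is retired, `lookup` falls back to the previous live version, and the agent plans from scratch only when it returns `None`.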

Memory Consolidation

The Memory Consolidation Engine runs asynchronously after task completion. It processes the episodic record of the just-completed task, extracts semantic learnings, and updates the semantic and procedural stores. Consolidation is governed by a quality threshold: only memories that meet a minimum relevance and novelty score are written. Duplicate detection prevents redundant knowledge accumulation. The consolidation pipeline includes PII stripping before writing to semantic memory.
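One way to express the relevance and novelty gate is to define novelty as one minus the highest cosine similarity to anything already stored. That definition, the threshold values, and the function names below are assumptions made for illustration; PII stripping is out of scope for this sketch and would run before the write.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def novelty(candidate, existing):
    """Novelty = 1 - highest similarity to anything already stored."""
    if not existing:
        return 1.0
    return 1.0 - max(cosine(candidate, e) for e in existing)


def consolidate(candidates, existing, min_relevance=0.5, min_novelty=0.2):
    """candidates: list of (embedding, relevance) pairs.
    Returns the embeddings that pass both gates."""
    accepted = []
    for embedding, relevance in candidates:
        # Compare against the store plus items accepted earlier in this batch,
        # so two near-identical candidates are not both written.
        pool = existing + accepted
        if relevance >= min_relevance and novelty(embedding, pool) >= min_novelty:
            accepted.append(embedding)
    return accepted
```

Checking each candidate against earlier acceptances in the same batch is what keeps a single consolidation run from writing its own duplicates.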

Retrieval Orchestration

The Memory Retrieval Orchestrator executes retrieval across all relevant tiers in parallel at the start of each task. It ranks results from each tier, applies a cross-tier deduplication pass, and assembles the final memory injection within the context window budget. The orchestrator tracks retrieval quality metrics (retrieval precision proxied by subsequent task quality) to continuously tune retrieval hyperparameters (K, similarity threshold, recency weight).
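The parallel fan-out, deduplication, and budget assembly can be sketched with `asyncio`. Here `search_tier` is a hypothetical stand-in for a network call to each store, results are plain dictionaries, and deduplication is a naive match on normalised text; a real orchestrator would likely deduplicate on embedding similarity.

```python
import asyncio


async def search_tier(name, results, delay=0.01):
    # Stand-in for a network call to one memory tier.
    await asyncio.sleep(delay)
    return [dict(r, tier=name) for r in results]


async def retrieve(tiers, token_budget):
    """tiers: {tier_name: [{"text": str, "score": float, "tokens": int}, ...]}"""
    # Query every tier concurrently rather than one after another.
    batches = await asyncio.gather(*(search_tier(n, r) for n, r in tiers.items()))
    merged = sorted((m for b in batches for m in b),
                    key=lambda m: m["score"], reverse=True)

    selected, seen, used = [], set(), 0
    for m in merged:
        key = m["text"].strip().lower()   # naive cross-tier dedup on normalised text
        if key in seen or used + m["tokens"] > token_budget:
            continue
        seen.add(key)
        selected.append(m)
        used += m["tokens"]
    return selected
```

Because the tiers are awaited together, total retrieval latency in this sketch tracks the slowest tier rather than the sum of all three.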


5. Architecture Diagram

flowchart TD
    subgraph Input["Task Input"]
        A[Task Request]
        B[Context Window Manager]
    end
    subgraph MemoryTiers["Memory Tiers"]
        C[(Episodic Store)]
        D[(Semantic Store)]
        E[(Procedural Library)]
    end
    subgraph Processing["Consolidation and Privacy"]
        F[Retrieval Orchestrator]
        G[Consolidation Engine]
        H[PII Detector]
    end
    A --> B
    B --> F
    F -->|semantic search| C
    F -->|vector search| D
    F -->|pattern lookup| E
    F -->|ranked context| B
    B -->|bounded context| A
    A -->|task result| G
    G --> H
    H -->|episodic write| C
    H -->|PII-free write| D
    G -->|skill pattern| E
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#f0fdf4,stroke:#22c55e
    style C fill:#fef9c3,stroke:#eab308
    style D fill:#fef9c3,stroke:#eab308
    style E fill:#fef9c3,stroke:#eab308
    style F fill:#f0fdf4,stroke:#22c55e
    style G fill:#f0fdf4,stroke:#22c55e
    style H fill:#f3e8ff,stroke:#a855f7

6. Components

| Component | Type | Responsibility | Technology Options | Criticality |
| --- | --- | --- | --- | --- |
| Context Window Manager | Orchestration | Manages token budget; implements sliding window and importance-weighted retention | Custom, LangChain memory manager, Mem0 | Critical |
| Memory Retrieval Orchestrator | Orchestration | Parallel retrieval across tiers; ranking; deduplication; context assembly | Custom Python/TS, LlamaIndex retrieval pipeline | Critical |
| Episodic Store | Persistence (Structured + Vector) | Stores full interaction records with embeddings; supports exact and semantic retrieval; per-subject purge | PostgreSQL + pgvector, MongoDB Atlas Vector Search, Supabase | High |
| Semantic Store | Vector Store | Stores domain knowledge embeddings; dense vector search; long retention | Pinecone, Weaviate, Azure AI Search, Qdrant | High |
| Procedural Skill Library | Knowledge Base | Stores versioned task execution templates with success metadata | Redis JSON, PostgreSQL, object store | Medium |
| Memory Consolidation Engine | Background Processor | Extracts semantic learnings from episodic records; strips PII; writes to semantic store | Custom async worker, Celery, Azure Durable Functions | High |
| PII Detector / Masker | Privacy | Detects and masks PII before memory writes | Amazon Comprehend, Azure AI Language, Microsoft Presidio | Critical |
| Retrieval Ranker | ML Component | Scores and ranks retrieved memories by relevance + recency; deduplicates | Custom scoring function, LLM-as-reranker, or a dedicated reranker (e.g., Cohere Rerank) | High |
| Data Classification Enforcer | Security/Governance | Enforces classification-based access controls on memory read/write | OPA, custom policy middleware | High |
| Subject Access Request Handler | Privacy Compliance | Generates complete record of all memory held for a given data subject | Custom API, integrated with identity store | Critical |
| Right-to-Erasure Engine | Privacy Compliance | Purges all episodic records for a subject; cascades to derived semantic memories; logs erasure | Custom async process with cascading delete logic | Critical |
| Retention Policy Scheduler | Operations | Enforces time-based retention policies per memory classification | APScheduler, Amazon EventBridge, Azure Logic Apps | High |
| Memory Audit Log | Compliance | Immutable log of all memory reads, writes, purges, and consolidations | Append-only store (CloudTrail, Kafka, S3 WORM) | Critical |

7. Data Flow

Primary Write Flow (Memory Creation)

| Step | Actor | Action | Output |
| --- | --- | --- | --- |
| 1 | Agent (Post-Task) | Submits completed task record: task_id, user_id, conversation_turns, tool_calls, outcome, feedback | Raw task record |
| 2 | PII Detector | Scans raw record; detects PII entities (names, IDs, financial data); applies masking rules per classification policy | PII-annotated record; masked episodic record |
| 3 | Episodic Store | Writes masked episodic record with full metadata; generates embedding of record summary | Episodic record ID; embedding written to vector index |
| 4 | Memory Audit Log | Logs write event: record_id, agent_id, user_id (hashed), classification, timestamp | Audit entry |
| 5 | Memory Consolidation Engine | Asynchronously processes episodic record; identifies stable patterns and novel knowledge | Candidate semantic memories; candidate procedural patterns |
| 6 | PII Detector (second pass) | Strips any residual PII from candidate semantic memories | PII-free semantic candidates |
| 7 | Duplicate Detector | Compares candidates against existing semantic store; calculates novelty score | Deduped candidates with novelty scores |
| 8 | Semantic Store | Writes new/updated semantic memories above novelty threshold | Semantic record IDs; updated vector index |
| 9 | Procedural Library | Writes new procedural templates with initial success rate metadata | Skill record with version |

Primary Read Flow (Memory Retrieval)

| Step | Actor | Action | Output |
| --- | --- | --- | --- |
| 1 | Memory Retrieval Orchestrator | Receives task context embedding + entity identifiers | Retrieval request |
| 2 | Episodic Store | Semantic search for top-K episodes by similarity to task context, filtered by entity ID | K episodic records |
| 3 | Semantic Store | Dense vector search for top-K domain knowledge chunks by similarity | K semantic records |
| 4 | Procedural Library | Pattern lookup by task type classification | Matching skill templates |
| 5 | Retrieval Ranker | Scores all retrieved items: relevance × recency × quality; deduplicates | Ranked, deduplicated memory list |
| 6 | Context Window Manager | Selects top items within token budget; formats for injection | Memory-enriched context block |
| 7 | Memory Audit Log | Logs retrieval event: items retrieved, agent_id, task_id, scores | Audit entry |
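The ranker's relevance × recency × quality score in step 5 can be sketched with an exponential recency decay. The decay form and the 30-day half-life are assumptions made here for illustration; the pattern only states that the three factors are multiplied.

```python
def recency_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: the weight halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def rank(memories, half_life_days=30.0):
    """memories: dicts with `relevance` (0..1), `quality` (0..1), `age_days`.
    Returns copies with a `score` field, best first."""
    scored = [
        dict(m, score=m["relevance"]
             * recency_weight(m["age_days"], half_life_days)
             * m["quality"])
        for m in memories
    ]
    return sorted(scored, key=lambda m: m["score"], reverse=True)
```

With these assumptions, a 90-day-old memory needs eight times the relevance of a fresh one to rank equally, which is one way the recency weight hyperparameter trades staleness against similarity.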

Error Flow

| Error Condition | Detection | Recovery |
| --- | --- | --- |
| Vector store unavailable | Health check on retrieval orchestrator startup | Degrade to in-context memory only; alert; log degraded mode |
| PII detection service unavailable | Service health check before write | Block memory write; queue for retry; do not write unscanned memory |
| Erasure request fails partially | RTE engine transaction log | Retry failed deletes; flag record for manual review; report status to SAR handler |
| Consolidation writes duplicate | Duplicate detector | Idempotent upsert with cosine similarity gate; no duplicate written |
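The "block, queue, never write unscanned" rule for a PII service outage is a fail-closed write path. A minimal sketch follows, in which `scan` is any callable that masks a record and raises the hypothetical `PIIServiceUnavailable` exception when the detector is down.

```python
from collections import deque


class PIIServiceUnavailable(Exception):
    """Raised by the scanner callable when the detection service is down."""


def write_memory(record, scan, store, retry_queue):
    """Fail closed: a record is stored only after a successful PII scan."""
    try:
        masked = scan(record)
    except PIIServiceUnavailable:
        retry_queue.append(record)   # hold for retry; nothing reaches the store
        return "queued"
    store.append(masked)
    return "written"


def drain(retry_queue, scan, store):
    """Re-attempt queued writes once the scanner is healthy again.
    Records that fail again are re-queued by write_memory."""
    pending = len(retry_queue)
    for _ in range(pending):
        write_memory(retry_queue.popleft(), scan, store, retry_queue)
    return len(store)
```

The retry queue in this sketch holds unscanned records, so in practice it would need the same protection as the raw task record itself.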

8. Security Considerations

Data Classification

  • Episodic memories are classified at ingestion based on content scan; classification label governs retention period, access scope, and erasure priority
  • Semantic memories inherit the classification of their source episodes, downgraded if PII has been confirmed stripped
  • Cross-tenant memory isolation: episodic and semantic stores are partitioned by tenant_id; queries are always filtered by tenant scope — no cross-tenant retrieval is architecturally possible
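One way to make the tenant filter hard to bypass is to bind it into the store handle, so calling code never supplies it per query. The wrapper below is an illustrative sketch over an in-memory list; `TenantScopedStore` is a hypothetical name, and a production system would additionally enforce the scope in the database (for example with partitioning or row-level policies).

```python
class TenantScopedStore:
    """Wraps a shared record list so every query is bound to one tenant."""

    def __init__(self, records, tenant_id):
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._records = records
        self._tenant_id = tenant_id

    def search(self, predicate=lambda r: True):
        # The tenant filter is applied here, not by the caller, so it
        # cannot be omitted by mistake in application code.
        return [r for r in self._records
                if r["tenant_id"] == self._tenant_id and predicate(r)]
```

Even a caller-supplied predicate that targets another tenant returns nothing, because the tenant condition is always ANDed in.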

Encryption

  • Episodic store encrypted at rest with CMK; CMK is per-tenant for strong isolation
  • Semantic store encrypted at rest; embeddings are opaque binary representations but may leak information through proximity queries — store-level encryption is the minimum; consider homomorphic retrieval for highest-sensitivity deployments
  • All memory tier connections use TLS 1.3; mTLS within service mesh

Auditability

  • Every memory read and write event is logged in the immutable audit log with: agent_id, task_id, user_id (hashed), record_id, operation, classification, timestamp
  • Erasure events produce a certificate: subject_id, erasure timestamp, records deleted count, derived semantic records affected, executor identity
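An audit entry with the listed fields might be built as below. Using a keyed hash (HMAC-SHA256) for the user identifier is an assumption of this sketch; the pattern only says the ID is hashed. The function names and the JSON layout are likewise illustrative.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def hash_user_id(user_id: str, key: bytes) -> str:
    # Keyed hash: without the key, identifiers cannot be recovered from
    # the log by hashing guessed user IDs.
    return hmac.new(key, user_id.encode(), hashlib.sha256).hexdigest()


def audit_entry(agent_id, task_id, user_id, record_id, operation, classification, key):
    """Serialise one audit event; the raw user_id never appears in the output."""
    return json.dumps({
        "agent_id": agent_id,
        "task_id": task_id,
        "user_id_hash": hash_user_id(user_id, key),
        "record_id": record_id,
        "operation": operation,
        "classification": classification,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }, sort_keys=True)
```

The serialised line would then be appended to whichever immutable store backs the Memory Audit Log.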

OWASP LLM Top 10 — Memory-Specific

| OWASP LLM Risk | Memory-Specific Applicability | Mitigation |
| --- | --- | --- |
| LLM01 Prompt Injection | Memory content could carry injected instructions from a malicious prior interaction | Validation of all memory content before injection into context; anomaly detection on semantic shifts in retrieved memory |
| LLM06 Sensitive Information Disclosure | Episodic memories contain PII that could leak to a different user via cross-tenant retrieval or incorrect retrieval | Strict tenant partitioning; PII masking; retrieval audit with anomaly detection; output filtering |
| LLM08 Excessive Agency | Procedural memories could encode overly broad action patterns from prior high-permission tasks | Procedural library requires explicit approval for patterns involving irreversible actions; success rate decay mechanism retires stale patterns |
| LLM09 Overreliance | Agent may over-weight retrieved memories without considering staleness | Recency scoring; explicit staleness flag on retrieved memories older than configurable threshold; agent prompted to question stale memories |
| LLM10 Model Theft | Semantic memory store contains valuable domain knowledge extracted from customer interactions | Store behind private networking; strict IAM; export controls on vector store API |

9. Governance Considerations

Responsible AI — Memory-Specific

  • Agents must not use episodic memory to make decisions about individuals without disclosure (transparency principle)
  • Memory content must not encode or amplify bias; the consolidation engine must include bias audit on extracted semantic memories
  • Memory provenance must be traceable: every semantic memory records which episodic records contributed to it

GDPR / Privacy Act Compliance

  • Data Subject Access Requests: The SAR Handler must produce a complete, human-readable export of all episodic memory records associated with a subject identifier within one month (GDPR Art. 12(3), extendable in limited cases) / 30 days (Australian Privacy Act)
  • Right to Erasure: The RTE Engine must delete all episodic records for a subject AND identify and purge or anonymise any semantic memories derived from those episodes; cascading delete logic must be tested in the erasure runbook
  • Data Minimisation: Memory consolidation must only retain what is necessary for future task quality; retention policies must enforce deletion of episodic records beyond the retention window
  • Purpose Limitation: Memory retrieved for one task type must not be used for unrelated task types without explicit permission scoping
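The cascading erasure described above depends on the provenance link from each semantic memory back to its source episodes. A minimal sketch over in-memory dictionaries follows. The policy shown (purge a semantic memory only when every source episode is erased, otherwise drop the links) is one possible reading of "purge or anonymise"; a stricter deployment might purge any affected memory. All names are hypothetical.

```python
from datetime import datetime, timezone


def erase_subject(subject_id, episodic, semantic, executor):
    """episodic: {record_id: {"subject_id": ...}}
    semantic: {memory_id: {"sources": set of episodic record_ids}}
    Mutates both stores and returns an erasure certificate."""
    doomed = {rid for rid, rec in episodic.items()
              if rec["subject_id"] == subject_id}
    for rid in doomed:
        del episodic[rid]

    affected = 0
    for mid in list(semantic):
        sources = semantic[mid]["sources"]
        if sources & doomed:
            affected += 1
            sources -= doomed            # drop provenance links to erased episodes
            if not sources:              # derived only from the erased subject
                del semantic[mid]

    return {
        "subject_id": subject_id,
        "erased_at": datetime.now(timezone.utc).isoformat(),
        "records_deleted": len(doomed),
        "semantic_records_affected": affected,
        "executor": executor,
    }
```

The returned dictionary carries the same fields as the erasure certificate described under Auditability, so it can be archived directly.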

Governance Artefacts

| Artefact | Owner | Frequency | Purpose |
| --- | --- | --- | --- |
| Memory Classification Register | Data Governance | On each new task type | Documents classification of memory content per task type |
| Retention Policy Schedule | Legal + Data Governance | Annually | Defines retention periods per classification class |
| SAR/RTE Runbook | Privacy Officer | On process change | Step-by-step erasure and export procedure with test evidence |
| Memory Quality Report | ML Engineering | Monthly | Retrieval precision metrics; memory freshness scores; consolidation quality |
| Erasure Certificate Archive | Privacy Officer | Per erasure | Immutable record of completed erasure operations |
| Bias Audit Report | AI Ethics Team | Quarterly | Audit of semantic memories for demographic bias, unfair generalisation |

10. Operational Considerations

Monitoring

  • Retrieval quality metric: relevance feedback score (agent explicitly rates retrieved memories as useful/not useful on task completion)
  • Memory store sizes and growth rate per tier — alert when semantic store exceeds defined size ceiling (triggers consolidation review)
  • PII detection false positive/negative rate — monitored on sample; false negatives mean PII entering semantic store

SLOs

| SLO | Target | Window | Alert |
| --- | --- | --- | --- |
| Episodic retrieval latency | ≤ 150ms p95 | 1-hour rolling | > 300ms triggers P2 |
| Semantic retrieval latency | ≤ 200ms p95 | 1-hour rolling | > 400ms triggers P2 |
| PII detection throughput | ≥ 100 records/sec | Per batch | Backpressure alert at < 50/sec |
| Erasure completion time | ≤ 24 hours from request | Per request | > 48 hours triggers P1 escalation |
| Memory audit log availability | 99.99% | Monthly | Any gap triggers P0 |
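As an illustration of how the latency rows could be evaluated, the sketch below computes a nearest-rank p95 over a window of samples and compares it with the target and alert thresholds. The intermediate "warn" state (above target but below the alert line) is an assumption of this sketch, not part of the SLO table.

```python
def p95(samples):
    """Nearest-rank 95th percentile, using integer arithmetic for the rank."""
    if not samples:
        raise ValueError("no samples in window")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n)
    return ordered[rank - 1]


def latency_status(samples_ms, target_ms, alert_ms):
    """Return "ok" within target, "warn" between target and alert, else "P2"."""
    value = p95(samples_ms)
    if value > alert_ms:
        return "P2"
    return "ok" if value <= target_ms else "warn"
```

For the episodic row, the call would be `latency_status(window_samples, target_ms=150, alert_ms=300)` over the one-hour rolling window.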

Incident Response

| Incident Type | Detection | Response | Escalation |
| --- | --- | --- | --- |
| PII found in semantic store | Automated scan or audit report | Immediate quarantine of affected records; retrace consolidation pipeline; purge confirmed PII | Privacy Officer + CISO within 4 hours; regulator notification if material |
| Cross-tenant retrieval detected | Retrieval audit anomaly detection | Immediate suspension of affected agent; investigation of query filter logic | P0 incident; CISO + Legal |
| SAR/RTE not fulfilled within deadline | SAR tracking system | Escalate to Privacy Officer; manual intervention in erasure pipeline | Privacy Officer; potential regulator notification |

Capacity

  • Vector store index size grows as O(n × d) where n = records and d = embedding dimensions; plan for 5× current index size in provisioned capacity to support query performance
  • Consolidation engine is CPU/memory intensive during batch processing; schedule during off-peak hours
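The O(n × d) growth and the 5× provisioning guideline translate into a quick sizing calculation. This sketch assumes float32 vectors (4 bytes per value) and counts only raw vector payload; ANN index structures and metadata add overhead that varies by product. The 1,536-dimension figure in the usage note is an illustrative embedding size, not one specified by the pattern.

```python
def index_size_gib(n_records: int, dims: int,
                   bytes_per_value: int = 4, headroom: float = 5.0):
    """Return (raw_gib, provisioned_gib) for n vectors of `dims` dimensions.
    Assumes float32 storage; index and metadata overhead are not included."""
    raw = n_records * dims * bytes_per_value / 2**30
    return raw, raw * headroom
```

For example, `index_size_gib(1_000_000, 1536)` gives roughly 5.7 GiB of raw vectors and about 28.6 GiB provisioned under the 5× guideline.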

11. Cost Considerations

Cost Drivers

| Cost Driver | Description | Control Lever |
| --- | --- | --- |
| Embedding API | Embedding calls for each memory write and retrieval query | Embedding cache for repeated queries; batch writes; self-hosted embedding model |
| Vector store operations | Index writes and ANN query costs (cloud-managed vector DBs charge per operation) | Consolidated batch writes; query result caching for common patterns |
| PII detection API | Per-record PII scanning on all writes | Self-hosted Presidio; sampling for low-risk task types |
| Memory consolidation compute | Async LLM calls for pattern extraction | Smaller model for consolidation; batching; scheduled off-peak runs |
| Episodic store storage | Long-term storage grows with interaction volume | Retention policies; tiered storage (hot/warm/cold); compression |

Indicative Cost Range (USD, per 1M memory operations)

| Operation | Cloud-Managed Vector DB | Self-Hosted Qdrant/Weaviate | Delta |
| --- | --- | --- | --- |
| Write (embed + index) | ~$3–8 | ~$0.50–1.50 | 4–8× cloud premium |
| Read (query + retrieve) | ~$1–4 | ~$0.20–0.80 | 3–6× cloud premium |
| PII detection (per record) | ~$0.001–0.005 | ~$0.0001 (Presidio) | 10–50× cloud premium |

12. Trade-Off Analysis

Memory Architecture Options

| Option | Description | Pros | Cons | Best For |
| --- | --- | --- | --- | --- |
| A: Four-Tier (Recommended) | Separate in-context, episodic, semantic, and procedural stores | Best retrieval quality; clear retention policies; privacy-compliant erasure | Higher operational complexity; 4 stores to manage and monitor | Production agents with recurring users and compliance requirements |
| B: Single Vector Store | All memory in one vector DB with metadata tags for tier differentiation | Simpler infrastructure; single query interface | PII and non-PII mixed; erasure is complex; retrieval quality degrades as store grows | Proof-of-concept or simple single-user agents |
| C: In-Context Only (No Persistence) | No persistent memory; rely solely on context window | Zero privacy risk; maximum simplicity; lowest cost | No learning; no personalisation; full history injection is expensive | Stateless, compliance-sensitive, single-use agents |
| D: External Memory Service (e.g., Mem0) | Managed memory service handles all tiers | Fastest time-to-value; reduces engineering burden | Vendor lock-in; data sovereignty concerns; limited customisation of consolidation logic | Startups or teams with limited ML/infra expertise |

Architectural Tensions

| Tension | Left Pole | Right Pole | Balance |
| --- | --- | --- | --- |
| Memory richness vs. Privacy | Maximum memory retention for best agent quality | Minimum retention to reduce privacy risk | Tiered retention with explicit policies; PII-stripped semantic memories retained longer than PII-bearing episodic |
| Retrieval recall vs. Precision | Retrieve broad context (high recall) | Retrieve only highly relevant context (high precision) | Tuned K and similarity threshold per task type; reranker to boost precision |
| Real-time consolidation vs. Task latency | Consolidate synchronously; memory up-to-date immediately | Async consolidation; task completes faster | Async consolidation as the standard; sync consolidation only for high-priority learning scenarios |

13. Failure Modes

| Failure Mode | Likelihood | Impact | Detection | Recovery |
| --- | --- | --- | --- | --- |
| Memory injection attack (poisoned episodic record) | Medium | High: agent acts on attacker-injected false memory | Content validation on retrieval; anomaly scoring on retrieved memory vs. task context | Quarantine suspicious memory; re-run task without memory; alert security |
| Stale memory degrades quality | High | Medium: agent uses outdated patterns | Recency scoring; explicit staleness alerts; retrieval quality feedback loop | Retire memories below freshness threshold; force re-consolidation |
| PII leakage to semantic store | Medium | Critical: privacy violation | Automated PII scan on semantic store; periodic audit | Immediate quarantine; purge; Privacy Officer notification |
| Memory store unavailability | Low | High: agents lose all memory context | Health check monitoring; circuit breaker | Degrade to in-context only; restore from backup; alert |
| Erasure failure for an RTE request | Low | Critical: regulatory non-compliance | SAR/RTE tracking system deadline monitoring | Manual erasure procedure; Privacy Officer + Legal involvement |
| Runaway memory growth | Medium | Medium: cost overrun; performance degradation | Memory store size monitoring; growth rate alerts | Retention policy enforcement; emergency pruning |

14. Regulatory Considerations

GDPR / Privacy Act

  • Episodic store is a personal data store; all GDPR/Privacy Act obligations apply: purpose limitation, data minimisation, retention limits, security, right of access, right to erasure
  • Right to erasure is technically complex for vector stores: embeddings cannot be selectively deleted in some managed services — select a vector store that supports metadata-filtered deletion or record-level deletion
  • Cross-border transfers of episodic memories containing EU/Australian personal data must comply with transfer mechanisms (SCCs, IDTA, adequacy decisions)

EU AI Act

  • Art. 10 (Data Governance): the four-tier memory architecture with PII filtering and classification supports the data governance requirements for high-risk AI systems; it does not by itself establish compliance
  • Art. 12 (Record Keeping): the memory audit log and task execution traces support the record-keeping requirement
  • If the memory system supports a high-risk AI use case, the episodic data fed into the consolidation pipeline (which populates the semantic store) may be treated as training data and require bias assessment

ISO 42001

  • §8.4 (AI System Lifecycle): memory consolidation is part of the learning lifecycle and must be governed under the AI system lifecycle procedures
  • §6.1 (Risk Assessment): memory-specific risks (injection, leakage, staleness) must be identified and treated in the AI risk assessment

APRA CPS 234

  • Memory stores containing customer data are information assets under CPS 234; information security capability requirements apply
  • CMK encryption and strict access controls on the episodic store contribute to meeting the information asset protection requirement

15. Reference Implementations

AWS

| Component | AWS Service |
| --- | --- |
| Episodic Store | Amazon Aurora PostgreSQL + pgvector; or Amazon OpenSearch k-NN |
| Semantic Store | Amazon OpenSearch Service (vector engine); or Amazon Bedrock Knowledge Bases |
| Procedural Library | Amazon DynamoDB (JSON document store) |
| PII Detection | Amazon Comprehend (entity detection + PII classification) |
| Consolidation Engine | AWS Lambda + Amazon SQS (async trigger on task completion) |
| Memory Audit Log | Amazon S3 (WORM) + AWS CloudTrail |
| SAR/RTE Handler | Custom Lambda with Aurora queries + OpenSearch delete-by-query |

Azure

| Component | Azure Service |
| --- | --- |
| Episodic Store | Azure Cosmos DB for PostgreSQL + pgvector |
| Semantic Store | Azure AI Search (vector + hybrid search) |
| PII Detection | Azure AI Language (PII detection) |
| Consolidation Engine | Azure Functions (event-triggered) + Azure Service Bus |
| Memory Audit Log | Azure Immutable Blob Storage + Azure Monitor |

GCP

| Component | GCP Service |
| --- | --- |
| Episodic Store | Cloud Spanner (OLTP) + Vertex AI Vector Search (formerly Matching Engine) |
| Semantic Store | Vertex AI Vector Search |
| PII Detection | Cloud DLP (Data Loss Prevention) |
| Consolidation Engine | Cloud Run (triggered by Pub/Sub) |

On-Premises

| Component | Technology |
| --- | --- |
| Episodic + Semantic Store | Weaviate or Qdrant on Kubernetes |
| PII Detection | Microsoft Presidio (open-source) |
| Consolidation Engine | Celery + Redis on Kubernetes |
| Audit Log | Apache Kafka + MinIO (S3-compatible WORM) |

16. Related Patterns

| Pattern | ID | Relationship Type | Notes |
| --- | --- | --- | --- |
| Single Agent Pattern | EAAPL-AGT001 | Extended By | AGT001 defines the agent loop; this pattern provides the memory architecture it depends on |
| Agent Tool Registry | EAAPL-AGT003 | Peer | Procedural memory stores tool call patterns; interacts with registry for tool resolution |
| Agent Checkpoint and Recovery | EAAPL-AGT005 | Peer | Checkpoint state includes references to memory records; recovery restores memory context |
| Reflexive Agent | EAAPL-AGT006 | Extends | Reflection outputs may be written to episodic and semantic memory as quality learnings |
| Long-Running Agent | EAAPL-AGT007 | Depends On | Long-running agents rely on episodic memory to maintain context across async task segments |
| Human-in-the-Loop Agent | EAAPL-MAG003 | Peer | Human feedback events are written to episodic memory to improve future similar tasks |

17. Maturity Assessment

Overall Maturity: Proven

| Dimension | Score (1–5) | Evidence |
| --- | --- | --- |
| Adoption Breadth | 4 | Deployed in production by leading enterprises; managed services (Bedrock KB, Azure AI Search) widely adopted |
| Tooling Ecosystem | 4 | Mature vector stores; Mem0, LangChain memory; LlamaIndex retrieval; consolidation tooling still maturing |
| Privacy Compliance Tooling | 3 | SAR/RTE tooling requires custom implementation; vector-store-native selective delete inconsistently supported |
| Security Hardening | 3 | Memory injection attacks still emerging threat class; mitigations developing |
| Retrieval Quality | 4 | Reranking techniques mature; multi-tier retrieval well-understood; long-tail edge cases remain |
| Regulatory Clarity | 3 | GDPR/Privacy Act obligations clear in principle; technical implementation guidance still evolving |

18. Revision History

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0 | 2024-04-01 | Architecture Board | Initial publication |
| 1.1 | 2024-07-15 | Privacy Engineering | Added GDPR right-to-erasure vector store guidance; PII cascade purge logic |
| 2.0 | 2025-01-20 | Architecture Board | Major revision: four-tier model replacing two-tier; bias audit added to consolidation pipeline; EU AI Act Art. 10 mapping; full OWASP LLM table |