[EAAPL-AGT002] Stateful Agent Memory
Category: Agentic AI
Sub-category: Memory Architecture
Version: 2.0
Maturity: Proven
Tags: memory, episodic-memory, semantic-memory, vector-store, memory-consolidation, privacy, context-window
Regulatory Relevance: EU AI Act (Art. 10, 12), Privacy Act 1988, GDPR Art. 17, ISO 42001 §8.4, APRA CPS 234
1. Executive Summary
The Stateful Agent Memory Pattern defines the multi-tier memory architecture that allows AI agents to retain, recall, and apply knowledge across task invocations. Without persistent memory, agents are amnesia machines — each task starts from zero, ignoring valuable context accumulated from prior interactions and producing generic, contextually shallow outputs.
For CIO/CTO audiences: this pattern is the difference between an AI agent that improves over time and one that stays static. It enables agents to remember customer preferences, learn from past task outcomes, and build an evolving understanding of your organisation's domain — delivering compounding value rather than flat productivity gains. This is the architectural foundation for genuinely intelligent automation.
The pattern addresses four distinct memory types: in-context (what the agent knows right now), episodic (what happened in past sessions), semantic (domain knowledge crystallised from experience), and procedural (learned skills for specific task patterns). It governs how memories are created, consolidated, retrieved, and deleted, including the privacy-critical capability to export an individual's memories on a subject access request and purge them on a right-to-erasure request, a hard requirement under GDPR and the Australian Privacy Act. Correct implementation can deliver up to 35% improvement in task quality scores as agents build domain-specific knowledge while maintaining compliance with data residency and retention obligations.
2. Problem Statement
Business Problem
Organisations deploy AI agents that handle customer interactions, process documents, or manage workflows, but each invocation is stateless. An agent cannot remember that a specific customer dislikes a particular communication format, that a specific document type contains a recurring error pattern, or that a prior task failed because a tool returned stale data. This statelessness produces generic outputs and forces humans to repeat context on every interaction.
Technical Problem
LLM context windows are bounded (typically 8K–200K tokens, with some models offering more) and ephemeral. An agent cannot hold the full history of all past interactions in its context for each new task. Foundation models have no native mechanism for persistent, structured, queryable memory. Ad-hoc solutions, such as appending all history to every prompt, fail at scale: context becomes too large, costs explode, and the model's attention degrades over long contexts.
Symptoms of Absence
- Agents repeat the same mistakes across tasks without correcting
- Users must re-explain context on every new interaction
- Agent outputs show no improvement over time despite large interaction volumes
- Memory-related PII is retained indefinitely with no purge mechanism, creating GDPR/Privacy Act exposure
- High token costs because full conversation history is injected indiscriminately
Cost of Inaction
- Quality: Task quality remains static; no compounding improvement
- Cost: Full-history injection at scale consumes 3–10× more tokens than selective retrieval
- Risk: Uncontrolled memory retention creates undiscovered PII liability; right-to-erasure requests cannot be honoured
- Competitive: Agents without memory cannot personalise; competitors with memory deliver measurably better outcomes
3. Context
When to Apply
- Agents serving repeat users or recurring task types where prior context improves quality
- Agents that benefit from domain-specific learning accumulated over many task executions
- Production environments where context window costs at scale require optimisation
- Any agent handling personal data where right-to-erasure compliance is required
- Agents where task quality must demonstrably improve over time (subject to monitoring)
When NOT to Apply
- Single-use or one-shot agents with no recurring interaction pattern
- Tasks where using prior context creates unfair bias (e.g., re-using a prior unfavourable risk assessment on a new application)
- Environments where all data must be stateless by design (some secure/classified contexts)
- Agent tasks where every invocation must be fully independent for audit reasons
Prerequisites
- EAAPL-AGT001 (Single Agent Pattern) baseline implemented
- Vector store infrastructure provisioned and accessible from agent runtime
- PII detection service available (for memory ingestion filtering)
- Data classification and retention policy defined for memory content
- Subject access request (SAR) and right-to-erasure (RTE) operational process defined
Industry Applicability
| Industry | Use Case | Memory Type Priority | Risk Level |
|---|---|---|---|
| Financial Services | Relationship manager AI: recalls client preferences, prior advice context | Episodic + Semantic | High |
| Healthcare | Clinical AI: recalls patient-specific notes, medication history | Episodic + Semantic | Very High |
| Retail / E-commerce | Customer service AI: recalls purchase history, preferences | Episodic | Medium |
| Legal Services | Matter AI: recalls case-specific precedents, client instructions | Episodic + Procedural | High |
| HR / People Analytics | HR AI: recalls employee lifecycle context | Episodic | Very High |
| Software Engineering | Code review / generation AI: recalls codebase conventions, past review feedback | Semantic + Procedural | Medium |
4. Architecture Overview
The Stateful Agent Memory Pattern organises agent memory into four tiers, each with different temporal scope, retrieval semantics, and storage technology. The tiers are layered: in-context memory is fastest but smallest; procedural memory is slowest to change but most reusable.
Why four tiers rather than a single store?
Human cognitive science informs this design. Different types of knowledge have different update frequencies, retrieval patterns, and lifetime requirements. Forcing all memory into a single vector store creates retrieval conflicts (recent episodic memories crowd out stable semantic knowledge), cost inefficiency (re-embedding unchanged procedural knowledge), and privacy complexity (mixing PII-bearing episodic memories with PII-free domain knowledge complicates erasure). Giving each of the four tiers its own schema, retention policy, and retrieval strategy is not architecture astronautics; it is an engineering response to real retrieval quality problems observed at scale.
Tier 1: In-Context Memory (Working Memory)
The current context window is the agent's working memory — everything it knows for this iteration of this task. The Context Window Manager is responsible for actively managing what occupies this finite space. It implements a sliding window for conversation history (retaining the N most recent exchanges), importance-weighted retention (important tool results are kept longer than routine ones), and a hard token budget. The in-context memory is ephemeral; it does not persist beyond the current task invocation.
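The selection logic described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `ContextItem`, `estimate_tokens`, `select_context`, the `keep_recent` parameter, and the 4-characters-per-token estimate are all assumptions introduced here, and a real deployment would use the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    text: str
    importance: float  # 0.0 (routine) to 1.0 (critical); how it is scored is assumed
    turn: int          # position in the conversation, higher = more recent


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(text) // 4)


def select_context(items: list[ContextItem], budget: int,
                   keep_recent: int = 2) -> list[ContextItem]:
    """Sliding window first, then importance-weighted fill, under a hard budget."""
    by_recency = sorted(items, key=lambda i: i.turn, reverse=True)
    recent, older = by_recency[:keep_recent], by_recency[keep_recent:]
    # Older items compete on importance; ties go to the more recent item.
    candidates = recent + sorted(older, key=lambda i: (-i.importance, -i.turn))

    chosen, used = [], 0
    for item in candidates:
        cost = estimate_tokens(item.text)
        if used + cost <= budget:  # hard token budget: never exceeded
            chosen.append(item)
            used += cost
    # Restore chronological order for prompt assembly.
    return sorted(chosen, key=lambda i: i.turn)
```

The most recent turns are admitted before anything else, so a high-importance but old tool result can only displace other old material, never the live conversation.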
Tier 2: Episodic Memory (Autobiographical Store)
Episodic memory stores the record of what happened in past task invocations — a structured log of conversations, decisions made, outcomes observed, and feedback received. Each episodic record is timestamped, tagged with the task type, associated with an entity identifier (user ID, document ID, account ID), and stored as both a structured record (for exact lookups) and an embedding (for semantic similarity retrieval). Retrieval is via semantic search against the current task's context: the top-K most similar past episodes are retrieved and injected into the context.
Episodic memory is the most privacy-sensitive tier. It can contain PII, personal preferences, and sensitive interaction history. It must be linked to the data subject identifier and must support targeted purge operations. Retention policies must be defined per data sensitivity class.
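A minimal sketch of an episodic record and entity-scoped top-K retrieval, assuming an in-memory list in place of a real vector index. The `Episode` fields mirror the attributes listed above; the class name, function names, and the brute-force cosine ranking are illustrative assumptions rather than a prescribed schema.

```python
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Episode:
    record_id: str
    entity_id: str          # user, document, or account the episode relates to
    task_type: str
    summary: str
    embedding: list[float]  # embedding of the record summary
    created_at: datetime


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_episodes(store: list[Episode], query_embedding: list[float],
                   entity_id: str, k: int = 3) -> list[Episode]:
    """Filter by entity first (exact lookup), then rank by similarity."""
    scoped = [e for e in store if e.entity_id == entity_id]
    ranked = sorted(scoped, key=lambda e: cosine(e.embedding, query_embedding),
                    reverse=True)
    return ranked[:k]
```

Filtering on `entity_id` before ranking keeps another subject's episodes out of the candidate set entirely, which is also what makes per-subject purge straightforward.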
Tier 3: Semantic Memory (Knowledge Store)
Semantic memory stores domain knowledge that has been crystallised from episodic experience. Unlike episodic memories (specific events), semantic memories are generalisations: "this customer segment prefers formal communication," "this document type has a recurring pattern in Section 3.2," "this API call fails under these conditions." Semantic memories are created by the Memory Consolidation Engine, which periodically processes episodic memories and extracts stable patterns.
Semantic memory is typically PII-free (personal identifiers are stripped during consolidation), enabling it to be shared across agents and retained longer. Retrieval is via dense vector search, with results ranked by relevance score and recency weight.
Tier 4: Procedural Memory (Skill Library)
Procedural memory stores learned task execution strategies — sequences of tool calls and decision logic that have been validated as effective for specific task types. These are analogous to stored procedures: when the agent encounters a task that matches a known procedural template, it can retrieve the template and adapt it rather than planning from scratch. Procedural memories are versioned, have explicit success rate metadata (how often this procedure achieved the task goal), and are retired when their success rate drops below a threshold.
Memory Consolidation
The Memory Consolidation Engine runs asynchronously after task completion. It processes the episodic record of the just-completed task, extracts semantic learnings, and updates the semantic and procedural stores. Consolidation is governed by a quality threshold — only memories that meet a minimum relevance and novelty score are written. Duplicate detection prevents redundant knowledge accumulation. The consolidation pipeline includes PII stripping before writing to semantic memory.
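A minimal sketch of the quality and novelty gate, assuming candidates arrive as `(text, embedding, quality)` tuples that have already been PII-stripped. The thresholds, the function names, and the definition of novelty as one minus the highest cosine similarity to any stored memory are assumptions for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(candidates, store, min_quality=0.5, min_novelty=0.1):
    """Write only candidates that clear both the quality and novelty thresholds.

    candidates: list of (text, embedding, quality) tuples
    store:      list of (text, embedding) tuples, mutated in place
    """
    written = []
    for text, embedding, quality in candidates:
        if quality < min_quality:
            continue  # below the relevance/quality bar
        nearest = max((cosine(embedding, e) for _, e in store), default=0.0)
        novelty = 1.0 - nearest
        if novelty < min_novelty:
            continue  # near-duplicate of something already stored
        store.append((text, embedding))
        written.append(text)
    return written
```

Because the store is updated inside the loop, near-duplicates within the same batch are also rejected, not only duplicates of previously stored memories.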
Retrieval Orchestration
The Memory Retrieval Orchestrator executes retrieval across all relevant tiers in parallel at the start of each task. It ranks results from each tier, applies a cross-tier deduplication pass, and assembles the final memory injection within the context window budget. The orchestrator tracks retrieval quality metrics (retrieval precision proxied by subsequent task quality) to continuously tune retrieval hyperparameters (K, similarity threshold, recency weight).
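The orchestration steps above (parallel fan-out, ranking, deduplication, budgeted assembly) can be sketched with the standard library. The tier callables, the hit dictionary shape (`text`, `score`, `tokens`), and deduplication by normalised text are assumptions; a production system would more likely deduplicate by embedding similarity.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_all(tiers, query, token_budget, timeout_s=2.0):
    """Query every tier in parallel, then rank, deduplicate, and fit the budget.

    tiers: dict of tier name -> callable(query) returning a list of
           {"text": str, "score": float, "tokens": int} dicts (assumed shape)
    """
    results = []
    with ThreadPoolExecutor(max_workers=len(tiers)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in tiers.items()}
        for name, future in futures.items():
            try:
                hits = future.result(timeout=timeout_s)
            except Exception:
                hits = []  # a failing tier degrades retrieval, not the task
            results.extend({**hit, "tier": name} for hit in hits)

    results.sort(key=lambda hit: hit["score"], reverse=True)

    seen, selected, used = set(), [], 0
    for hit in results:
        key = " ".join(hit["text"].lower().split())  # normalise case and spacing
        if key in seen:
            continue  # cross-tier duplicate
        if used + hit["tokens"] > token_budget:
            continue  # would exceed the context window budget
        seen.add(key)
        selected.append(hit)
        used += hit["tokens"]
    return selected
```

Note that the timeout only bounds how long the orchestrator waits for each result; the executor still joins its worker threads on exit, so tier clients should enforce their own request timeouts.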
5. Architecture Diagram
flowchart TD
subgraph Input["Task Input"]
A[Task Request]
B[Context Window Manager]
end
subgraph MemoryTiers["Memory Tiers"]
C[(Episodic Store)]
D[(Semantic Store)]
E[(Procedural Library)]
end
subgraph Processing["Consolidation and Privacy"]
F[Retrieval Orchestrator]
G[Consolidation Engine]
H[PII Detector]
end
A --> B
B --> F
F -->|semantic search| C
F -->|vector search| D
F -->|pattern lookup| E
F -->|ranked context| B
B -->|bounded context| A
A -->|task result| G
G --> H
H -->|episodic write| C
H -->|PII-free write| D
G -->|skill pattern| E
style A fill:#dbeafe,stroke:#3b82f6
style B fill:#f0fdf4,stroke:#22c55e
style C fill:#fef9c3,stroke:#eab308
style D fill:#fef9c3,stroke:#eab308
style E fill:#fef9c3,stroke:#eab308
style F fill:#f0fdf4,stroke:#22c55e
style G fill:#f0fdf4,stroke:#22c55e
style H fill:#f3e8ff,stroke:#a855f7
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| Context Window Manager | Orchestration | Manages token budget; implements sliding window and importance-weighted retention | Custom, LangChain memory manager, Mem0 | Critical |
| Memory Retrieval Orchestrator | Orchestration | Parallel retrieval across tiers; ranking; deduplication; context assembly | Custom Python/TS, LlamaIndex retrieval pipeline | Critical |
| Episodic Store | Persistence (Structured + Vector) | Stores full interaction records with embeddings; supports exact and semantic retrieval; per-subject purge | PostgreSQL + pgvector, MongoDB Atlas Vector Search, Supabase | High |
| Semantic Store | Vector Store | Stores domain knowledge embeddings; dense vector search; long retention | Pinecone, Weaviate, Azure AI Search, Qdrant | High |
| Procedural Skill Library | Knowledge Base | Stores versioned task execution templates with success metadata | Redis JSON, PostgreSQL, object store | Medium |
| Memory Consolidation Engine | Background Processor | Extracts semantic learnings from episodic records; strips PII; writes to semantic store | Custom async worker, Celery, Azure Durable Functions | High |
| PII Detector / Masker | Privacy | Detects and masks PII before memory writes | Amazon Comprehend, Azure AI Language, Microsoft Presidio | Critical |
| Retrieval Ranker | ML Component | Scores and ranks retrieved memories by relevance + recency; deduplicates | Custom scoring function, LLM-as-reranker, dedicated reranking model (e.g., Cohere Rerank) | High |
| Data Classification Enforcer | Security/Governance | Enforces classification-based access controls on memory read/write | OPA, custom policy middleware | High |
| Subject Access Request Handler | Privacy Compliance | Generates complete record of all memory held for a given data subject | Custom API, integrated with identity store | Critical |
| Right-to-Erasure Engine | Privacy Compliance | Purges all episodic records for a subject; cascades to derived semantic memories; logs erasure | Custom async process with cascading delete logic | Critical |
| Retention Policy Scheduler | Operations | Enforces time-based retention policies per memory classification | APScheduler, AWS EventBridge, Azure Logic Apps | High |
| Memory Audit Log | Compliance | Immutable log of all memory reads, writes, purges, and consolidations | Append-only store (CloudTrail, Kafka, S3 WORM) | Critical |
7. Data Flow
Primary Write Flow (Memory Creation)
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Agent (Post-Task) | Submits completed task record: task_id, user_id, conversation_turns, tool_calls, outcome, feedback | Raw task record |
| 2 | PII Detector | Scans raw record; detects PII entities (names, IDs, financial data); applies masking rules per classification policy | PII-annotated record; masked episodic record |
| 3 | Episodic Store | Writes masked episodic record with full metadata; generates embedding of record summary | Episodic record ID; embedding written to vector index |
| 4 | Memory Audit Log | Logs write event: record_id, agent_id, user_id (hashed), classification, timestamp | Audit entry |
| 5 | Memory Consolidation Engine | Asynchronously processes episodic record; identifies stable patterns and novel knowledge | Candidate semantic memories; candidate procedural patterns |
| 6 | PII Detector (second pass) | Strips any residual PII from candidate semantic memories | PII-free semantic candidates |
| 7 | Duplicate Detector | Compares candidates against existing semantic store; calculates novelty score | Deduped candidates with novelty scores |
| 8 | Semantic Store | Writes new/updated semantic memories above novelty threshold | Semantic record IDs; updated vector index |
| 9 | Procedural Library | Writes new procedural templates with initial success rate metadata | Skill record with version |
Primary Read Flow (Memory Retrieval)
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Memory Retrieval Orchestrator | Receives task context embedding + entity identifiers | Retrieval request |
| 2 | Episodic Store | Semantic search for top-K episodes by similarity to task context, filtered by entity ID | K episodic records |
| 3 | Semantic Store | Dense vector search for top-K domain knowledge chunks by similarity | K semantic records |
| 4 | Procedural Library | Pattern lookup by task type classification | Matching skill templates |
| 5 | Retrieval Ranker | Scores all retrieved items: relevance × recency × quality; deduplicates | Ranked, deduplicated memory list |
| 6 | Context Window Manager | Selects top items within token budget; formats for injection | Memory-enriched context block |
| 7 | Memory Audit Log | Logs retrieval event: items retrieved, agent_id, task_id, scores | Audit entry |
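The ranking step (relevance × recency × quality) can be made concrete with a short sketch. The source does not specify how the recency factor is computed; the exponential half-life decay, the 30-day half-life, the 180-day staleness threshold, and the function names below are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def recency_weight(created_at: datetime, now: datetime,
                   half_life_days: float = 30.0) -> float:
    """Exponential decay: the weight halves every `half_life_days`."""
    age_days = max(0.0, (now - created_at).total_seconds() / 86400)
    return 0.5 ** (age_days / half_life_days)


def score(relevance: float, quality: float, created_at: datetime,
          now: datetime, half_life_days: float = 30.0) -> float:
    """Combined ranking score: relevance x recency x quality."""
    return relevance * recency_weight(created_at, now, half_life_days) * quality


def is_stale(created_at: datetime, now: datetime, max_age_days: int = 180) -> bool:
    """Flag memories older than a configurable threshold."""
    return (now - created_at) > timedelta(days=max_age_days)
```

With a multiplicative score, a highly relevant memory that is three half-lives old keeps only one eighth of its relevance, so a fresher but less similar memory can outrank it. Whether that is desirable depends on the task type, which is why the half-life is best tuned per task.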
Error Flow
| Error Condition | Detection | Recovery |
|---|---|---|
| Vector store unavailable | Health check on retrieval orchestrator startup | Degrade to in-context memory only; alert; log degraded mode |
| PII detection service unavailable | Service health check before write | Block memory write; queue for retry; do not write unscanned memory |
| Erasure request fails partially | RTE engine transaction log | Retry failed deletes; flag record for manual review; report status to SAR handler |
| Consolidation writes duplicate | Duplicate detector | Idempotent upsert with cosine similarity gate; no duplicate written |
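Two of the recovery behaviours above, failing closed on PII scanning and degrading to in-context memory, can be sketched as follows. The exception class, function names, and the use of plain lists for the store and retry queue are hypothetical stand-ins for real services.

```python
class PiiServiceUnavailable(Exception):
    """Raised by the (hypothetical) PII scanning client when it cannot respond."""


def write_memory(record, pii_scan, store, retry_queue):
    """Fail closed: an unscanned record is queued for retry, never written."""
    try:
        masked = pii_scan(record)
    except PiiServiceUnavailable:
        retry_queue.append(record)
        return "queued"
    store.append(masked)
    return "written"


def retrieve_with_fallback(vector_search, query):
    """Degrade to in-context memory only when the vector store is unreachable."""
    try:
        return vector_search(query), "normal"
    except ConnectionError:
        # Empty result: the agent proceeds with its working memory alone.
        return [], "degraded"
```

The retry queue holds unmasked records, so in practice it needs the same encryption and access controls as the episodic store. The returned mode string is what a caller would log and alert on.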
8. Security Considerations
Data Classification
- Episodic memories are classified at ingestion based on content scan; classification label governs retention period, access scope, and erasure priority
- Semantic memories inherit the classification of their source episodes, downgraded if PII has been confirmed stripped
- Cross-tenant memory isolation: episodic and semantic stores are partitioned by tenant_id, and every query is filtered by tenant scope, so cross-tenant retrieval is blocked by design
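One way to make tenant scoping hard to bypass is to expose the store only through a wrapper whose query method requires a tenant identifier. The sketch below is an assumption about how that could look; `TenantScopedStore` and its record shape are hypothetical, and a real deployment would also enforce the partition at the database layer (for example with per-tenant indexes or row-level security).

```python
class TenantScopedStore:
    """Wraps a record list so that no query can run without a tenant scope."""

    def __init__(self, records):
        self._records = records  # each record is a dict with a "tenant_id" key

    def query(self, tenant_id, predicate=lambda record: True):
        if not tenant_id:
            # Refuse unscoped queries instead of defaulting to "all tenants".
            raise ValueError("tenant_id is required for every memory query")
        return [
            r for r in self._records
            if r["tenant_id"] == tenant_id and predicate(r)
        ]
```

Callers never see the underlying collection, so a forgotten filter becomes an error at the call site rather than a silent cross-tenant read.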
Encryption
- Episodic store encrypted at rest with CMK; CMK is per-tenant for strong isolation
- Semantic store encrypted at rest; embeddings are opaque binary representations but may leak information through proximity queries — store-level encryption is the minimum; consider homomorphic retrieval for highest-sensitivity deployments
- All memory tier connections use TLS 1.3; mTLS within service mesh
Auditability
- Every memory read and write event is logged in the immutable audit log with: agent_id, task_id, user_id (hashed), record_id, operation, classification, timestamp
- Erasure events produce a certificate: subject_id, erasure timestamp, records deleted count, derived semantic records affected, executor identity
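A sketch of an audit entry carrying the fields listed above. Two details are assumptions rather than requirements from this pattern: the user identifier is hashed with a keyed HMAC (so it cannot be reversed by brute-forcing known IDs without the key), and each entry embeds the hash of the previous entry so that tampering is detectable even before the entry reaches append-only storage.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def hash_user_id(user_id: str, key: bytes) -> str:
    """Keyed hash of the user identifier (pseudonymisation, not anonymisation)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def audit_entry(operation, agent_id, task_id, user_id, record_id,
                classification, key, prev_hash=""):
    """Build one audit entry, chained to the previous entry's hash."""
    entry = {
        "operation": operation,  # e.g. "read", "write", "purge"
        "agent_id": agent_id,
        "task_id": task_id,
        "user_id_hash": hash_user_id(user_id, key),
        "record_id": record_id,
        "classification": classification,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["entry_hash"] = hashlib.sha256(payload).hexdigest()
    return entry
```

The HMAC key must be managed like any other secret. If it is destroyed, the hashed identifiers can no longer be linked back to a subject, which also affects the ability to answer subject access requests from the log.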
OWASP LLM Top 10 — Memory-Specific
| OWASP LLM Risk | Memory-Specific Applicability | Mitigation |
|---|---|---|
| LLM01 Prompt Injection | Memory content could carry injected instructions from a malicious prior interaction | Content validation on all memory content before injection into context; anomaly detection on semantic shifts in retrieved memory |
| LLM06 Sensitive Information Disclosure | Episodic memories contain PII that could leak to a different user via cross-tenant retrieval or incorrect retrieval | Strict tenant partitioning; PII masking; retrieval audit with anomaly detection; output filtering |
| LLM08 Excessive Agency | Procedural memories could encode overly broad action patterns from prior high-permission tasks | Procedural library requires explicit approval for patterns involving irreversible actions; success rate decay mechanism retires stale patterns |
| LLM09 Overreliance | Agent may over-weight retrieved memories without considering staleness | Recency scoring; explicit staleness flag on retrieved memories older than configurable threshold; agent prompted to question stale memories |
| LLM10 Model Theft | Semantic memory store contains valuable domain knowledge extracted from customer interactions | Store behind private networking; strict IAM; export controls on vector store API |
9. Governance Considerations
Responsible AI — Memory-Specific
- Agents must not use episodic memory to make decisions about individuals without disclosure (transparency principle)
- Memory content must not encode or amplify bias; the consolidation engine must include bias audit on extracted semantic memories
- Memory provenance must be traceable: every semantic memory records which episodic records contributed to it
GDPR / Privacy Act Compliance
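- Data Subject Access Requests: The SAR Handler must produce a complete, human-readable export of all episodic memory records associated with a subject identifier within one month of receipt under GDPR (Art. 12(3), extendable by up to two further months for complex or numerous requests); under the Australian Privacy Act (APP 12), agencies must respond within 30 days and organisations within a reasonable period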
- Right to Erasure: The RTE Engine must delete all episodic records for a subject AND identify and purge or anonymise any semantic memories derived from those episodes; cascading delete logic must be tested in the erasure runbook
- Data Minimisation: Memory consolidation must only retain what is necessary for future task quality; retention policies must enforce deletion of episodic records beyond the retention window
- Purpose Limitation: Memory retrieved for one task type must not be used for unrelated task types without explicit permission scoping
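The cascading erasure requirement depends on the provenance links between semantic and episodic records. The sketch below assumes in-memory dictionaries and one particular cascade policy, both hypothetical: a semantic memory derived only from the erased subject's episodes is purged, while one with other contributing sources keeps those sources, drops the erased provenance links, and is flagged for review. The certificate fields follow the Auditability section.

```python
from datetime import datetime, timezone


def erase_subject(subject_id, episodic, semantic, executor):
    """Purge a subject's episodes, cascade to derived semantic memories.

    episodic: dict record_id -> {"subject_id": str, ...}
    semantic: dict memory_id -> {"sources": set of episodic record_ids, ...}
    Returns an erasure certificate as a dict.
    """
    doomed = {rid for rid, rec in episodic.items()
              if rec["subject_id"] == subject_id}
    for rid in doomed:
        del episodic[rid]

    affected = []
    for mid in list(semantic):
        sources = semantic[mid]["sources"]
        if not sources & doomed:
            continue
        affected.append(mid)
        remaining = sources - doomed
        if remaining:
            # Assumed policy: keep, but unlink and flag for re-consolidation review.
            semantic[mid]["sources"] = remaining
            semantic[mid]["needs_review"] = True
        else:
            del semantic[mid]  # derived solely from the erased subject

    return {
        "subject_id": subject_id,
        "erased_at": datetime.now(timezone.utc).isoformat(),
        "records_deleted": len(doomed),
        "semantic_records_affected": sorted(affected),
        "executor": executor,
    }
```

Whether a multi-source semantic memory may be retained after unlinking is a legal judgement, not a technical one, so the retain-and-review branch should be confirmed with the Privacy Officer before use.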
Governance Artefacts
| Artefact | Owner | Frequency | Purpose |
|---|---|---|---|
| Memory Classification Register | Data Governance | On each new task type | Documents classification of memory content per task type |
| Retention Policy Schedule | Legal + Data Governance | Annually | Defines retention periods per classification class |
| SAR/RTE Runbook | Privacy Officer | On process change | Step-by-step erasure and export procedure with test evidence |
| Memory Quality Report | ML Engineering | Monthly | Retrieval precision metrics; memory freshness scores; consolidation quality |
| Erasure Certificate Archive | Privacy Officer | Per erasure | Immutable record of completed erasure operations |
| Bias Audit Report | AI Ethics Team | Quarterly | Audit of semantic memories for demographic bias, unfair generalisation |
10. Operational Considerations
Monitoring
- Retrieval quality metric: relevance feedback score (agent explicitly rates retrieved memories as useful/not useful on task completion)
- Memory store sizes and growth rate per tier — alert when semantic store exceeds defined size ceiling (triggers consolidation review)
- PII detection false positive/negative rate — monitored on sample; false negatives mean PII entering semantic store
SLOs
| SLO | Target | Window | Alert |
|---|---|---|---|
| Episodic retrieval latency | ≤ 150ms p95 | 1-hour rolling | > 300ms triggers P2 |
| Semantic retrieval latency | ≤ 200ms p95 | 1-hour rolling | > 400ms triggers P2 |
| PII detection throughput | ≥ 100 records/sec | Per batch | Backpressure alert at < 50/sec |
| Erasure completion time | ≤ 24 hours from request | Per request | > 48 hours triggers P1 escalation |
| Memory audit log availability | 99.99% | Monthly | Any gap triggers P0 |
Incident Response
| Incident Type | Detection | Response | Escalation |
|---|---|---|---|
| PII found in semantic store | Automated scan or audit report | Immediate quarantine of affected records; retrace consolidation pipeline; purge confirmed PII | Privacy Officer + CISO within 4 hours; regulator notification if material |
| Cross-tenant retrieval detected | Retrieval audit anomaly detection | Immediate suspension of affected agent; investigation of query filter logic | P0 incident; CISO + Legal |
| SAR/RTE not fulfilled within deadline | SAR tracking system | Escalate to Privacy Officer; manual intervention in erasure pipeline | Privacy Officer; potential regulator notification |
Capacity
- Vector store index size grows as O(n × d) where n = records and d = embedding dimensions; plan for 5× current index size in provisioned capacity to support query performance
- Consolidation engine is CPU/memory intensive during batch processing; schedule during off-peak hours
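As a worked example of the O(n × d) estimate, the sketch below computes the raw vector payload and the provisioned target with the 5× headroom mentioned above. The 4-bytes-per-value figure assumes float32 embeddings, and the 10 million records and 1,536 dimensions in the usage note are illustrative inputs, not figures from this pattern. The estimate excludes ANN graph structures and metadata, which add further overhead.

```python
def raw_index_bytes(n_records: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw vector payload: n records x d dimensions x bytes per value."""
    return n_records * dims * bytes_per_value


def provisioned_bytes(n_records: int, dims: int, headroom: float = 5.0) -> float:
    """Capacity to provision, applying the planning headroom factor."""
    return raw_index_bytes(n_records, dims) * headroom


def to_gb(n_bytes: float) -> float:
    """Decimal gigabytes (10^9 bytes)."""
    return n_bytes / 1e9
```

For 10 million records at 1,536 dimensions, the raw payload is 10,000,000 × 1,536 × 4 = 61.44 GB, so the 5× rule suggests provisioning roughly 307 GB.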
11. Cost Considerations
Cost Drivers
| Cost Driver | Description | Control Lever |
|---|---|---|
| Embedding API | Embedding calls for each memory write and retrieval query | Embedding cache for repeated queries; batch writes; self-hosted embedding model |
| Vector store operations | Index writes and ANN query costs (cloud-managed vector DBs charge per operation) | Consolidated batch writes; query result caching for common patterns |
| PII detection API | Per-record PII scanning on all writes | Self-hosted Presidio; sampling for low-risk task types |
| Memory consolidation compute | Async LLM calls for pattern extraction | Smaller model for consolidation; batching; scheduled off-peak runs |
| Episodic store storage | Long-term storage grows with interaction volume | Retention policies; tiered storage (hot/warm/cold); compression |
Indicative Cost Range (USD per 1M memory operations, except where a row states a different unit)
| Operation | Cloud-Managed Vector DB | Self-Hosted Qdrant/Weaviate | Delta |
|---|---|---|---|
| Write (embed + index) | ~$3–8 | ~$0.50–1.50 | 4–8× cloud premium |
| Read (query + retrieve) | ~$1–4 | ~$0.20–0.80 | 3–6× cloud premium |
| PII detection (per record) | ~$0.001–0.005 | ~$0.0001 (Presidio) | 10–50× cloud premium |
12. Trade-Off Analysis
Memory Architecture Options
| Option | Description | Pros | Cons | Best For |
|---|---|---|---|---|
| A: Four-Tier (Recommended) | Separate in-context, episodic, semantic, and procedural stores | Best retrieval quality; clear retention policies; privacy-compliant erasure | Higher operational complexity; 4 stores to manage and monitor | Production agents with recurring users and compliance requirements |
| B: Single Vector Store | All memory in one vector DB with metadata tags for tier differentiation | Simpler infrastructure; single query interface | PII and non-PII mixed; erasure is complex; retrieval quality degrades as store grows | Proof-of-concept or simple single-user agents |
| C: In-Context Only (No Persistence) | No persistent memory; rely solely on context window | Zero privacy risk; maximum simplicity; lowest cost | No learning; no personalisation; full history injection is expensive | Stateless, compliance-sensitive, single-use agents |
| D: External Memory Service (e.g., Mem0) | Managed memory service handles all tiers | Fastest time-to-value; reduces engineering burden | Vendor lock-in; data sovereignty concerns; limited customisation of consolidation logic | Startups or teams with limited ML/infra expertise |
Architectural Tensions
| Tension | Left Pole | Right Pole | Balance |
|---|---|---|---|
| Memory richness vs. Privacy | Maximum memory retention for best agent quality | Minimum retention to reduce privacy risk | Tiered retention with explicit policies; PII-stripped semantic memories retained longer than PII-bearing episodic |
| Retrieval recall vs. Precision | Retrieve broad context (high recall) | Retrieve only highly relevant context (high precision) | Tuned K and similarity threshold per task type; reranker to boost precision |
| Real-time consolidation vs. Task latency | Consolidate synchronously; memory up-to-date immediately | Async consolidation; task completes faster | Async consolidation as the standard; sync consolidation only for high-priority learning scenarios |
13. Failure Modes
| Failure Mode | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| Memory injection attack (poisoned episodic record) | Medium | High: agent acts on attacker-injected false memory | Content validation on retrieval; anomaly scoring on retrieved memory vs. task context | Quarantine suspicious memory; re-run task without memory; alert security |
| Stale memory degrades quality | High | Medium: agent uses outdated patterns | Recency scoring; explicit staleness alerts; retrieval quality feedback loop | Retire memories below freshness threshold; force re-consolidation |
| PII leakage to semantic store | Medium | Critical: privacy violation | Automated PII scan on semantic store; periodic audit | Immediate quarantine; purge; Privacy Officer notification |
| Memory store unavailability | Low | High: agents lose all memory context | Health check monitoring; circuit breaker | Degrade to in-context only; restore from backup; alert |
| Erasure failure for an RTE request | Low | Critical: regulatory non-compliance | SAR tracking system deadline monitoring | Manual erasure procedure; Privacy Officer + Legal involvement |
| Runaway memory growth | Medium | Medium: cost overrun; performance degradation | Memory store size monitoring; growth rate alerts | Retention policy enforcement; emergency pruning |
14. Regulatory Considerations
GDPR / Privacy Act
- Episodic store is a personal data store; all GDPR/Privacy Act obligations apply: purpose limitation, data minimisation, retention limits, security, right of access, right to erasure
- Right to erasure is technically complex for vector stores: embeddings cannot be selectively deleted in some managed services — select a vector store that supports metadata-filtered deletion or record-level deletion
- Cross-border transfers of episodic memories containing EU/Australian personal data must comply with transfer mechanisms (SCCs, IDTA, adequacy decisions)
EU AI Act
- Art. 10 (Data Governance): the four-tier memory architecture with PII filtering and classification supports the data governance requirements for high-risk AI systems, but does not by itself establish compliance
- Art. 12 (Record Keeping): the memory audit log and task execution traces support the record-keeping requirement
- If the memory system supports a high-risk AI use case, the episodic data processed by the consolidation pipeline (which populates the semantic store) may be treated as training data and would then require bias assessment
ISO 42001
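- §8.4 (AI System Impact Assessment): memory consolidation changes what the agent knows over time, so it is part of the learning lifecycle; it must be covered by the impact assessment and governed under the AI system life cycle controls (Annex A.6)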
- §6.1 (Risk Assessment): memory-specific risks (injection, leakage, staleness) must be identified and treated in the AI risk assessment
APRA CPS 234
- Memory stores containing customer data are information assets under CPS 234; information security capability requirements apply
- CMK encryption and strict access controls on the episodic store support the information asset protection requirement
15. Reference Implementations
AWS
| Component | AWS Service |
|---|---|
| Episodic Store | Amazon Aurora PostgreSQL + pgvector; or Amazon OpenSearch k-NN |
| Semantic Store | Amazon OpenSearch Service (vector engine); or Amazon Bedrock Knowledge Bases |
| Procedural Library | Amazon DynamoDB (JSON document store) |
| PII Detection | Amazon Comprehend (entity detection + PII classification) |
| Consolidation Engine | AWS Lambda + Amazon SQS (async trigger on task completion) |
| Memory Audit Log | Amazon S3 (WORM) + AWS CloudTrail |
| SAR/RTE Handler | Custom Lambda with Aurora queries + OpenSearch delete-by-query |
Azure
| Component | Azure Service |
|---|---|
| Episodic Store | Azure Cosmos DB for PostgreSQL + pgvector |
| Semantic Store | Azure AI Search (vector + hybrid search) |
| PII Detection | Azure AI Language (PII detection) |
| Consolidation Engine | Azure Functions (event-triggered) + Azure Service Bus |
| Memory Audit Log | Azure Immutable Blob Storage + Azure Monitor |
GCP
| Component | GCP Service |
|---|---|
| Episodic Store | Cloud Spanner (OLTP) + Vertex AI Vector Search (formerly Matching Engine) |
| Semantic Store | Vertex AI Vector Search |
| PII Detection | Cloud DLP (Data Loss Prevention) |
| Consolidation Engine | Cloud Run (triggered by Pub/Sub) |
On-Premises
| Component | Technology |
|---|---|
| Episodic + Semantic Store | Weaviate or Qdrant on Kubernetes |
| PII Detection | Microsoft Presidio (open-source) |
| Consolidation Engine | Celery + Redis on Kubernetes |
| Audit Log | Apache Kafka + MinIO (S3-compatible WORM) |
16. Related Patterns
| Pattern | ID | Relationship Type | Notes |
|---|---|---|---|
| Single Agent Pattern | EAAPL-AGT001 | Extended By | AGT001 defines the agent loop; this pattern provides the memory architecture it depends on |
| Agent Tool Registry | EAAPL-AGT003 | Peer | Procedural memory stores tool call patterns; interacts with registry for tool resolution |
| Agent Checkpoint and Recovery | EAAPL-AGT005 | Peer | Checkpoint state includes references to memory records; recovery restores memory context |
| Reflexive Agent | EAAPL-AGT006 | Extends | Reflection outputs may be written to episodic and semantic memory as quality learnings |
| Long-Running Agent | EAAPL-AGT007 | Depends On | Long-running agents rely on episodic memory to maintain context across async task segments |
| Human-in-the-Loop Agent | EAAPL-MAG003 | Peer | Human feedback events are written to episodic memory to improve future similar tasks |
17. Maturity Assessment
Overall Maturity: Proven
| Dimension | Score (1–5) | Evidence |
|---|---|---|
| Adoption Breadth | 4 | Deployed in production by leading enterprises; managed services (Bedrock KB, Azure AI Search) widely adopted |
| Tooling Ecosystem | 4 | Mature vector stores; Mem0, LangChain memory; LlamaIndex retrieval; consolidation tooling still maturing |
| Privacy Compliance Tooling | 3 | SAR/RTE tooling requires custom implementation; vector-store-native selective delete inconsistently supported |
| Security Hardening | 3 | Memory injection attacks still emerging threat class; mitigations developing |
| Retrieval Quality | 4 | Reranking techniques mature; multi-tier retrieval well-understood; long-tail edge cases remain |
| Regulatory Clarity | 3 | GDPR/Privacy Act obligations clear in principle; technical implementation guidance still evolving |
18. Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2024-04-01 | Architecture Board | Initial publication |
| 1.1 | 2024-07-15 | Privacy Engineering | Added GDPR right-to-erasure vector store guidance; PII cascade purge logic |
| 2.0 | 2025-01-20 | Architecture Board | Major revision: four-tier model replacing two-tier; bias audit added to consolidation pipeline; EU AI Act Art. 10 mapping; full OWASP LLM table |