Feature Store Integration

[EAAPL-PLT009] Feature Store Integration

Category: Platform Engineering
Sub-category: ML Infrastructure / Data Engineering
Version: 1.1
Maturity: Proven
Tags: feature-store, feature-serving, online-inference, offline-training, feature-pipeline, point-in-time, feature-monitoring, training-serving-skew
Regulatory Relevance: EU AI Act Article 10 (Data Governance), ISO 42001 Clause 6, NIST AI RMF MAP 3.5

1. Executive Summary

Feature stores solve a deceptively simple problem: when an ML model needs a feature during inference, how does it get the right value, freshly computed, at low latency? And when training a new model, how does it get the exact same feature values that would have been available at prediction time in the past—preventing the data leakage that invalidates backtests and production evaluations?

The Feature Store Integration pattern establishes a shared infrastructure layer that decouples feature computation from feature consumption, enabling features to be computed once and reused across models, teams, and use cases. The online store serves low-latency feature retrieval for real-time inference; the offline store enables point-in-time correct training data generation. Feature pipelines manage computation and freshness; feature monitoring detects drift that would degrade model performance before it reaches users. For enterprises with multiple ML models consuming overlapping signals, the feature store is the difference between duplicated, inconsistent feature computation and a shared, governed, quality-assured data layer.

2. Problem Statement

Business Problem

Multiple ML models within the same organisation compute the same features independently, consuming redundant engineering effort and producing inconsistent values (e.g., "30-day spend" computed differently for fraud, recommendation, and credit risk models). Business decisions made on these models are implicitly inconsistent. When models are retrained, the historical features used for training may not match what would have been available at prediction time, leading to overoptimistic evaluation metrics and production performance gaps.

Technical Problem

Online inference requires feature values available in <10ms at the model API boundary; this requires a pre-computed, low-latency store. Training requires point-in-time correct historical feature values to avoid look-ahead bias. Without a feature store, teams either accept this bias or build expensive, fragile point-in-time joins from raw data. Feature pipelines are duplicated across teams with no shared infrastructure.

Symptoms

Same feature (e.g., customer 30-day transaction count) computed differently in 3 different model codebases
Production model performance consistently below offline evaluation metrics (training-serving skew)
Feature pipeline failures causing model inference to serve stale or missing features
No visibility into when a feature was last updated or what its current distribution is
Training datasets built from current feature values rather than the values available at the historical prediction time

Cost of Inaction

Training-serving skew causing production models to underperform offline evaluation by 5–20%
30–50% of ML engineering time spent on feature engineering that duplicates existing work
Model regressions caused by feature drift that goes undetected for weeks
Regulatory audits unable to reproduce model predictions due to no record of feature values at decision time

3. Context

When to Apply

Organisation has ≥2 ML models sharing overlapping input features
Real-time inference latency requirements (<50ms) demand pre-computed feature values
Training pipelines require point-in-time correct historical data
Feature reuse across teams is a stated engineering goal
Model performance monitoring requires feature drift detection

When NOT to Apply

Single simple model with unique features: feature store overhead not warranted
LLM-only organisation with no traditional ML models: most LLM use cases don't benefit from traditional feature stores (embeddings have their own infrastructure path)
Research experiments: use pandas and raw data; migrate to feature store when productionising

Prerequisites

Operational data sources (databases, event streams) producing features
Feature computation infrastructure (Spark, Flink, or dbt for offline; streaming processor for online)
Online store infrastructure (Redis or equivalent <10ms lookup)
Offline store infrastructure (data warehouse or object storage for point-in-time joins)
ML model serving infrastructure that can retrieve features at inference time

Industry Applicability

Industry	Applicability	Key Use Case
Financial Services	Very High	Credit risk, fraud detection, CLV, trading signals
E-commerce / Retail	Very High	Personalisation, recommendation, dynamic pricing
Technology / SaaS	High	User behaviour, churn prediction, abuse detection
Healthcare	High	Risk stratification, readmission prediction
Telecommunications	High	Churn, network anomaly, usage prediction
Media / Streaming	High	Content recommendation, engagement prediction

4. Architecture Overview

The feature store architecture is defined by the separation between its online and offline paths, each serving a different consumer with different latency and freshness characteristics.

The Online Store is a low-latency key-value store containing pre-computed feature values, indexed by entity key (e.g., customer_id, product_id, session_id). Lookup latency must be <10ms at P99 to be compatible with real-time inference SLAs. The online store is populated by the feature materialisation pipeline, which computes features from source data and writes them on a schedule (for batch features) or in near-real-time (for streaming features). Redis is the canonical technology for the online store; its GET operation with a compound key (entity_type:entity_id:feature_set) delivers sub-millisecond lookup at scale.

The online store does not store feature history—only the current value for each entity. This makes it fast and cheap. When a model is called for inference, the feature serving layer assembles the feature vector by looking up all required features for the request's entity IDs from the online store, combining them with request-time context (features that cannot be pre-computed because they depend on the current request), and passing the assembled feature vector to the model.
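The lookup-and-assemble step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: a plain dict stands in for Redis, and the names `ONLINE_STORE`, `compound_key`, and `assemble_feature_vector` are hypothetical. The feature values are made up for the example.

```python
from typing import Any, Mapping

# Stand-in for the online store: a dict keyed by entity_type:entity_id:feature_set.
# A real deployment would issue the same lookups against Redis or an equivalent.
ONLINE_STORE: dict[str, dict[str, Any]] = {
    "customer:12345:spend_features": {"customer_30d_spend": 412.50},
    "customer:12345:risk_features": {"customer_risk_score": 0.12},
    "product:P789:engagement": {"product_view_count_7d": 87},
}


def compound_key(entity_type: str, entity_id: str, feature_set: str) -> str:
    """Build the compound key used to index the online store."""
    return f"{entity_type}:{entity_id}:{feature_set}"


def assemble_feature_vector(
    lookups: list[tuple[str, str, str]],
    request_context: Mapping[str, Any],
    store: Mapping[str, Mapping[str, Any]] = ONLINE_STORE,
) -> dict[str, Any]:
    """Merge pre-computed features with request-time context."""
    vector: dict[str, Any] = {}
    for entity_type, entity_id, feature_set in lookups:
        key = compound_key(entity_type, entity_id, feature_set)
        # A missing key contributes nothing here; default handling is a
        # separate concern and is left to the caller in this sketch.
        vector.update(store.get(key, {}))
    # Request-time context cannot be pre-computed, so it is added last.
    vector.update(request_context)
    return vector


vector = assemble_feature_vector(
    lookups=[
        ("customer", "12345", "spend_features"),
        ("customer", "12345", "risk_features"),
        ("product", "P789", "engagement"),
    ],
    request_context={"request_channel": "mobile"},
)
```

Only current values are read, which matches the online store's role: no history is consulted on the inference path.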

The Offline Store serves training data generation and batch inference. Unlike the online store, the offline store retains historical feature values: specifically, the feature value that was current at any given point in time. This enables point-in-time correct training data generation: given a set of training examples with timestamps, retrieve for each example the most recent feature values recorded at or before its timestamp. This prevents look-ahead bias (training on feature values that were not yet available at prediction time), a common cause of the gap between offline evaluation and production performance. The offline store is typically implemented as a time-partitioned table in a data warehouse (BigQuery, Redshift, Snowflake) or as Parquet files in object storage, with a time dimension on every feature record.
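A point-in-time lookup can be illustrated with a small standard-library sketch. It assumes a hypothetical in-memory history (`HISTORY`) sorted by timestamp per entity; a production system would run the equivalent join in the warehouse or through the feature store framework. Whether a value stamped exactly at the example timestamp counts as "available" is a design choice; this sketch includes it.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional

# Hypothetical offline history for one feature (customer_30d_spend):
# entity_id -> list of (valid_from, value), sorted by valid_from.
HISTORY: dict[str, list[tuple[datetime, float]]] = {
    "12345": [
        (datetime(2024, 1, 1), 100.0),
        (datetime(2024, 2, 1), 250.0),
        (datetime(2024, 3, 1), 400.0),
    ],
}


def value_as_of(
    history: dict[str, list[tuple[datetime, float]]],
    entity_id: str,
    as_of: datetime,
) -> Optional[float]:
    """Return the latest value recorded at or before `as_of`, else None."""
    rows = history.get(entity_id, [])
    timestamps = [ts for ts, _ in rows]
    # bisect_right gives the count of rows with timestamp <= as_of.
    idx = bisect_right(timestamps, as_of)
    return rows[idx - 1][1] if idx else None


# Training examples carry the timestamp at which the prediction was made.
examples = [("12345", datetime(2024, 2, 15)), ("12345", datetime(2023, 12, 1))]
training_rows = [
    {"entity_id": eid, "event_time": ts, "customer_30d_spend": value_as_of(HISTORY, eid, ts)}
    for eid, ts in examples
]
```

The second example predates all recorded history, so it receives no value rather than a later one. Returning the current value (400.0) for either row would be exactly the look-ahead bias the offline store exists to prevent.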

Feature Pipelines compute and refresh feature values from source data. Batch pipelines run on a schedule (hourly, daily) using Spark or dbt and write to both the offline store (appending the new time-partitioned record) and the online store (overwriting the current value). Streaming pipelines consume event streams (Kafka, Kinesis) and compute features in near-real-time using Flink or Spark Streaming, writing to the online store with low latency. The choice between batch and streaming for a feature depends on its staleness tolerance: fraud detection features require seconds-old values; monthly customer metrics can be daily.
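The dual-write behaviour of a batch pipeline (append to the offline store, overwrite in the online store) can be sketched as below. The function name `materialise`, the list-backed offline log, and the dict-backed online store are illustrative assumptions, not the API of any particular framework.

```python
from datetime import datetime
from typing import Any


def materialise(
    entity_key: str,
    features: dict[str, Any],
    computed_at: datetime,
    offline_log: list[dict[str, Any]],
    online_store: dict[str, dict[str, Any]],
) -> None:
    """Write one batch of computed features to both stores."""
    # Offline store: append a new time-stamped record, keeping history.
    offline_log.append({"entity_key": entity_key, "event_time": computed_at, **features})
    # Online store: overwrite, so only the current value is kept.
    online_store[entity_key] = {**features, "updated_at": computed_at}


offline_log: list[dict[str, Any]] = []
online_store: dict[str, dict[str, Any]] = {}

materialise("customer:12345", {"customer_30d_spend": 400.0},
            datetime(2024, 3, 1), offline_log, online_store)
materialise("customer:12345", {"customer_30d_spend": 450.0},
            datetime(2024, 3, 2), offline_log, online_store)
```

After two runs the offline log holds both records while the online store holds only the latest one. Storing `updated_at` alongside the online value is an assumption of this sketch that makes freshness checks possible at serving time.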

Feature Registry is the metadata layer for the feature store. It records: feature name, description, data type, computation logic (the transformation that produces the feature), data source, update frequency, entity type, business owner, and deprecation status. The feature registry is the discovery mechanism that enables engineers to find existing features before building new ones. It also serves as the configuration source for the feature materialisation pipeline and the feature serving layer.
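As an illustration of the metadata listed above, a registry entry might be modelled as follows. The class and field names (`FeatureDefinition`, `FeatureRegistry`, `search`) are hypothetical and chosen to mirror the attributes named in the paragraph; real registries such as Feast define their own schemas.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class FeatureDefinition:
    name: str
    description: str
    dtype: str
    transformation: str          # computation logic, e.g. a SQL expression
    source: str
    update_frequency: timedelta
    entity_type: str
    owner: str
    deprecated: bool = False


class FeatureRegistry:
    """In-memory stand-in for a registry service."""

    def __init__(self) -> None:
        self._features: dict[str, FeatureDefinition] = {}

    def register(self, definition: FeatureDefinition) -> None:
        if definition.name in self._features:
            raise ValueError(f"feature already registered: {definition.name}")
        self._features[definition.name] = definition

    def search(self, term: str) -> list[FeatureDefinition]:
        """Discovery: find non-deprecated features matching a search term."""
        term = term.lower()
        return [
            f for f in self._features.values()
            if not f.deprecated
            and (term in f.name.lower() or term in f.description.lower())
        ]


registry = FeatureRegistry()
registry.register(FeatureDefinition(
    name="customer_30d_spend",
    description="Total customer spend over the trailing 30 days",
    dtype="float",
    transformation="SUM(amount) over trailing 30 days",
    source="transactions",
    update_frequency=timedelta(days=1),
    entity_type="customer",
    owner="payments-data-team",
))
matches = registry.search("spend")
```

Rejecting duplicate names at registration is one simple way to steer engineers towards reusing an existing feature instead of redefining it.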

Feature Monitoring is the operational quality layer. For each feature, monitoring tracks: distribution statistics (mean, std, percentile distribution) on a rolling basis, freshness (time since last update vs. configured threshold), null rate (unexpected nulls indicate pipeline failures), and drift (statistical distance between the current distribution and the training-time distribution, using measures like PSI or Jensen-Shannon divergence). Alerts on feature drift enable proactive model retraining before production performance degrades significantly.
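One common formulation of PSI is sketched below, assuming equal-width bins derived from the training-time sample and a small floor (`eps`) to avoid taking the logarithm of zero. The bin count, the floor, and the clamping of out-of-range values to the edge bins are all assumptions of this sketch; monitoring tools differ in these details.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a current sample.

    Both samples must be non-empty. Bins are equal-width over the baseline
    range; current values outside that range fall into the edge bins.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [float(v) for v in range(100)]   # training-time distribution
shifted = [v + 50.0 for v in baseline]      # current distribution, shifted up

no_drift = psi(baseline, baseline)
drift = psi(baseline, shifted)
```

Comparing a sample with itself gives zero, while the shifted sample produces a value well above the 0.2 level that this pattern uses as its alert threshold.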

5. Architecture Diagram

flowchart TD
    subgraph Sources["Data Sources"]
        A[Operational Databases]
        B[Event Streams]
    end
    subgraph Pipelines["Feature Pipelines"]
        C[Batch Pipeline]
        D[Streaming Pipeline]
    end
    subgraph Store["Feature Store"]
        E[(Online Store Redis)]
        F[(Offline Store Point-in-Time)]
        G[Feature Registry]
    end
    subgraph Consumers["Consumers"]
        H[Real-Time Inference]
        I[Model Training]
    end
    A --> C
    B --> D
    C --> E
    C --> F
    D --> E
    G --> C
    G --> D
    E --> H
    F --> I
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#dbeafe,stroke:#3b82f6
    style C fill:#f0fdf4,stroke:#22c55e
    style D fill:#f0fdf4,stroke:#22c55e
    style E fill:#fef9c3,stroke:#eab308
    style F fill:#fef9c3,stroke:#eab308
    style G fill:#fef9c3,stroke:#eab308
    style H fill:#d1fae5,stroke:#10b981
    style I fill:#d1fae5,stroke:#10b981

6. Components

Component	Type	Responsibility	Technology Options	Criticality
Online Store	Infrastructure	Sub-10ms feature lookup by entity key	Redis, DynamoDB, Bigtable, Cassandra	Critical
Offline Store	Infrastructure	Point-in-time correct historical feature retrieval	BigQuery, Redshift, Snowflake, Hive, Parquet on S3	Critical
Feature Registry	Service	Metadata catalogue for all features	Feast (open source), Tecton, Hopsworks, custom DB	High
Batch Feature Pipeline	Service	Compute and materialise batch features	Apache Spark, dbt + Airflow, dbt Cloud	Critical
Streaming Feature Pipeline	Service	Compute and materialise near-real-time features	Apache Flink, Spark Structured Streaming, Kafka Streams	High
Feature Materialisation Orchestrator	Service	Schedule and coordinate pipeline execution	Apache Airflow, Prefect, Dagster	High
Feature Server	Service	Assemble multi-feature vectors for inference requests	Feast Feature Server, Tecton Online Serving, custom FastAPI	Critical
Point-in-Time Join Engine	Service	Generate point-in-time correct training datasets	Feast point-in-time join, custom SQL	High
Feature Monitor	Service	Track distribution, drift, freshness, null rate	Evidently AI, WhyLogs, Great Expectations, custom	High
Feature Discovery UI	Service	Search and explore feature registry	Feast UI, Tecton portal, DataHub, custom	Medium

7. Data Flow

Primary Flow — Real-Time Inference with Feature Store

Step	Actor	Action	Output
1	Model API	Receive inference request with entity IDs (customer_id: 12345, product_id: P789)	Entity IDs extracted
2	Feature Server	Look up required features from feature registry for this model version	Required feature list: [customer_30d_spend, customer_risk_score, product_view_count_7d]
3	Feature Server	Batch lookup: MGET customer:12345:spend_features, customer:12345:risk_features, product:P789:engagement	Feature values retrieved from Redis in <5ms
4	Feature Server	Combine pre-computed features with request-time context (e.g., current timestamp, request channel)	Complete feature vector assembled
5	Model Inference	Pass feature vector to model; receive prediction	Prediction
6	Feature Monitor	Log feature values and prediction for drift monitoring	Monitoring record

Error Flow

Error	Detection	Response
Feature missing from online store (entity not materialised)	Redis miss	Return feature default value or null; log missing feature; alert if rate >1%
Stale feature (pipeline hasn't run)	Freshness monitor	Log staleness; serve stale value with staleness metadata; alert pipeline operator
Online store unavailable	Feature server health check	Serve null features or use fallback model without feature enrichment; alert
Feature schema mismatch (pipeline produced wrong type)	Feature monitor type check	Reject feature batch write; alert pipeline owner; serve last-known-good value
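Two rows of the error flow above (a missing feature and an unavailable online store) can be sketched as a fallback wrapper around the lookup. The exception type, function names, and return shape are hypothetical; the 1% miss-rate threshold is taken from the table.

```python
from typing import Any, Callable


class OnlineStoreUnavailable(Exception):
    """Raised by the (hypothetical) store client when the store is down."""


def fetch_with_fallback(
    lookup: Callable[[list[str]], dict[str, Any]],
    keys: list[str],
    defaults: dict[str, Any],
) -> dict[str, Any]:
    """Look up features, substituting defaults on a miss and nulls on outage."""
    try:
        found = lookup(keys)
    except OnlineStoreUnavailable:
        # Store is down: serve null features and flag the response as degraded
        # so the caller can switch to a fallback model.
        return {"features": {k: None for k in keys}, "missing": list(keys), "degraded": True}
    missing = [k for k in keys if k not in found]
    features = {k: found.get(k, defaults.get(k)) for k in keys}
    return {"features": features, "missing": missing, "degraded": False}


def miss_rate_alert(misses: int, requests: int, threshold: float = 0.01) -> bool:
    """True when the share of lookups that missed exceeds the threshold."""
    return requests > 0 and misses / requests > threshold


def healthy_lookup(keys: list[str]) -> dict[str, Any]:
    return {"customer_30d_spend": 412.5}


def broken_lookup(keys: list[str]) -> dict[str, Any]:
    raise OnlineStoreUnavailable("connection refused")


wanted = ["customer_30d_spend", "customer_risk_score"]
partial = fetch_with_fallback(healthy_lookup, wanted, {"customer_risk_score": 0.5})
outage = fetch_with_fallback(broken_lookup, wanted, {"customer_risk_score": 0.5})
```

Returning the list of missing keys alongside the values lets the caller log each miss and feed the miss-rate alert without a second lookup.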

8. Security Considerations

Feature data may contain derived personal information (spending patterns, risk scores, health indicators); access to the online store must be restricted to authorised model serving infrastructure
The offline store contains historical PII-derived features; access requires the same data classification controls as the source data
Entity keys in the online store must not leak information about underlying entities; compound keys should use opaque IDs (UUIDs), not readable identifiers
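One way to obtain opaque entity keys without maintaining a UUID mapping table is a keyed hash of the readable identifier. This is a substitute technique, not the UUID approach named above, and is shown only as a sketch: the function names are hypothetical, and the secret would need to come from a secrets manager, not from source code.

```python
import hashlib
import hmac


def opaque_entity_id(readable_id: str, secret: bytes) -> str:
    """Derive a stable, non-readable identifier using HMAC-SHA256."""
    return hmac.new(secret, readable_id.encode("utf-8"), hashlib.sha256).hexdigest()


def online_key(entity_type: str, readable_id: str, feature_set: str, secret: bytes) -> str:
    """Build the compound online-store key from an opaque entity ID."""
    return f"{entity_type}:{opaque_entity_id(readable_id, secret)}:{feature_set}"


# Placeholder secret for illustration only.
SECRET = b"replace-with-a-managed-secret"
key = online_key("customer", "jane.doe@example.com", "spend_features", SECRET)
```

The derivation is deterministic, so pipelines and the feature server compute the same key independently, while someone who can list the store's keys cannot read the underlying identifier from them.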

OWASP LLM Controls

OWASP LLM Risk	Feature Store Control
LLM03 Training Data Poisoning	Feature registry enforces approved computation logic; point-in-time joins prevent future-data contamination
LLM09 Overreliance	Feature monitoring detects when input data quality degrades, which would degrade model predictions

9. Governance Considerations

Data Governance

Every feature must have a registered owner responsible for pipeline health and data quality
Features derived from personal information must document the legal basis and retention policy in the feature registry
Deprecated features must be retained in the registry with deprecation date and migration guidance; never silently deleted

Model Risk

Point-in-time join methodology must be validated and documented as part of the model development process; incorrect point-in-time logic is a model risk event
Feature drift alerts must be routed to the model owner, not just the platform team; the model owner is accountable for model performance

Governance Artefacts

Artefact	Owner	Cadence	Location
Feature registry	Feature Owner + Data Team	Continuous	Feature registry service
Feature lineage documentation	Data Engineering	Per feature	Feature registry
Feature monitoring thresholds	Feature Owner	Quarterly review	Monitoring configuration
Privacy impact for PII-derived features	Privacy Team	Per feature with PII	Privacy register
Feature drift incident log	Model Owner	Per incident	Incident management

10. Operational Considerations

Monitoring

Signal	Source	Alert Threshold	Owner
Online store cache miss rate	Feature server metrics	>5% miss (entities not materialised)	Feature Owner
Feature pipeline SLA miss	Pipeline orchestrator	Any pipeline overdue by >2× schedule interval	Feature Owner + Data Eng
Feature distribution drift (PSI)	Feature monitor	PSI > 0.2 (significant drift)	Model Owner
Online store P99 latency	Feature server metrics	>20ms P99	Platform On-Call

SLOs

SLO	Target	Window
Online feature retrieval P99 latency	<10ms	Rolling 7 days
Feature freshness (batch features)	<2× schedule interval	Per feature
Feature pipeline success rate	>99.5%	Rolling 30 days
Online store availability	99.9%	Rolling 30 days
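The batch freshness SLO above (<2× schedule interval) reduces to a simple comparison. The sketch below assumes each feature records a last-updated timestamp; the function name and `tolerance` parameter are illustrative.

```python
from datetime import datetime, timedelta


def is_fresh(last_updated: datetime, schedule_interval: timedelta,
             now: datetime, tolerance: float = 2.0) -> bool:
    """True while the feature's age is under tolerance x schedule interval."""
    return (now - last_updated) < schedule_interval * tolerance


now = datetime(2024, 3, 1, 12, 0)
hourly = timedelta(hours=1)

within_slo = is_fresh(now - timedelta(minutes=90), hourly, now)
breached = is_fresh(now - timedelta(hours=3), hourly, now)
```

An hourly feature last updated 90 minutes ago is still within the SLO; one last updated three hours ago is not. Passing `now` explicitly keeps the check deterministic and easy to test.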

Disaster Recovery

Component	RPO	RTO	Strategy
Online store (Redis)	1 hour	5 min	Redis Sentinel + persistence; rebuild from offline store
Offline store	<1 hour	30 min	Data warehouse replication
Feature pipelines	N/A (stateless)	15 min	Redeploy from IaC; re-run pipeline to catch up

11. Cost Considerations

Cost Drivers

Driver	Description	Relative Weight
Online store (Redis) memory	Proportional to entity count × feature vector size	Medium-High
Batch computation (Spark)	Proportional to data volume and feature count	Medium
Offline store (data warehouse)	Storage + query compute for training data generation	Medium
Streaming computation (Flink)	Always-on for streaming features	Medium
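The "entity count × feature vector size" driver for online store memory can be turned into a rough estimate. The per-value size (8 bytes) and per-entity overhead (100 bytes) below are placeholder assumptions for illustration; actual overhead depends on the store, the encoding, and the key length, and should be measured.

```python
def online_store_bytes(
    entity_count: int,
    features_per_entity: int,
    bytes_per_value: int = 8,             # assumption: 64-bit numeric values
    overhead_bytes_per_entity: int = 100,  # assumption: key + structure overhead
) -> int:
    """Rough memory estimate: entities x (feature payload + overhead)."""
    per_entity = features_per_entity * bytes_per_value + overhead_bytes_per_entity
    return entity_count * per_entity


# 100M entities with 50 features each, under the assumptions above.
medium_bytes = online_store_bytes(100_000_000, 50)
medium_gb = medium_bytes / 1e9
```

Under these assumptions, 100M entities with 50 features each come to about 50 GB before replication. The estimate scales linearly in both entity count and feature count, which is why auditing unused features is an effective cost lever.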

Indicative Cost Range

Scale	Monthly Feature Store Infra Cost
Small (1M entities, 10 features)	$500–$2,000
Medium (100M entities, 50 features)	$5,000–$20,000
Large (1B+ entities, 200+ features)	$30,000–$100,000+

12. Trade-Off Analysis

Feature Store Architecture Options

Option	Description	Pros	Cons	Best For
Open Source (Feast)	Self-managed Feast with Redis + data warehouse	Full control; no vendor lock-in; community support	High operational overhead; less out-of-box tooling	Strong engineering team; cloud-agnostic
Managed (Tecton, Hopsworks)	SaaS feature store with managed pipelines	Low ops overhead; strong tooling	Vendor lock-in; cost at scale	Organisations prioritising velocity over cost optimisation
Cloud-Native (Vertex AI Feature Store, AWS SageMaker Feature Store)	Cloud provider native	Deep integration with cloud ML stack	Tied to cloud provider; variable feature richness	Orgs committed to single cloud

Online Store Technology Options

Option	Latency	Cost	Scalability	Best For
Redis	<1ms	Medium	High (cluster)	Most deployments; canonical choice
DynamoDB	1–5ms	Variable (high at scale)	Very High	AWS-native; serverless operations
Bigtable	1–5ms	High	Extremely High	Google Cloud; very large entity counts

Architectural Tensions

Tension	Option A	Option B	Resolution
Feature freshness vs. computation cost	Streaming (fresh)	Batch (cheap)	Feature-level decision based on staleness tolerance; most features are batch
Centralised feature store vs. team-owned features	Platform team owns all features	Teams own their features in shared store	Teams own features in shared store with platform managing infrastructure
Online store size vs. cost	Store all features for all entities	Store only high-usage features	Tiered: hot features in Redis; warm features in DynamoDB; cold in offline only

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
Online store memory exhaustion (Redis OOM)	Medium	High — feature serving fails	Redis memory metrics	LRU eviction; increase Redis memory; audit feature set for unused features
Batch pipeline failure (features stale)	Medium	High — model consuming stale features	Pipeline SLA monitor; freshness alert	Re-run pipeline; serve stale with staleness flag; alert model owner
Training-serving skew (wrong PIT logic)	Low	Critical — model production performance << offline eval	Production vs offline metric gap	Audit PIT join logic; retrain with corrected data; model risk event
Feature leakage (future data in training)	Low	Critical — optimistic backtests; poor production performance	PIT join timestamp validation	Audit all PIT joins; retrain affected models
Feature drift undetected	Medium	High — gradual model degradation	Production metric monitoring	Improve drift monitoring coverage; lower alert thresholds
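The "PIT join timestamp validation" detection listed for feature leakage can be sketched as a check over a generated training set: any row whose feature timestamp is later than its label timestamp indicates future data. The row layout and field names (`feature_time`, `label_time`) are assumptions of this sketch.

```python
from datetime import datetime
from typing import Any


def find_leaked_rows(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return rows whose feature value was recorded after the prediction time."""
    return [r for r in rows if r["feature_time"] > r["label_time"]]


training_set = [
    {"entity_id": "12345", "label_time": datetime(2024, 2, 15),
     "feature_time": datetime(2024, 2, 1), "customer_30d_spend": 250.0},
    {"entity_id": "67890", "label_time": datetime(2024, 2, 15),
     "feature_time": datetime(2024, 3, 1), "customer_30d_spend": 400.0},
]

leaked = find_leaked_rows(training_set)
```

Here the second row is flagged because its feature value was recorded two weeks after the prediction it is paired with. Running such a check as a gate before training makes incorrect join logic visible before it inflates a backtest.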

14. Regulatory Considerations

EU AI Act Article 10 (Data Governance)

Feature computation logic must be documented (in feature registry) as part of the training data governance requirements for high-risk AI systems
Point-in-time join methodology must be documented to demonstrate absence of data leakage in training data

Privacy Act / GDPR

PII-derived features (spending patterns, health indicators) must have a documented legal basis in the feature registry
Data subject deletion requests must propagate to the online store (delete entity's feature values) and be documented in the offline store (mark as deleted rather than hard delete, to preserve training data integrity)
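The deletion propagation described above (hard delete online, mark as deleted offline) might be sketched as follows. The store layouts, the `deleted` flag, and the audit log are hypothetical. Whether marking records as deleted satisfies a given erasure obligation is a legal question that this sketch does not settle.

```python
from datetime import datetime
from typing import Any


def erase_entity(
    entity_type: str,
    entity_id: str,
    online_store: dict[str, Any],
    offline_rows: list[dict[str, Any]],
    audit_log: list[dict[str, Any]],
    requested_at: datetime,
) -> None:
    """Propagate a data subject deletion request to both stores."""
    # Online store: remove every feature set stored for this entity.
    prefix = f"{entity_type}:{entity_id}:"
    online_keys = [k for k in online_store if k.startswith(prefix)]
    for k in online_keys:
        del online_store[k]
    # Offline store: mark rows as deleted instead of removing them.
    marked = 0
    for row in offline_rows:
        if row["entity_id"] == entity_id and not row.get("deleted", False):
            row["deleted"] = True
            marked += 1
    audit_log.append({"entity_id": entity_id, "requested_at": requested_at,
                      "online_keys_removed": len(online_keys), "offline_rows_marked": marked})


online = {"customer:12345:spend_features": {"customer_30d_spend": 412.5},
          "customer:12345:risk_features": {"customer_risk_score": 0.12},
          "customer:67890:spend_features": {"customer_30d_spend": 80.0}}
offline = [{"entity_id": "12345", "customer_30d_spend": 400.0},
           {"entity_id": "67890", "customer_30d_spend": 80.0}]
audit: list[dict[str, Any]] = []

erase_entity("customer", "12345", online, offline, audit, datetime(2024, 3, 1))
```

Training data generation would then need to filter out rows where `deleted` is set; that filter is not shown here.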

NIST AI RMF MAP 3.5

Feature monitoring and drift detection provide the ongoing monitoring of AI system inputs that this pattern maps to MAP 3.5

15. Reference Implementations

AWS

Component	AWS Service
Online store	Amazon ElastiCache Redis or DynamoDB
Offline store	Amazon Redshift or S3 Parquet
Feature registry	Amazon SageMaker Feature Store (metadata)
Batch pipeline	AWS Glue / EMR (Spark)
Streaming pipeline	Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics)
Orchestration	Amazon MWAA (Managed Airflow)

GCP

Component	GCP Service
Online store	Vertex AI Feature Store (Online) or Memorystore
Offline store	Vertex AI Feature Store (Offline) or BigQuery
Batch pipeline	Dataflow or BigQuery ML
Streaming pipeline	Dataflow (Apache Beam)

On-Premises / Open Source

Component	Technology
Feature store framework	Feast (open source)
Online store	Redis Enterprise
Offline store	Apache Hive or Delta Lake on MinIO
Batch pipeline	Apache Spark + Apache Airflow
Streaming pipeline	Apache Flink

16. Related Patterns

Pattern ID	Name	Relationship
EAAPL-PLT001	Enterprise AI Platform	Parent — feature store is a platform ML infrastructure component
EAAPL-PLT008	AI Experiment Tracking	Complementary — training datasets generated via feature store feed experiment tracking
EAAPL-INT004	Real-Time AI Stream Processing	Integration — streaming feature pipelines share infrastructure with real-time inference
EAAPL-INT005	Batch AI Processing	Integration — batch feature pipelines share scheduling infrastructure

17. Maturity Assessment

Overall Maturity: Proven

Feature stores are production-proven at major technology and financial services companies. Open-source tooling (Feast) and managed services (Tecton, SageMaker Feature Store) are both mature. Point-in-time joins are well-understood. Feature monitoring is less standardised.

Scoring Matrix

Dimension	Score (1–5)	Rationale
Pattern Completeness	5	All sections documented
Implementation Evidence	5	Deployed at Netflix, Uber, LinkedIn, major banks at scale
Tooling Maturity	4	Feast/Tecton/SageMaker mature; feature monitoring less so
Regulatory Alignment	4	EU AI Act Article 10 mapping; privacy patterns documented
Operational Complexity	High	Requires data engineering expertise; streaming pipelines operationally demanding

18. Revision History

Version	Date	Author	Changes
1.0	2024-09-01	EAAPL Working Group	Initial publication
1.1	2025-06-12	EAAPL Working Group	Feature monitoring section expanded; Privacy Act data deletion patterns added; Vertex AI Feature Store reference updated
