EAAPL-PLT010 — AI Developer Portal Architecture

Name: EAAPL Pattern Library
Creator: Enterprise AI Architecture Pattern Library
License: https://aipatterns.com.au/terms

⚙️ Platform Engineering

Status: Proven
Tags: rbac audit-logging cost-optimisation llm medium-complexity
Version: 1.1
Last Updated: 2026-06-12
Author: Enterprise AI Architecture Pattern Library

1. Executive Summary

Enterprise AI capabilities—LLM APIs, embedding services, vector stores, AI agent frameworks, fine-tuning pipelines—are proliferating across organisations faster than platform teams can govern them. Without a centralised internal developer portal, AI capability access becomes a shadow IT problem: teams independently onboard to cloud AI APIs, use personal credit cards or shared API keys, apply inconsistent prompt safety guardrails, and generate costs and compliance exposures that are invisible to the platform and security teams.

The AI Developer Portal is an internal platform that provides product engineering teams with self-service access to the organisation's approved AI capabilities under governed, observable, and cost-attributed conditions. It provides: a searchable API catalogue of all approved AI services; a self-service model access request and approval workflow; per-team usage dashboards showing token consumption, costs, and error rates; a sandbox/playground environment for safe exploration without production impact; AI policy guardrail visibility; documentation standards for AI APIs; a developer onboarding flow; and golden path templates for common AI use cases. This pattern follows the platform engineering principle of reducing cognitive load for product teams while embedding non-negotiable governance controls in the platform itself.

2. Problem Statement

Business Problem

Without an AI developer portal, the typical enterprise AI landscape consists of: multiple teams independently subscribed to the same LLM provider; no consolidated view of AI spend or usage patterns; inconsistent security and compliance practices across teams; duplicated AI infrastructure (each team builds their own prompt management, caching, and monitoring); no mechanism to enforce AI governance policies; and no self-service path for new teams to adopt AI, leading to delays as each team navigates vendor onboarding independently.

Technical Problem

Developers need to discover what AI capabilities are available, understand their constraints (rate limits, token limits, pricing, data handling obligations), experiment safely before committing to production, and access APIs through a consistent, observable path. None of these needs are met by direct LLM vendor API access with individual API keys. Direct API access also bypasses organisational controls: prompt injection guardrails, PII redaction, audit logging, cost attribution, and regulatory compliance middleware are all skipped.

Symptoms

Three different teams are paying for separate OpenAI API subscriptions with no consolidated volume discount
A developer used a personal credit card to access a new LLM because the procurement process takes 6 weeks
A production AI feature was deployed with an API key committed to a public GitHub repository
The security team cannot identify which AI APIs are in use across the organisation
A regulatory audit finds that some AI API calls contain PII that should have been redacted

Cost of Inaction

Dimension	Consequence
Financial	Shadow AI spend uncounted; missed volume discounts; wasted duplicate infrastructure
Security	Ungoverned API keys; PII in AI calls; prompt injection vulnerabilities
Compliance	AI usage without governance review; PII outside approved processing boundaries
Productivity	Each team spends 2–6 weeks setting up AI infrastructure that a portal would provide in 1 day

3. Context

When to Apply

Enterprises with 3+ product teams seeking to use AI capabilities
Organisations with AI governance requirements (regulated industries, government, large enterprise)
Platforms needing to enforce consistent AI policies (prompt safety, PII redaction, data residency) across all AI usage
Organisations seeking to consolidate AI vendor relationships for cost efficiency

When NOT to Apply

Single-team organisations or early-stage startups where the overhead of a portal is not justified
Research environments where governance constraints would impede academic freedom (lighter-weight controls suffice)
Organisations with a single, narrowly scoped AI use case

Prerequisites

Prerequisite	Description
Identity and Access Management	SSO/LDAP/AD integration for developer identity; RBAC groups
AI Governance Policy	Documented policy defining which AI services are approved, what data classifications are permitted, and what guardrails are required
AI Budget	Consolidated AI API budget allocated to the platform for redistribution to teams
Platform Engineering Team	Team capable of building and operating the portal infrastructure
API Catalogue Seed	Initial list of approved AI APIs and their constraints

Industry Applicability

Industry	Portal Priority	Key Requirements
Financial Services	High — strict governance required	PII guardrails; data residency; compliance review workflow; cost chargeback
Healthcare	High — PHI handling controls	HIPAA-compliant AI paths; strict data classification enforcement
Government	High — ISM/ASD alignment	Approved cloud services only; classification-based access control
Technology / SaaS	Medium-High — rapid team scaling	Fast self-service; golden path templates; sandbox first
Media / Publishing	Medium	Content policy guardrails; cost attribution by product
Retail / E-commerce	Medium	Personalisation AI; recommendation AI; cost per campaign attribution

4. Architecture Overview

The AI Developer Portal is structured as a platform that wraps enterprise AI capabilities with governance, observability, and developer experience layers. It follows the internal developer portal pattern (Backstage, Port, Cortex) extended with AI-specific capabilities.

Portal Layer 1: AI API Catalogue

The AI API Catalogue is the discovery layer. It lists every AI capability approved for enterprise use, organised by category: foundation models (GPT-4o, Claude Sonnet, Gemini Pro), embedding models, image generation, speech-to-text, code generation, AI agent frameworks, vector store services, and MLOps platform capabilities. Each catalogue entry provides: a human-readable description of what the capability does; the data classification tiers it is approved to process (e.g., Public and Internal, not Confidential); applicable guardrails that are pre-configured; rate limits and token limits; pricing per unit of consumption; example API calls with annotated code snippets; OpenAPI specification; and a "Request Access" button that triggers the self-service access workflow. The catalogue is searchable by capability type, approved data classification, pricing, and guardrail compatibility.
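The catalogue search described above can be sketched as a filter over structured entries. This is a minimal illustration, not the pattern's prescribed schema: the field names, the three classification tiers, the per-1k-token pricing unit, and the sample entries are all assumptions made for the example.

```python
from dataclasses import dataclass

# Ordered from least to most sensitive; tier names are illustrative.
CLASSIFICATION_ORDER = ["Public", "Internal", "Confidential"]


@dataclass(frozen=True)
class CatalogueEntry:
    name: str
    category: str                  # e.g. "foundation-model", "embedding"
    max_classification: str        # highest data tier the entry may process
    usd_per_1k_tokens: float       # hypothetical pricing unit
    guardrails: frozenset = frozenset()


def search(entries, category=None, data_classification=None, max_price=None):
    """Return the entries that satisfy every filter that was supplied."""
    results = []
    for entry in entries:
        if category and entry.category != category:
            continue
        if data_classification is not None:
            needed = CLASSIFICATION_ORDER.index(data_classification)
            allowed = CLASSIFICATION_ORDER.index(entry.max_classification)
            if needed > allowed:
                continue  # entry is not approved for data this sensitive
        if max_price is not None and entry.usd_per_1k_tokens > max_price:
            continue
        results.append(entry)
    return results


# Hypothetical sample data, not real models or prices.
CATALOGUE = [
    CatalogueEntry("chat-large", "foundation-model", "Internal", 0.0100,
                   frozenset({"pii-redaction", "injection-detection"})),
    CatalogueEntry("chat-small", "foundation-model", "Confidential", 0.0005,
                   frozenset({"pii-redaction"})),
    CatalogueEntry("text-embed", "embedding", "Internal", 0.0001),
]
```

Treating the classification tiers as an ordered list lets one comparison answer "is this entry approved for my data?", which is the question a developer browsing the catalogue most often needs answered.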

Portal Layer 2: Self-Service Access Request and Approval Workflow

Teams access AI capabilities through a self-service workflow, not through direct vendor API onboarding. The developer completes a brief access request form: team name, use case description, data classification of inputs, expected monthly volume, and compliance considerations. The workflow routes the request for automated or human review based on the capability's risk tier: low-risk capabilities (public data, well-established models, no PII) are auto-approved and credentials issued immediately; medium-risk capabilities require team lead acknowledgment of usage policy; high-risk capabilities (sensitive data, experimental models, agentic capabilities) require platform security team review, which is conducted within 2 business days. On approval, the team receives: a team-scoped API key (with usage tracked to their team); access to the sandbox environment for that capability; usage documentation; and a link to the relevant golden path template.
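The risk-tier routing can be expressed as a small rule function. The sketch below is one possible reading of the three tiers; the request fields and the exact rule boundaries (for example, treating "Confidential" as the sensitive tier) are assumptions, and a real approval engine would load its rules from the governance policy.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_APPROVE = "auto-approve"
    TEAM_LEAD_ACK = "team-lead-acknowledgment"
    SECURITY_REVIEW = "platform-security-review"


@dataclass
class AccessRequest:
    team: str
    use_case: str
    data_classification: str   # "Public", "Internal", "Confidential" (illustrative)
    contains_pii: bool
    experimental_model: bool
    agentic: bool


def route_request(req: AccessRequest) -> Route:
    # High risk: sensitive data, experimental models, or agentic capabilities.
    if req.data_classification == "Confidential" or req.experimental_model or req.agentic:
        return Route.SECURITY_REVIEW
    # Low risk: public data, established model, no PII.
    if req.data_classification == "Public" and not req.contains_pii:
        return Route.AUTO_APPROVE
    # Everything else is treated as medium risk.
    return Route.TEAM_LEAD_ACK
```

Checking the high-risk conditions first means a request can never be auto-approved merely because it also happens to satisfy a low-risk condition.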

Portal Layer 3: Per-Team Usage Dashboards

Every approved team has a dedicated usage dashboard showing: monthly token consumption by model; cost breakdown by model and use case; error rates by endpoint; request latency percentiles; budget utilisation vs monthly allocation; quota utilisation vs team limit; and a 90-day trend. The dashboard is accessible by team members and their managers. The platform team has a cross-team view. Finance has a read-only cost-attribution view. Usage data is updated in near-real-time (latency: <2 minutes from API call to dashboard).
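Budget utilisation is the ratio of attributed spend to the monthly allocation, and the architecture diagram's alert node fires at 50%, 80%, and 100%. A minimal sketch of that arithmetic follows; the function names and the "fire each alert once" behaviour are assumptions rather than part of the pattern.

```python
ALERT_THRESHOLDS = (0.5, 0.8, 1.0)  # 50% / 80% / 100% of the monthly allocation


def budget_utilisation(spend_usd: float, monthly_budget_usd: float) -> float:
    """Fraction of the monthly allocation consumed so far."""
    if monthly_budget_usd <= 0:
        raise ValueError("monthly budget must be positive")
    return spend_usd / monthly_budget_usd


def newly_crossed(previous_spend: float, current_spend: float,
                  monthly_budget_usd: float) -> list:
    """Thresholds crossed between two spend readings, so each alert fires once."""
    before = budget_utilisation(previous_spend, monthly_budget_usd)
    after = budget_utilisation(current_spend, monthly_budget_usd)
    return [t for t in ALERT_THRESHOLDS if before < t <= after]
```

Comparing two consecutive readings, rather than only the latest one, avoids re-sending the 50% alert on every dashboard refresh once a team has passed it.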

Portal Layer 4: Sandbox/Playground Environment

The Sandbox is a production-isolated environment where developers can explore AI capabilities without consuming production quota, incurring production-attributed costs, or risking production data exposure. The Sandbox provides: a prompt playground UI for interactive LLM experimentation; a pre-configured set of example prompts per capability; a mock tool execution environment for testing agent workflows; real-time token counter showing consumption (deducted from a separate sandbox budget, not team production quota); and complete isolation from production: no production data is accessible in the sandbox, and sandbox API calls are never logged with the same identifiers as production calls. The Sandbox is the mandated first step for any developer new to an AI capability.

Portal Layer 5: AI Policy Guardrails Visibility

Every AI capability in the catalogue has a Guardrails Panel that shows the developer which controls are automatically applied to their API calls through the portal proxy layer: PII redaction status (on/off; which categories are redacted); prompt injection detection (on/off; sensitivity level); output content filtering (which content categories are filtered); data residency enforcement (which regions the data may be processed in); audit logging (all calls are logged; what is retained; retention period). Teams cannot disable guardrails; they can only see what is applied. Where a team needs a guardrail configuration not available in the standard catalogue, they can submit a Guardrail Exception Request that goes through the platform security team. This creates a visible and auditable exception path rather than a shadow bypass.
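To make the PII redaction guardrail concrete, the sketch below redacts two categories with regular expressions and reports what it changed, which is the kind of per-call "guardrail action" the panel and audit log would surface. It is a deliberately simple stand-in: the two patterns and the placeholder format are assumptions, and it does not use or represent the API of any dedicated PII detection library, which a production deployment would rely on instead.

```python
import re

# Deliberately simple patterns for illustration; real detectors cover far more cases.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def redact(prompt: str):
    """Replace matches with a category placeholder and report what was redacted."""
    actions = {}
    for category, pattern in PII_PATTERNS.items():
        prompt, count = pattern.subn(f"[{category}]", prompt)
        if count:
            actions[category] = count
    return prompt, actions
```

Returning the redaction counts alongside the sanitised prompt lets the proxy record which guardrails acted on a call without storing the sensitive values themselves.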

Portal Layer 6: Documentation, Golden Paths, and Onboarding

AI APIs in the portal are documented to a higher standard than raw vendor documentation. Each capability has an AI-specific OpenAPI extension spec that records the model name and version, token limits (input and output), pricing per token, data handling declaration, guardrails applied, and known limitations. Golden Path templates, which are pre-built code patterns for common AI use cases (RAG search, document summarisation, customer support chatbot, code review, structured extraction), are provided in Python, TypeScript, and Java. The developer onboarding flow is: (1) explore catalogue and sandbox; (2) submit access request; (3) receive credentials and golden path template; (4) set up usage monitoring; (5) deploy to staging with portal-proxied API calls. This flow is designed to take less than 1 day end-to-end for low-risk capabilities.
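One way to carry the AI-specific fields is an OpenAPI specification extension (a property whose name begins with `x-`) on the Info Object, checked at publication time. The extension name `x-ai-capability`, its field names, and every value in the example are hypothetical; the pattern does not define a concrete schema.

```python
REQUIRED_AI_FIELDS = {
    "model_name", "model_version", "max_input_tokens", "max_output_tokens",
    "price_per_1k_tokens_usd", "data_handling", "guardrails", "known_limitations",
}

# Hypothetical catalogue spec; all values are placeholders.
EXAMPLE_SPEC = {
    "openapi": "3.1.0",
    "info": {
        "title": "chat-small",
        "version": "1.0.0",
        "x-ai-capability": {
            "model_name": "chat-small",
            "model_version": "2026-01",
            "max_input_tokens": 8000,
            "max_output_tokens": 1000,
            "price_per_1k_tokens_usd": 0.0005,
            "data_handling": "No vendor-side retention (placeholder declaration)",
            "guardrails": ["pii-redaction", "injection-detection"],
            "known_limitations": ["May produce inaccurate summaries"],
        },
    },
    "paths": {},
}


def missing_ai_fields(spec: dict) -> set:
    """Required fields absent from the spec's 'x-ai-capability' extension."""
    extension = spec.get("info", {}).get("x-ai-capability", {})
    return REQUIRED_AI_FIELDS - set(extension)
```

A check like `missing_ai_fields` could gate catalogue publication so that no entry is listed without its limits, pricing, and data handling declaration.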

5. Architecture Diagram

flowchart TD
    subgraph Developers["Developers and Teams"]
        DEV[Developer]
        TEAMLEAD[Team Lead / Manager]
        FINANCE[Finance / BI]
    end
    subgraph Portal["AI Developer Portal"]
        CATALOGUE[AI API Catalogue\nSearch + Browse + Filter]
        REQFLOW[Access Request Workflow\nAuto / Team Lead / Security Review]
        SANDBOX[Sandbox / Playground\nIsolated from Production]
        DOCS[Documentation Hub\nOpenAPI + AI Extensions + Golden Paths]
        GUARDRAILS_VIS[Guardrails Visibility Panel\nPII / Injection / Content / Residency]
    end
    subgraph ProxyLayer["AI Gateway / Proxy Layer"]
        AUTHZ[Auth + RBAC Check\nTeam-Scoped Key Validation]
        GUARDRAILS_APPLY[Guardrail Middleware\nPII Redact + Injection Detect]
        RATELIMIT[Rate Limit + Quota\nPer-Team Enforcement]
        COSTTRACK[Cost Tracker\nToken + Call Attribution]
        AUDITLOG[Audit Logger\nImmutable Per-Call Record]
    end
    subgraph AI_Services["Approved AI Services"]
        LLM1[OpenAI / Azure OpenAI\nGPT-4o / GPT-4o-mini]
        LLM2[Anthropic\nClaude Sonnet / Haiku]
        LLM3[Google Vertex AI\nGemini Pro]
        EMBED[Embedding Services]
        AGENT[Agent Frameworks]
    end
    subgraph Observability["Observability and Governance"]
        DASHBOARD[Per-Team Usage Dashboard\nTokens / Cost / Errors / Latency]
        COSTATTR[Cost Attribution Engine\nBU / Team / Use Case]
        ALERT[Budget + Quota Alerts\n50% / 80% / 100%]
        APPROVAL_SVC[Approval Engine\nAuto / Human Review]
    end
    DEV --> CATALOGUE
    DEV --> SANDBOX
    DEV --> DOCS
    CATALOGUE --> REQFLOW
    REQFLOW --> APPROVAL_SVC
    APPROVAL_SVC -->|Approved| DEV
    DEV -->|API Call via Portal| AUTHZ
    AUTHZ --> GUARDRAILS_APPLY
    GUARDRAILS_APPLY --> RATELIMIT
    RATELIMIT --> COSTTRACK
    COSTTRACK --> AUDITLOG
    AUDITLOG --> LLM1
    AUDITLOG --> LLM2
    AUDITLOG --> LLM3
    AUDITLOG --> EMBED
    AUDITLOG --> AGENT
    COSTTRACK --> DASHBOARD
    COSTTRACK --> COSTATTR
    COSTATTR --> ALERT
    TEAMLEAD --> DASHBOARD
    FINANCE --> COSTATTR
    DEV --> GUARDRAILS_VIS

6. Components

Component	Type	Responsibility	Technology Options	Criticality
AI API Catalogue	UI + Database	List approved AI services; search; access request trigger	Backstage (open source); Port; Cortex; custom React + PostgreSQL	High
Access Request Workflow Engine	Workflow	Route access requests; auto-approve or human-review; issue credentials	Jira Service Management; ServiceNow; custom Temporal workflow	High
Sandbox / Playground	UI + Infrastructure	Isolated interactive AI experimentation environment	Custom React UI + isolated API proxy; PromptLayer; LangSmith	High
AI Gateway / Proxy	Infrastructure	Single entry point for all AI API calls; enforces controls	Kong; Apigee; AWS API Gateway; LiteLLM Proxy; custom	Critical
Guardrail Middleware	Processing	PII redaction, prompt injection detection, content filtering on all API calls	Microsoft Presidio; NeMo Guardrails; AWS Comprehend; custom	Critical
Rate Limit and Quota Engine	Processing	Enforce per-team token and call limits; prevent quota exhaustion	Redis + sliding window algorithm; Kong rate limiting plugin	High
Cost Tracker + Attribution Engine	Analytics	Per-call cost metering; attribute to team/user/use-case	Custom Kafka consumer + DynamoDB; ClickHouse; BigQuery	High
Audit Logger	Security	Immutable per-call audit record: timestamp, team, model, token counts, guardrail actions	S3 + Object Lock; Azure Immutable Blob; ClickHouse	Critical
Per-Team Usage Dashboard	Reporting	Self-service usage and cost visibility for teams	Grafana + API datasource; Retool; custom React + Chart.js	High
Documentation Hub	Content	AI-extended OpenAPI specs; golden path templates; onboarding guides	Backstage TechDocs; GitBook; Confluence	High
Credential Manager	Security	Issue, rotate, and revoke team-scoped API keys	HashiCorp Vault; AWS Secrets Manager; Azure Key Vault	Critical
Budget and Quota Alert System	Operations	Notify team leads and platform on threshold breach	Email + Slack; PagerDuty; SNS + SES	Medium
Approval Engine	Governance	Auto-approve or route for human review based on capability risk tier	Custom rule engine; Jira workflow; ServiceNow	High
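The Rate Limit and Quota Engine row names a sliding window algorithm. The sketch below shows the idea with an in-memory structure standing in for Redis; the class name, the choice to limit tokens rather than calls, and the explicit `now` parameter (used to keep the example deterministic) are assumptions.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-team token limit over a rolling window (in-memory stand-in for Redis)."""

    def __init__(self, max_tokens: int, window_seconds: float):
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)   # team -> deque of (timestamp, tokens)
        self._totals = defaultdict(int)     # team -> tokens currently in the window

    def allow(self, team: str, tokens: int, now: float) -> bool:
        events = self._events[team]
        # Evict usage that has fallen out of the window.
        while events and events[0][0] <= now - self.window_seconds:
            _, expired = events.popleft()
            self._totals[team] -= expired
        if self._totals[team] + tokens > self.max_tokens:
            return False   # the caller would answer with HTTP 429
        events.append((now, tokens))
        self._totals[team] += tokens
        return True
```

Keeping a separate window per team is what prevents one team's burst from consuming another team's allowance. A shared store such as Redis would be needed once the proxy runs as more than one instance.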

7. Data Flow

Primary Flow (Developer Onboarding and First API Call)

Step	Actor	Action	Output
1	Developer	Browse AI API Catalogue; identify capability needed	Selected capability from catalogue
2	Developer	Explore capability in Sandbox/Playground	Confidence in capability suitability
3	Developer	Submit access request: team, use case, data classification	Access request record in workflow engine
4	Approval Engine	Evaluate risk tier; auto-approve or route for review	Approval decision within configured SLA
5	Credential Manager	Issue team-scoped API key with configured limits	API key delivered to developer via secure channel
6	Developer	Configure application to call AI APIs through portal proxy (not direct vendor)	Application configured with portal endpoint + team API key
7	AI Gateway	Receive API call; validate team API key; check RBAC	Authenticated and authorised request
8	Guardrail Middleware	Apply PII redaction; prompt injection scan; content filter	Sanitised request ready for model
9	Rate Limit Engine	Check team's remaining quota; decrement counter	Permitted or rate-limited response
10	AI Vendor API	Execute inference; return response	Raw model response
11	Cost Tracker	Record tokens consumed; attribute to team; update running total	Cost record attributed to team
12	Audit Logger	Write immutable call record: team, model, tokens, guardrail actions, latency	Audit log entry
13	Usage Dashboard	Update near-real-time dashboard with this call's data	Dashboard updated within 2 minutes
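Steps 7 to 12 of the primary flow can be read as a single request handler in the proxy. The sketch below wires those steps together with injected callables so that each control stays replaceable; the function name, the request and response dictionary shapes, and the status codes for denied calls are assumptions.

```python
def handle_call(request, *, authorise, apply_guardrails, check_quota,
                call_vendor, record_cost, write_audit):
    """Run one proxied call through steps 7-12; every collaborator is injected."""
    # Step 7: validate the team API key and its permission for this capability.
    if not authorise(request["api_key"], request["capability"]):
        return {"status": 403}
    # Step 8: sanitise the prompt and keep a record of guardrail actions.
    prompt, guardrail_actions = apply_guardrails(request["prompt"])
    # Step 9: enforce the team's quota.
    if not check_quota(request["team"]):
        return {"status": 429}
    # Step 10: forward the sanitised prompt to the vendor.
    response = call_vendor(request["capability"], prompt)
    # Step 11: attribute the consumed tokens to the team.
    record_cost(request["team"], response["tokens"])
    # Step 12: write the per-call audit record.
    write_audit({
        "team": request["team"],
        "model": request["capability"],
        "tokens": response["tokens"],
        "guardrail_actions": guardrail_actions,
    })
    return {"status": 200, "body": response["text"]}
```

Because the vendor only ever receives the output of `apply_guardrails`, there is no code path in this sketch that forwards an unsanitised prompt.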

Error Flow

Failure	Impact	Detection	Recovery
Guardrail Middleware Outage	Calls not filtered; PII may reach vendor API	Health check on middleware; portal canary test	Block all API calls until middleware restored; alert security team
AI Vendor API Unavailable	Teams cannot reach approved AI capability	Health check; circuit breaker in proxy	Serve cached unavailability notice; route to alternate approved model if configured
Team Quota Exhausted	Rate limit enforced; developer's calls rejected	Rate limit metric; developer receives 429	Developer notified; team lead can request quota increase via self-service
Audit Log Pipeline Failure	Calls not being logged; compliance gap	Log pipeline health check	Block API calls until logging restored (fail-safe); alert compliance team
Credential Compromise	Team API key found in public repository	GitHub secret scanning alert; anomaly in usage patterns	Immediately rotate key; audit calls made with compromised key; notify team
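The recovery column distinguishes control failures, which block traffic, from vendor failures, which may fall back to an alternate approved model. A minimal sketch of that decision follows; the function name, the health inputs, and the returned action labels are assumptions.

```python
from typing import Dict, Optional, Tuple


def decide(guardrails_up: bool, audit_up: bool, vendor_up: Dict[str, bool],
           primary: str, fallback: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Return (action, model). Guardrail or audit outages fail closed."""
    if not guardrails_up or not audit_up:
        return ("block", None)          # never forward unfiltered or unlogged calls
    if vendor_up.get(primary, False):
        return ("forward", primary)
    if fallback and vendor_up.get(fallback, False):
        return ("forward", fallback)    # alternate approved model, if configured
    return ("unavailable", None)        # serve the unavailability notice
```

Evaluating the control checks before any vendor check ensures that a healthy fallback model can never be used to route around a guardrail or audit outage.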

8. Security Considerations

Portal Security Controls

Domain	Control	Implementation	Notes
Authentication	SSO with MFA required to access portal and request access; API keys are team-scoped, not personal	SAML/OIDC SSO; MFA enforced	Prevents shared or personal credential use
Authorisation	RBAC: Developer (read catalogue, submit requests, view own team's dashboard); Team Lead (approve team requests, view team dashboard); Platform Admin (all); Finance (cost reports only)	RBAC in portal application; portal proxy validates team API key against capability permissions
Secrets	Team API keys stored in encrypted credential manager; never shown in UI after initial issuance; rotated every 90 days	HashiCorp Vault; AWS Secrets Manager	Rotation prevents long-lived key exposure
Classification	Catalogue entries tagged with maximum approved data classification; proxy enforces classification — calls containing Confidential data to Public-data-only APIs are blocked	Data classification middleware in proxy
Encryption	All portal traffic TLS 1.3; audit logs encrypted at rest with CMEK; API keys encrypted in credential manager	Cloud-native TLS; CMEK
Auditability	Every access request, approval, credential issuance, rotation, and revocation is logged immutably alongside per-call API audit records	Append-only audit tables; S3 Object Lock for API call logs
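The 90-day rotation control in the Secrets row reduces to a date comparison over key issuance records. The sketch below assumes issuance timestamps are available as a mapping from team name to timezone-aware datetime; that shape and the function name are illustrative, and a credential manager would normally supply this data and perform the rotation itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

ROTATION_PERIOD = timedelta(days=90)


def keys_due_for_rotation(issued_at_by_team: Dict[str, datetime],
                          now: datetime) -> List[str]:
    """Teams whose current key is at least 90 days old, in alphabetical order."""
    return sorted(
        team for team, issued_at in issued_at_by_team.items()
        if now - issued_at >= ROTATION_PERIOD
    )
```

For example, calling it with `datetime.now(timezone.utc)` from a scheduled job would yield the list of teams to notify or rotate that day.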

OWASP LLM Top 10 — Portal Controls

OWASP LLM Risk	Portal Control	Implementation
LLM01 Prompt Injection	Prompt injection detection applied to all API calls through the portal proxy	NeMo Guardrails; custom pattern matcher; block or flag high-confidence injections
LLM02 Insecure Output Handling	Output validation middleware strips executable content before returning to caller	Output schema validation; content type enforcement
LLM03 Training Data Poisoning	Not applicable to inference portal; addressed in training pipeline pattern	Portal is inference-only; training pipelines are separate
LLM04 Model Denial of Service	Per-team rate limiting and quota prevents any single team from exhausting shared capacity	Sliding window rate limiter; per-team token budget
LLM05 Supply Chain Vulnerabilities	Only approved AI vendors in catalogue; all vendor integrations security-reviewed	Vendor approval process; DPA in place for all catalogue entries
LLM06 Sensitive Information Disclosure	PII redaction applied to all API calls before forwarding to vendor	Microsoft Presidio; AWS Comprehend; configurable per data classification
LLM07 Insecure Plugin Design	Agent framework capabilities reviewed before catalogue listing; tool permissions documented	Tool permission documentation required in catalogue entry; excessive-permission tools rejected
LLM08 Excessive Agency	Agentic capabilities in catalogue have mandatory cost ceiling documentation; human oversight guardrails documented	Catalogue entry requires cost ceiling + human oversight method for agent capabilities
LLM09 Overreliance	Catalogue entries include known limitations and recommended human review guidance	Mandatory "Limitations and Caveats" section in every catalogue entry
LLM10 Model Theft	Portal proxy does not expose model weights or architecture; only inference results	Proxy design: forward only inference requests; no model download capability

9. Governance Considerations

Portal Governance

Domain	Requirement	Owner	Cadence
Catalogue Currency	All catalogue entries reviewed for accuracy; deprecated capabilities removed	Platform Engineering	Quarterly
Access Request SLAs	Low-risk: auto-approve; medium: 24h; high: 48h	Platform Security	Per-request
Guardrail Policy	Guardrail configurations reviewed and updated as new attack vectors emerge	Platform Security + AI Governance	Quarterly + on incident
Budget Allocation	Team AI budgets reviewed and adjusted	Finance + BU heads	Quarterly
Audit Log Review	Audit logs reviewed for anomalous patterns; exported for compliance	Platform Security	Monthly
Golden Path Currency	Templates tested against current API versions; updated when breaking changes occur	Platform Engineering	On model version change

Governance Artefacts

Artefact	Description	Retention
AI API Catalogue Version History	Record of when capabilities were added, modified, or retired	Permanent
Access Request and Approval Records	All access requests with justification and approval decision	7 years
Team API Key Issuance and Rotation Log	When keys were issued, to whom, rotated, or revoked	7 years
Per-Call Audit Logs	Immutable record of every AI API call through the portal	7 years
Monthly Cost Attribution Reports	Per-team, per-capability cost summaries for chargeback	7 years
Guardrail Exception Requests	Approved exceptions to standard guardrail configuration	5 years

10. Operational Considerations

Monitoring and SLOs

SLO	Target	Measurement	Breach Action
Portal Gateway Availability	99.9% per month	Synthetic probes every 60 seconds	P1 incident; investigate; notify all teams
Access Request Processing	Low-risk: <15 minutes; medium: <24h; high: <48h	Request-to-approval duration	SLA breach alert to platform team; manual escalation
Guardrail Middleware Latency	<100ms added to 99th percentile API call	P99 latency of guardrail processing	Investigate; scale horizontally; async mode for non-blocking guardrails
Dashboard Data Freshness	<2 minutes from API call to dashboard	Data pipeline lag metric	Alert data engineering; manual cache refresh
Audit Log Integrity	100% of API calls have corresponding audit log entry	Reconciliation: API call count vs log entry count	Immediately investigate; may be compliance-reportable gap
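The Audit Log Integrity SLO is measured by reconciling API calls against audit entries. Comparing identifiers rather than bare counts also shows which calls are missing; the sketch below assumes each call carries a unique request identifier that appears in both the proxy access log and the audit log, which the pattern does not state explicitly.

```python
def reconcile(call_ids, audit_ids) -> dict:
    """Compare proxy call IDs with audit entry IDs and report any gaps."""
    calls, audits = set(call_ids), set(audit_ids)
    return {
        "missing_audit": sorted(calls - audits),   # calls with no audit record
        "orphan_audit": sorted(audits - calls),    # audit records with no known call
        "complete": calls <= audits,               # True when the 100% target is met
    }
```

A count-only comparison can hide a gap when one call is missing and another is logged twice, which is why this sketch works on identifiers.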

Capacity Planning

The guardrail layer in the portal proxy adds 50–150ms to API call latency. This is acceptable for most AI use cases (human-facing chatbots: <300ms budget; batch processing: latency-insensitive). For latency-critical applications (<100ms budget), an accelerated path with lightweight guardrails may be needed. The portal proxy should be horizontally scalable behind a load balancer, with auto-scaling triggered at 60% CPU utilisation.

Disaster Recovery

Scenario	Impact	Recovery
Portal Proxy Outage	All AI API calls fail through portal	Direct vendor access (break-glass credentials) available for P0 production incidents; restore portal within 4 hours
Catalogue Database Failure	Cannot browse or request new capabilities	Restore from backup within 1 hour; existing team credentials unaffected
Audit Log Store Unavailable	Compliance gap during outage	Block new API calls (fail-safe); restore audit store; reconcile calls during gap from proxy access logs

11. Cost Considerations

Cost Drivers

Cost Driver	Indicative Cost	Notes
Portal Proxy Infrastructure	USD 2,000–20,000/month	Scales with request volume; Kong / Apigee licensing or cloud-native API GW
Guardrail Middleware	USD 1,000–10,000/month	PII detection cost scales with text volume
Usage Dashboard Infrastructure	USD 500–5,000/month	Grafana + data store; or Retool licensing
Catalogue and Portal Application	USD 1,000–5,000/month	Hosting; Backstage or Port licensing
Platform Engineering FTE	USD 300,000–600,000/year	1–2 FTE to build and maintain
Audit Log Storage	USD 200–2,000/month	Scales with call volume; Object Lock WORM storage

AI Spend Governance Value (Cost Savings Through Portal)

Benefit	Estimated Value
Volume discount through consolidated API keys	10–30% reduction in per-token pricing
Elimination of shadow AI spend	20–40% reduction in untracked spend
Model tier routing (routing simple tasks to cheaper models)	30–60% reduction in per-task model cost
Duplicate infrastructure elimination	USD 50,000–200,000/year in avoided team-level AI infrastructure spend
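The model tier routing saving depends on two inputs: the share of tasks that can go to the cheaper model and the price gap between tiers. The sketch below works through that arithmetic under the simplifying assumption that every task consumes the same number of tokens regardless of model; the function name and the prices in the usage note are hypothetical.

```python
def routed_cost_reduction(share_to_cheap: float, cheap_price: float,
                          premium_price: float) -> float:
    """Fractional saving versus sending every task to the premium model."""
    if not 0.0 <= share_to_cheap <= 1.0:
        raise ValueError("share_to_cheap must be between 0 and 1")
    if premium_price <= 0:
        raise ValueError("premium_price must be positive")
    blended = share_to_cheap * cheap_price + (1.0 - share_to_cheap) * premium_price
    return 1.0 - blended / premium_price
```

With half of all tasks routed to a model priced at one tenth of the premium tier, `routed_cost_reduction(0.5, 1.0, 10.0)` gives a saving of roughly 45%, which falls inside the 30–60% range quoted in the table.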

Indicative Implementation Cost Range

Organisation	Annual Portal Cost	Notes
Small (3–10 teams, <1M API calls/month)	USD 200,000–500,000	Lightweight stack; Backstage + Kong
Mid-size (10–50 teams, 1M–50M calls/month)	USD 500,000–1,500,000	Full stack; dedicated platform team
Large enterprise (50+ teams, >50M calls/month)	USD 1,500,000–5,000,000	Enterprise licensing; large portal team

12. Trade-Off Analysis

Architecture Options

Option	Description	Pros	Cons	Recommended For
Option A: Build on Backstage + LiteLLM Proxy	Open-source Backstage for portal UI + LiteLLM Proxy for AI gateway	Lowest licensing cost; full customisation; strong community	Requires platform engineering investment; ongoing maintenance	Organisations with strong platform engineering capability
Option B: Commercial Internal Developer Portal (Port, Cortex) + Cloud API GW	Commercial IDP platform + cloud-native API gateway	Faster time to value; managed maintenance; enterprise support	Higher licensing cost; less flexibility	Organisations with limited platform engineering capacity
Option C: Cloud Provider AI Platform (Azure AI Studio, AWS Bedrock console)	Use cloud provider's native AI portal capabilities	Seamlessly integrated with provider ecosystem; no build cost	Vendor lock-in; limited customisation; single-cloud only	Organisations already deeply committed to a single cloud provider

Architectural Tensions

Tension	Trade-Off	Resolution
Governance vs Developer Velocity	Strong controls (approval workflows, guardrails) slow AI adoption	Auto-approve low-risk capabilities; guardrails require no integration work (applied in the proxy, with no changes to developer code)
Centralised Platform vs Team Autonomy	Centralised portal removes team control over AI infrastructure	Teams retain control over prompts, use cases, and integration; portal controls only what crosses governance boundary
Completeness vs Time to Value	Comprehensive portal takes 6–12 months to build; teams need AI now	Phase 1 (8 weeks): catalogue + access request + proxy + audit log; Phase 2: sandbox + dashboard; Phase 3: golden paths + advanced features
Proxy Latency vs Control	Every additional middleware layer adds latency	Profile guardrail latency; async processing for non-blocking guardrails; hardware acceleration for high-volume paths

13. Failure Modes

Failure	Likelihood	Impact	Detection	Recovery
Portal Becomes Shadow IT Bypass Target	High	High — teams route around portal to direct vendor APIs	API call origin monitoring; vendor invoice vs portal call count mismatch	Enforce portal use via network egress rules; no direct vendor access from production networks
Guardrail False Positives Block Legitimate Calls	Medium	Medium — developer productivity impact; trust in portal erodes	Developer feedback; error rate spike on guardrail decision	Tune guardrail sensitivity; add team-specific exception with audit
Catalogue Staleness	High	Medium — developers use outdated API specs; misconfigurations	Version mismatch alerts; developer-reported errors	Implement automated API spec refresh from vendor APIs; quarterly manual review
Audit Log Gap	Low	Critical — compliance exposure; cannot demonstrate what AI calls were made	Log pipeline monitoring; reconciliation check	Fail-safe: block API calls if audit log unavailable
Access Request Bottleneck	Medium	Medium — 48h SLA for high-risk capabilities delays AI adoption	Request backlog metric; SLA breach rate	Add reviewers; pre-approve common high-risk patterns; escalate to platform leadership
Portal Single Point of Failure	Low	Critical — all AI calls fail if proxy is down	Availability monitoring; synthetic probes	Multi-AZ deployment; auto-scaling; break-glass direct access for P0 production

Cascading Failure Scenario

The AI Developer Portal is deployed with a proxy that adds PII redaction and audit logging but does not have multi-AZ redundancy. A database maintenance window makes the audit log store unavailable for 45 minutes. The proxy is configured to fail open (allow calls even when audit logging is unavailable) to prevent developer disruption. During the 45-minute window, 50,000 API calls are made without audit logging. A subsequent compliance audit finds the logging gap. Because the portal's audit log was the only record tying API calls to team API keys, investigators cannot fully reconstruct what calls were made during the window. A data subject's Subject Access Request cannot be fully satisfied because some AI calls made during this period are unknown. The GDPR Article 30 record of processing is incomplete for this period. Remediation: configure fail-safe behaviour (block calls if audit logging is unavailable); add a multi-AZ audit log with a write-ahead buffer.
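The remediation combines two behaviours: buffer audit entries ahead of the call while the store is briefly unavailable, and block calls once the buffer is full. The sketch below illustrates that combination with an in-memory list; the class name, the `OSError` failure signal, and the buffer limit are assumptions, and a real write-ahead buffer would need durable local storage to survive a proxy restart.

```python
class BufferedAuditLogger:
    """Write-ahead audit buffer that fails closed when it runs out of room."""

    def __init__(self, store_write, buffer_limit: int):
        self._store_write = store_write    # callable; raises OSError when the store is down
        self._buffer_limit = buffer_limit
        self._buffer = []

    def record(self, entry: dict) -> bool:
        """Return True if the API call may proceed, False if it must be blocked."""
        if len(self._buffer) >= self._buffer_limit:
            self.flush()
            if len(self._buffer) >= self._buffer_limit:
                return False               # fail closed: this call cannot be recorded
        self._buffer.append(entry)         # write ahead, before the call proceeds
        self.flush()
        return True

    def flush(self) -> None:
        """Drain buffered entries to the store in order, stopping at the first failure."""
        while self._buffer:
            try:
                self._store_write(self._buffer[0])
            except OSError:
                return                     # store still down; keep entries buffered
            self._buffer.pop(0)
```

In the scenario above, a proxy built this way would have kept serving calls only for as long as it could still record them, and would have blocked the rest instead of creating an unlogged gap.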

14. Regulatory Considerations

Regulation	Portal Relevance	Portal Control	Reference
GDPR Article 30 — Records of Processing	Every AI API call via portal creates a processing record	Audit logger generates records for Article 30 compliance	GDPR Article 30
GDPR Article 25 — Privacy by Design	Guardrails (PII redaction) embedded in portal proxy	Privacy-by-default: PII redaction on by default for all data classifications	GDPR Article 25
Privacy Act APP 11 — Security	Portal enforces security controls across all AI API usage	Auth, RBAC, audit logging, guardrails	APP 11
APRA CPS234 ¶17 — Controls	Portal implements preventive and detective controls for all AI API use	Auth + guardrails + audit = preventive + detective	CPS234 Paragraph 17
EU AI Act — Transparency and Documentation	Catalogue entries include transparency information and limitations	Mandatory "Limitations and Caveats" section; data handling declaration	EU AI Act Article 13
ISO 42001 Clause 7 — Support	Portal provides the awareness and documentation support required by Clause 7	Documentation Hub + Golden Paths serve ISO 42001 awareness obligation	ISO/IEC 42001 Clause 7.2–7.3
SOX / Financial Controls	AI spend attribution for financial services companies; audit trail	Cost Attribution Engine + immutable audit logs	SOX Section 302
NIST AI RMF GOVERN 1.3 — Policies	Portal enforces AI usage policies across all teams	Guardrails applied uniformly; no bypasses; policy visible in catalogue	NIST AI RMF GOVERN 1.3

15. Reference Implementations

AWS

Component	AWS Service / Tool
AI API Catalogue	AWS Service Catalog + Backstage (EC2/ECS hosted)
AI Gateway / Proxy	Amazon API Gateway + AWS Lambda (guardrail middleware)
Guardrail Middleware	Amazon Comprehend (PII) + custom Lambda
Rate Limiting	API Gateway usage plans + Lambda token counter in DynamoDB
Cost Tracker	Kinesis Data Streams + Lambda consumer + DynamoDB cost store
Audit Logger	CloudTrail + S3 Object Lock
Usage Dashboard	Amazon QuickSight; or Grafana on EC2
Credential Manager	AWS Secrets Manager
Sandbox Playground	Custom React app on AWS Amplify + isolated API Gateway
Documentation	Backstage TechDocs on S3 + CloudFront

Azure

Component	Azure Service / Tool
AI API Catalogue	Azure Developer Portal (APIM) + custom catalogue extension
AI Gateway / Proxy	Azure API Management (built-in gateway)
Guardrail Middleware	Azure AI Language (PII) + APIM policy
Rate Limiting	APIM built-in rate limiting policies
Cost Tracker	Azure Event Hubs + Azure Function + Cosmos DB
Audit Logger	Azure Monitor + Immutable Blob Storage
Usage Dashboard	Power BI + Azure Monitor
Credential Manager	Azure Key Vault + Managed Identities
Sandbox Playground	Custom app on Azure Static Web Apps + separate APIM instance
Documentation	Azure DevOps Wiki + APIM developer portal

GCP

Component	GCP Service / Tool
AI API Catalogue	Apigee Developer Portal; or Backstage on GKE
AI Gateway / Proxy	Apigee API Management
Guardrail Middleware	Cloud DLP + Cloud Endpoints
Rate Limiting	Apigee quota policies
Cost Tracker	Cloud Pub/Sub + Cloud Functions + BigQuery
Audit Logger	Cloud Audit Logs + Cloud Storage Bucket Lock
Usage Dashboard	Looker + BigQuery
Credential Manager	Secret Manager
Sandbox Playground	Custom app on Cloud Run + separate Apigee environment
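
Each Cost Tracker row above pairs an event stream with a consumer and a store. The consumer's core job, turning usage events into per-team cost, might look like the sketch below. The model names, prices, and event fields are hypothetical placeholders, not real vendor pricing or the portal's event schema.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical price list: USD per 1,000 tokens as (input, output).
PRICES = {
    "example-large": (Decimal("0.0100"), Decimal("0.0300")),
    "example-small": (Decimal("0.0005"), Decimal("0.0015")),
}


def event_cost(event: dict) -> Decimal:
    """Cost of a single usage event, using Decimal to avoid float rounding drift."""
    in_price, out_price = PRICES[event["model"]]
    total = in_price * event["input_tokens"] + out_price * event["output_tokens"]
    return total / 1000


def attribute_costs(events) -> dict:
    """Sum event costs per team_id."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for event in events:
        totals[event["team_id"]] += event_cost(event)
    return dict(totals)
```

In a streaming deployment the same function would run over each batch delivered to the consumer, with the running totals written to the cost store rather than returned.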

On-Premises / Self-Hosted

Component	Technology
AI API Catalogue	Backstage (open source, self-hosted)
AI Gateway / Proxy	Kong Gateway (open source) + custom plugins
Guardrail Middleware	Microsoft Presidio (open source); NeMo Guardrails
Rate Limiting	Kong rate limiting plugin + Redis
Cost Tracker	Apache Kafka + Flink + PostgreSQL
Audit Logger	Splunk Enterprise + WORM storage
Usage Dashboard	Grafana + InfluxDB or PostgreSQL
Credential Manager	HashiCorp Vault
Sandbox Playground	Custom React app + isolated Kong environment
Documentation	Backstage TechDocs + GitBook
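
The guardrail middleware row names dedicated PII tooling. To show where redaction sits in the request path, here is a deliberately simplified regex-based stand-in: it is not Presidio's API, and the two patterns are illustrative assumptions that will miss many real-world PII formats.

```python
import re

# Illustrative patterns only; real detectors cover far more entity types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def redact(text: str):
    """Replace matches with <LABEL> placeholders; return the text and match counts."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"<{label}>", text)
        counts[label] = n
    return text, counts
```

A gateway plugin would call `redact` on the prompt before forwarding it to the model, and could pass the counts to the audit logger so that redaction events are recorded without storing the original values.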

16. Related Patterns

Pattern ID	Pattern Name	Relationship	Notes
EAAPL-AGT010	AI Agent Cost Governance	COMPLEMENTARY	Portal provides cost visibility and budget alerts; Agent Cost Governance provides per-execution controls for agentic workloads
EAAPL-CMP004	Privacy-Preserving AI	COMPLEMENTARY	Portal's PII guardrail middleware operationalises privacy-preserving controls across all teams automatically
EAAPL-CMP007	Data Residency for AI	COMPLEMENTARY	Portal proxy enforces data residency routing; teams see which residency rules apply to their approved capabilities
EAAPL-PLT007	AI Observability Platform	PREREQUISITE	Portal's usage dashboards consume metrics from the AI observability platform
EAAPL-SEC001	Zero-Trust Architecture	PREREQUISITE	Portal API keys are verified through zero-trust identity infrastructure
EAAPL-CMP002	APRA CPS234 AI Security	COMPLEMENTARY	Portal's guardrails and audit logging satisfy CPS234 ¶17 detective and preventive control requirements for AI

17. Maturity Assessment

Overall Maturity Label: Proven

Dimension	Level 1	Level 2	Level 3	Level 4	Level 5	Current Level
API Catalogue	No catalogue	Informal list in wiki	Searchable catalogue with AI-extended specs	Catalogue integrated with CMDB; auto-updated	Catalogue self-populating from AI vendor APIs	Level 3
Access Governance	No process	Email request	Workflow with auto-approve / human review	SLA-tracked; exception management	AI-assisted request routing and risk assessment	Level 3
Guardrails	No guardrails	Manual code reviews	Portal proxy applies guardrails to all calls	Guardrails configurable per team classification	Adaptive guardrails based on real-time threat intelligence	Level 3
Usage Observability	No visibility	Monthly billing reports	Near-real-time per-team dashboards	Anomaly detection; proactive budget alerts	Predictive capacity and spend forecasting	Level 3
Developer Experience	Direct vendor onboarding (weeks)	Basic portal (days)	Golden path templates; sandbox; <1 day onboarding	AI assistant for prompt development in portal	Conversational portal with AI-powered capability recommendation	Level 3

18. Revision History

Version	Date	Author	Changes
1.0	2025-08-15	EAAPL Working Group	Initial draft
1.1	2026-06-12	EAAPL Working Group	Added cascading failure scenario; expanded reference implementations; added regulatory considerations for ISO 42001 and NIST AI RMF alignment
