EAAPL-PLT010 — AI Developer Portal Architecture
Status: Proven
Tags: rbac audit-logging cost-optimisation llm medium-complexity
Version: 1.1
Last Updated: 2026-06-12
Author: Enterprise AI Architecture Pattern Library
1. Executive Summary
Enterprise AI capabilities—LLM APIs, embedding services, vector stores, AI agent frameworks, fine-tuning pipelines—are proliferating across organisations faster than platform teams can govern them. Without a centralised internal developer portal, AI capability access becomes a shadow IT problem: teams independently onboard to cloud AI APIs, use personal credit cards or shared API keys, apply inconsistent prompt safety guardrails, and generate costs and compliance exposures that are invisible to the platform and security teams.
The AI Developer Portal is an internal platform that provides product engineering teams with self-service access to the organisation's approved AI capabilities under governed, observable, and cost-attributed conditions. It provides: a searchable API catalogue of all approved AI services; a self-service model access request and approval workflow; per-team usage dashboards showing token consumption, costs, and error rates; a sandbox/playground environment for safe exploration without production impact; AI policy guardrail visibility; documentation standards for AI APIs; a developer onboarding flow; and golden path templates for common AI use cases. This pattern follows the platform engineering principle of reducing cognitive load for product teams while embedding non-negotiable governance controls in the platform itself.
2. Problem Statement
Business Problem
Without an AI developer portal, the typical enterprise AI landscape consists of: multiple teams independently subscribed to the same LLM provider; no consolidated view of AI spend or usage patterns; inconsistent security and compliance practices across teams; duplicated AI infrastructure (each team builds their own prompt management, caching, and monitoring); no mechanism to enforce AI governance policies; and no self-service path for new teams to adopt AI, leading to delays as each team navigates vendor onboarding independently.
Technical Problem
Developers need to discover what AI capabilities are available, understand their constraints (rate limits, token limits, pricing, data handling obligations), experiment safely before committing to production, and access APIs through a consistent, observable path. None of these needs are met by direct LLM vendor API access with individual API keys. Direct API access also bypasses organisational controls: prompt injection guardrails, PII redaction, audit logging, cost attribution, and regulatory compliance middleware are all skipped.
Symptoms
- Three different teams are paying for separate OpenAI API subscriptions with no consolidated volume discount
- A developer used a personal credit card to access a new LLM because the procurement process takes 6 weeks
- A production AI feature was deployed with an API key committed to a public GitHub repository
- The security team cannot identify which AI APIs are in use across the organisation
- A regulatory audit finds that some AI API calls contain PII that should have been redacted
Cost of Inaction
| Dimension | Consequence |
|---|---|
| Financial | Shadow AI spend uncounted; missed volume discounts; wasted duplicate infrastructure |
| Security | Ungoverned API keys; PII in AI calls; prompt injection vulnerabilities |
| Compliance | AI usage without governance review; PII outside approved processing boundaries |
| Productivity | Each team spends 2–6 weeks setting up AI infrastructure that a portal would provide in 1 day |
3. Context
When to Apply
- Enterprises with 3+ product teams seeking to use AI capabilities
- Organisations with AI governance requirements (regulated industries, government, large enterprise)
- Platforms needing to enforce consistent AI policies (prompt safety, PII redaction, data residency) across all AI usage
- Organisations seeking to consolidate AI vendor relationships for cost efficiency
When NOT to Apply
- Single-team organisations or early-stage startups where the overhead of a portal is not justified
- Research environments where governance constraints would impede academic freedom (lighter-weight controls suffice)
- Organisations with a single, narrowly scoped AI use case
Prerequisites
| Prerequisite | Description |
|---|---|
| Identity and Access Management | SSO/LDAP/AD integration for developer identity; RBAC groups |
| AI Governance Policy | Documented policy defining which AI services are approved, what data classifications are permitted, and what guardrails are required |
| AI Budget | Consolidated AI API budget allocated to the platform for redistribution to teams |
| Platform Engineering Team | Team capable of building and operating the portal infrastructure |
| API Catalogue Seed | Initial list of approved AI APIs and their constraints |
Industry Applicability
| Industry | Portal Priority | Key Requirements |
|---|---|---|
| Financial Services | High: strict governance required | PII guardrails; data residency; compliance review workflow; cost chargeback |
| Healthcare | High: PHI handling controls | HIPAA-compliant AI paths; strict data classification enforcement |
| Government | High: ISM/ASD alignment | Approved cloud services only; classification-based access control |
| Technology / SaaS | Medium-High: rapid team scaling | Fast self-service; golden path templates; sandbox first |
| Media / Publishing | Medium | Content policy guardrails; cost attribution by product |
| Retail / E-commerce | Medium | Personalisation AI; recommendation AI; cost per campaign attribution |
4. Architecture Overview
The AI Developer Portal is structured as a platform that wraps enterprise AI capabilities with governance, observability, and developer experience layers. It follows the internal developer portal pattern (Backstage, Port, Cortex) extended with AI-specific capabilities.
Portal Layer 1 — AI API Catalogue
The AI API Catalogue is the discovery layer. It lists every AI capability approved for enterprise use, organised by category: foundation models (GPT-4o, Claude Sonnet, Gemini Pro), embedding models, image generation, speech-to-text, code generation, AI agent frameworks, vector store services, and MLOps platform capabilities. Each catalogue entry provides: a human-readable description of what the capability does; the data classification tiers it is approved to process (e.g., Public and Internal, not Confidential); applicable guardrails that are pre-configured; rate limits and token limits; pricing per unit of consumption; example API calls with annotated code snippets; OpenAPI specification; and a "Request Access" button that triggers the self-service access workflow. The catalogue is searchable by capability type, approved data classification, pricing, and guardrail compatibility.
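As a minimal sketch of how a catalogue entry and its search filters could be modelled, the following assumes an in-memory list of entries. The class name `CatalogueEntry`, the field names, the capability names, and the prices are all hypothetical illustrations, not part of this pattern's specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    """One approved AI capability (field names are illustrative)."""
    name: str
    category: str                        # e.g. "foundation-model", "embedding"
    approved_classifications: frozenset  # data tiers the capability may process
    guardrails: frozenset                # guardrails pre-configured in the proxy
    price_per_1k_tokens_usd: float       # hypothetical pricing unit


def search(entries, category=None, classification=None, max_price=None):
    """Return the entries that satisfy every filter that was supplied."""
    results = []
    for entry in entries:
        if category is not None and entry.category != category:
            continue
        if classification is not None and classification not in entry.approved_classifications:
            continue
        if max_price is not None and entry.price_per_1k_tokens_usd > max_price:
            continue
        results.append(entry)
    return results


# Hypothetical seed data; real entries would come from the catalogue database.
CATALOGUE = [
    CatalogueEntry("chat-large", "foundation-model",
                   frozenset({"Public", "Internal"}),
                   frozenset({"pii-redaction", "injection-detection"}), 0.01),
    CatalogueEntry("chat-small", "foundation-model",
                   frozenset({"Public", "Internal"}),
                   frozenset({"pii-redaction"}), 0.001),
    CatalogueEntry("embed-v1", "embedding",
                   frozenset({"Public", "Internal", "Confidential"}),
                   frozenset({"pii-redaction"}), 0.0001),
]
```

Filters combine with AND semantics, so a developer can narrow by capability type, data classification, and price in a single query.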
Portal Layer 2 — Self-Service Access Request and Approval Workflow
Teams access AI capabilities through a self-service workflow, not through direct vendor API onboarding. The developer completes a brief access request form: team name, use case description, data classification of inputs, expected monthly volume, and compliance considerations. The workflow routes the request for automated or human review based on the capability's risk tier: low-risk capabilities (public data, well-established models, no PII) are auto-approved and credentials issued immediately; medium-risk capabilities require team lead acknowledgment of usage policy; high-risk capabilities (sensitive data, experimental models, agentic capabilities) require platform security team review, which is conducted within 2 business days. On approval, the team receives: a team-scoped API key (with usage tracked to their team); access to the sandbox environment for that capability; usage documentation; and a link to the relevant golden path template.
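The routing rule described above can be sketched as a small classification function. The exact mapping from request attributes to risk tier is an assumption here (for example, treating PII and Confidential data as "sensitive data"); an organisation's AI governance policy would define the real rules. All names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class AccessRequest:
    team: str
    use_case: str
    data_classification: str  # "Public", "Internal", or "Confidential" (illustrative tiers)
    contains_pii: bool
    experimental_model: bool
    agentic: bool


# Review route per tier, following the three routes described in the text.
ROUTES = {
    RiskTier.LOW: "auto-approve",
    RiskTier.MEDIUM: "team-lead-acknowledgement",
    RiskTier.HIGH: "platform-security-review",
}


def classify_risk(request):
    """Assumed rule set: sensitive data, experimental models, or agents are high risk."""
    if (request.agentic or request.experimental_model or request.contains_pii
            or request.data_classification == "Confidential"):
        return RiskTier.HIGH
    if request.data_classification != "Public":
        return RiskTier.MEDIUM
    return RiskTier.LOW


def route(request):
    """Return the review route for an access request."""
    return ROUTES[classify_risk(request)]
```

Keeping the tier rules in one pure function makes the policy easy to unit-test and to review alongside the governance policy document.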
Portal Layer 3 — Per-Team Usage Dashboards
Every approved team has a dedicated usage dashboard showing: monthly token consumption by model; cost breakdown by model and use case; error rates by endpoint; request latency percentiles; budget utilisation vs monthly allocation; quota utilisation vs team limit; and a 90-day trend. The dashboard is accessible by team members and their managers. The platform team has a cross-team view. Finance has a read-only cost-attribution view. Usage data is updated in near-real-time (latency: <2 minutes from API call to dashboard).
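The cost breakdown and budget utilisation figures on the dashboard reduce to simple arithmetic over per-call records. The sketch below is illustrative only: the model names and per-token prices are hypothetical, and the default thresholds mirror the 50% / 80% / 100% alert levels shown in the architecture diagram.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens; real values come from the catalogue.
PRICE_PER_1K_TOKENS_USD = {"model-large": 0.01, "model-small": 0.001}


def attribute_costs(calls):
    """Sum cost per team from (team, model, tokens) call records."""
    totals = defaultdict(float)
    for team, model, tokens in calls:
        totals[team] += tokens / 1000 * PRICE_PER_1K_TOKENS_USD[model]
    return dict(totals)


def crossed_thresholds(spend, budget, thresholds=(0.5, 0.8, 1.0)):
    """Return every alert threshold the team's budget utilisation has reached."""
    utilisation = spend / budget
    return [t for t in thresholds if utilisation >= t]
```

For example, a team that has spent USD 5 of a USD 6 monthly allocation is at roughly 83% utilisation, so the 50% and 80% alerts would have fired but not the 100% alert.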
Portal Layer 4 — Sandbox/Playground Environment
The Sandbox is a production-isolated environment where developers can explore AI capabilities without consuming production quota, incurring production-attributed costs, or risking production data exposure. The Sandbox provides: a prompt playground UI for interactive LLM experimentation; a pre-configured set of example prompts per capability; a mock tool execution environment for testing agent workflows; real-time token counter showing consumption (deducted from a separate sandbox budget, not team production quota); and complete isolation from production: no production data is accessible in the sandbox, and sandbox API calls are never logged with the same identifiers as production calls. The Sandbox is the mandated first step for any developer new to an AI capability.
Portal Layer 5 — AI Policy Guardrails Visibility
Every AI capability in the catalogue has a Guardrails Panel that shows the developer which controls are automatically applied to their API calls through the portal proxy layer: PII redaction status (on/off; which categories are redacted); prompt injection detection (on/off; sensitivity level); output content filtering (which content categories are filtered); data residency enforcement (which regions the data may be processed in); audit logging (all calls are logged; what is retained; retention period). Teams cannot disable guardrails; they can only see what is applied. Where a team needs a guardrail configuration not available in the standard catalogue, they can submit a Guardrail Exception Request that goes through the platform security team. This creates a visible and auditable exception path rather than a shadow bypass.
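To illustrate what the PII redaction guardrail does to a prompt before it leaves the proxy, the following deliberately simplistic sketch uses two regular expressions. This is an assumption for illustration: a production guardrail would rely on a dedicated detector such as those listed in the Components section rather than hand-written patterns, and the pattern labels here are hypothetical.

```python
import re

# Simplistic illustrative patterns; real PII detection needs far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt):
    """Replace matches with a category placeholder and report what was redacted.

    Returns the sanitised prompt plus a {category: count} dict, which is the
    kind of "guardrail actions" detail an audit record could retain.
    """
    actions = {}
    for label, pattern in PATTERNS.items():
        prompt, count = pattern.subn(f"[{label}]", prompt)
        if count:
            actions[label] = count
    return prompt, actions
```

Returning the redaction counts alongside the sanitised text lets the audit logger record that a guardrail fired without storing the sensitive value itself.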
Portal Layer 6 — Documentation, Golden Paths, and Onboarding
AI APIs in the portal are documented to a higher standard than raw vendor documentation. Each capability has an AI-specific OpenAPI extension spec that includes model name and version, token limits (input and output), pricing per token, data handling declaration, guardrails applied, and known limitations. Golden Path templates (pre-built code patterns for common AI use cases: RAG search, document summarisation, customer support chatbot, code review, structured extraction) are provided in Python, TypeScript, and Java. The developer onboarding flow is: (1) explore catalogue and sandbox; (2) submit access request; (3) receive credentials and golden path template; (4) set up usage monitoring; (5) deploy to staging with portal-proxied API calls. This flow is designed to take less than 1 day end-to-end for low-risk capabilities.
5. Architecture Diagram
flowchart TD
subgraph Developers["Developers and Teams"]
DEV[Developer]
TEAMLEAD[Team Lead / Manager]
FINANCE[Finance / BI]
end
subgraph Portal["AI Developer Portal"]
CATALOGUE[AI API Catalogue\nSearch + Browse + Filter]
REQFLOW[Access Request Workflow\nAuto / Team Lead / Security Review]
SANDBOX[Sandbox / Playground\nIsolated from Production]
DOCS[Documentation Hub\nOpenAPI + AI Extensions + Golden Paths]
GUARDRAILS_VIS[Guardrails Visibility Panel\nPII / Injection / Content / Residency]
end
subgraph ProxyLayer["AI Gateway / Proxy Layer"]
AUTHZ[Auth + RBAC Check\nTeam-Scoped Key Validation]
GUARDRAILS_APPLY[Guardrail Middleware\nPII Redact + Injection Detect]
RATELIMIT[Rate Limit + Quota\nPer-Team Enforcement]
COSTTRACK[Cost Tracker\nToken + Call Attribution]
AUDITLOG[Audit Logger\nImmutable Per-Call Record]
end
subgraph AI_Services["Approved AI Services"]
LLM1[OpenAI / Azure OpenAI\nGPT-4o / GPT-4o-mini]
LLM2[Anthropic\nClaude Sonnet / Haiku]
LLM3[Google Vertex AI\nGemini Pro]
EMBED[Embedding Services]
AGENT[Agent Frameworks]
end
subgraph Observability["Observability and Governance"]
DASHBOARD[Per-Team Usage Dashboard\nTokens / Cost / Errors / Latency]
COSTATTR[Cost Attribution Engine\nBU / Team / Use Case]
ALERT[Budget + Quota Alerts\n50% / 80% / 100%]
APPROVAL_SVC[Approval Engine\nAuto / Human Review]
end
DEV --> CATALOGUE
DEV --> SANDBOX
DEV --> DOCS
CATALOGUE --> REQFLOW
REQFLOW --> APPROVAL_SVC
APPROVAL_SVC -->|Approved| DEV
DEV -->|API Call via Portal| AUTHZ
AUTHZ --> GUARDRAILS_APPLY
GUARDRAILS_APPLY --> RATELIMIT
RATELIMIT --> COSTTRACK
COSTTRACK --> AUDITLOG
AUDITLOG --> LLM1
AUDITLOG --> LLM2
AUDITLOG --> LLM3
AUDITLOG --> EMBED
AUDITLOG --> AGENT
COSTTRACK --> DASHBOARD
COSTTRACK --> COSTATTR
COSTATTR --> ALERT
TEAMLEAD --> DASHBOARD
FINANCE --> COSTATTR
DEV --> GUARDRAILS_VIS
6. Components
| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| AI API Catalogue | UI + Database | List approved AI services; search; access request trigger | Backstage (open source); Port; Cortex; custom React + PostgreSQL | High |
| Access Request Workflow Engine | Workflow | Route access requests; auto-approve or human-review; issue credentials | Jira Service Management; ServiceNow; custom Temporal workflow | High |
| Sandbox / Playground | UI + Infrastructure | Isolated interactive AI experimentation environment | Custom React UI + isolated API proxy; PromptLayer; LangSmith | High |
| AI Gateway / Proxy | Infrastructure | Single entry point for all AI API calls; enforces controls | Kong; Apigee; AWS API Gateway; LiteLLM Proxy; custom | Critical |
| Guardrail Middleware | Processing | PII redaction, prompt injection detection, content filtering on all API calls | Microsoft Presidio; NeMo Guardrails; AWS Comprehend; custom | Critical |
| Rate Limit and Quota Engine | Processing | Enforce per-team token and call limits; prevent quota exhaustion | Redis + sliding window algorithm; Kong rate limiting plugin | High |
| Cost Tracker + Attribution Engine | Analytics | Per-call cost metering; attribute to team/user/use-case | Custom Kafka consumer + DynamoDB; ClickHouse; BigQuery | High |
| Audit Logger | Security | Immutable per-call audit record: timestamp, team, model, token counts, guardrail actions | S3 + Object Lock; Azure Immutable Blob; ClickHouse | Critical |
| Per-Team Usage Dashboard | Reporting | Self-service usage and cost visibility for teams | Grafana + API datasource; Retool; custom React + Chart.js | High |
| Documentation Hub | Content | AI-extended OpenAPI specs; golden path templates; onboarding guides | Backstage TechDocs; GitBook; Confluence | High |
| Credential Manager | Security | Issue, rotate, and revoke team-scoped API keys | HashiCorp Vault; AWS Secrets Manager; Azure Key Vault | Critical |
| Budget and Quota Alert System | Operations | Notify team leads and platform on threshold breach | Email + Slack; PagerDuty; SNS + SES | Medium |
| Approval Engine | Governance | Auto-approve or route for human review based on capability risk tier | Custom rule engine; Jira workflow; ServiceNow | High |
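The Rate Limit and Quota Engine row names a sliding window algorithm. A minimal in-memory sketch of a sliding-window-log limiter follows; it is an illustration under stated assumptions, not the reference implementation. A production deployment would keep the per-team state in a shared store such as Redis so that all proxy instances see the same counters, and the class and parameter names are hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_calls per team within any window_seconds interval."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock                  # injectable so tests can control time
        self._calls = defaultdict(deque)    # team -> timestamps of permitted calls

    def allow(self, team):
        now = self.clock()
        calls = self._calls[team]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False                    # the proxy would answer with HTTP 429
        calls.append(now)
        return True
```

Because state is keyed by team, one team exhausting its allowance does not affect any other team's calls.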
7. Data Flow
Primary Flow (Developer Onboarding and First API Call)
| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | Developer | Browse AI API Catalogue; identify capability needed | Selected capability from catalogue |
| 2 | Developer | Explore capability in Sandbox/Playground | Confidence in capability suitability |
| 3 | Developer | Submit access request: team, use case, data classification | Access request record in workflow engine |
| 4 | Approval Engine | Evaluate risk tier; auto-approve or route for review | Approval decision within configured SLA |
| 5 | Credential Manager | Issue team-scoped API key with configured limits | API key delivered to developer via secure channel |
| 6 | Developer | Configure application to call AI APIs through portal proxy (not direct vendor) | Application configured with portal endpoint + team API key |
| 7 | AI Gateway | Receive API call; validate team API key; check RBAC | Authenticated and authorised request |
| 8 | Guardrail Middleware | Apply PII redaction; prompt injection scan; content filter | Sanitised request ready for model |
| 9 | Rate Limit Engine | Check team's remaining quota; decrement counter | Permitted or rate-limited response |
| 10 | AI Vendor API | Execute inference; return response | Raw model response |
| 11 | Cost Tracker | Record tokens consumed; attribute to team; update running total | Cost record attributed to team |
| 12 | Audit Logger | Write immutable call record: team, model, tokens, guardrail actions, latency | Audit log entry |
| 13 | Usage Dashboard | Update near-real-time dashboard with this call's data | Dashboard updated within 2 minutes |
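Steps 7 to 12 of the primary flow can be read as a fixed pipeline inside the gateway. The sketch below shows that ordering with each stage passed in as a callable; the function and parameter names are hypothetical, and a real gateway would implement the stages as middleware or plugins rather than plain functions.

```python
from dataclasses import dataclass


@dataclass
class CallContext:
    team: str
    model: str
    prompt: str


def handle_call(ctx, *, authorise, redact, within_quota, call_vendor, record_cost, audit):
    """Run one API call through the gateway stages in the order of steps 7-12.

    Returns an (http_status, response_body) pair.
    """
    if not authorise(ctx.team, ctx.model):                     # step 7: key + RBAC check
        return 403, None
    ctx.prompt = redact(ctx.prompt)                            # step 8: guardrails
    if not within_quota(ctx.team):                             # step 9: rate limit / quota
        return 429, None
    response, tokens = call_vendor(ctx.model, ctx.prompt)      # step 10: inference
    record_cost(ctx.team, ctx.model, tokens)                   # step 11: cost attribution
    audit(ctx.team, ctx.model, tokens)                         # step 12: audit record
    return 200, response
```

Passing the stages in as callables keeps the ordering explicit and lets each stage be replaced by a stub in tests.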
Error Flow
| Failure | Impact | Detection | Recovery |
|---|---|---|---|
| Guardrail Middleware Outage | Calls not filtered; PII may reach vendor API | Health check on middleware; portal canary test | Block all API calls until middleware restored; alert security team |
| AI Vendor API Unavailable | Teams cannot reach approved AI capability | Health check; circuit breaker in proxy | Serve cached unavailability notice; route to alternate approved model if configured |
| Team Quota Exhausted | Rate limit enforced; developer's calls rejected | Rate limit metric; developer receives 429 | Developer notified; team lead can request quota increase via self-service |
| Audit Log Pipeline Failure | Calls not being logged; compliance gap | Log pipeline health check | Block API calls until logging restored (fail-safe); alert compliance team |
| Credential Compromise | Team API key found in public repository | GitHub secret scanning alert; anomaly in usage patterns | Immediately rotate key; audit calls made with compromised key; notify team |
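The "AI Vendor API Unavailable" row relies on a circuit breaker in the proxy with optional routing to an alternate approved model. A minimal sketch of that behaviour follows, assuming vendor failures surface as `ConnectionError`; the class name, thresholds, and reset interval are hypothetical.

```python
import time


class CircuitBreaker:
    """Stop calling a failing vendor for a cool-off period and use a fallback instead."""

    def __init__(self, failure_threshold=3, reset_after_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, primary, fallback):
        # While open, skip the primary until the reset interval has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = primary()
        except ConnectionError:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

Here `fallback` stands in for either the cached unavailability notice or a call to an alternate approved model, whichever the team has configured.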
8. Security Considerations
Portal Security Controls
| Domain | Control | Implementation | Notes |
|---|---|---|---|
| Authentication | SSO with MFA required to access portal and request access; API keys are team-scoped, not personal | SAML/OIDC SSO; MFA enforced | Prevents shared or personal credential use |
| Authorisation | RBAC: Developer (read catalogue, submit requests, view own team's dashboard); Team Lead (approve team requests, view team dashboard); Platform Admin (all); Finance (cost reports only) | RBAC in portal application; portal proxy validates team API key against capability permissions | |
| Secrets | Team API keys stored in encrypted credential manager; never shown in UI after initial issuance; rotated every 90 days | HashiCorp Vault; AWS Secrets Manager | Rotation prevents long-lived key exposure |
| Classification | Catalogue entries tagged with maximum approved data classification; proxy enforces classification: calls containing Confidential data to Public-data-only APIs are blocked | Data classification middleware in proxy | |
| Encryption | All portal traffic TLS 1.3; audit logs encrypted at rest with CMEK; API keys encrypted in credential manager | Cloud-native TLS; CMEK | |
| Auditability | Every access request, approval, credential issuance, rotation, and revocation is logged immutably alongside per-call API audit records | Append-only audit tables; S3 Object Lock for API call logs | |
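Two of the controls in this table are simple comparisons: the classification ceiling check and the 90-day key rotation rule. The sketch below illustrates both. It assumes a three-tier classification scheme and assumes the classification of a call's input is already known to the proxy (for example, declared by the caller or detected by middleware); all function names are hypothetical.

```python
from datetime import date, timedelta

# Illustrative tiers ordered from least to most sensitive.
CLASSIFICATION_RANK = {"Public": 0, "Internal": 1, "Confidential": 2}


def is_call_permitted(input_classification, max_approved_classification):
    """True when the input's classification does not exceed the capability's ceiling."""
    return (CLASSIFICATION_RANK[input_classification]
            <= CLASSIFICATION_RANK[max_approved_classification])


def rotation_due(issued_on, today, max_age_days=90):
    """True when a team API key has reached the rotation age."""
    return today - issued_on >= timedelta(days=max_age_days)
```

A call carrying Confidential input to a capability approved only up to Internal would be blocked, while Internal input to the same capability would pass.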
OWASP LLM Top 10 — Portal Controls
| OWASP LLM Risk | Portal Control | Implementation |
|---|---|---|
| LLM01 Prompt Injection | Prompt injection detection applied to all API calls through the portal proxy | NeMo Guardrails; custom pattern matcher; block or flag high-confidence injections |
| LLM02 Insecure Output Handling | Output validation middleware strips executable content before returning to caller | Output schema validation; content type enforcement |
| LLM03 Training Data Poisoning | Not applicable to inference portal; addressed in training pipeline pattern | Portal is inference-only; training pipelines are separate |
| LLM04 Model Denial of Service | Per-team rate limiting and quota prevents any single team from exhausting shared capacity | Sliding window rate limiter; per-team token budget |
| LLM05 Supply Chain Vulnerabilities | Only approved AI vendors in catalogue; all vendor integrations security-reviewed | Vendor approval process; DPA in place for all catalogue entries |
| LLM06 Sensitive Information Disclosure | PII redaction applied to all API calls before forwarding to vendor | Microsoft Presidio; AWS Comprehend; configurable per data classification |
| LLM07 Insecure Plugin Design | Agent framework capabilities reviewed before catalogue listing; tool permissions documented | Tool permission documentation required in catalogue entry; excessive-permission tools rejected |
| LLM08 Excessive Agency | Agentic capabilities in catalogue have mandatory cost ceiling documentation; human oversight guardrails documented | Catalogue entry requires cost ceiling + human oversight method for agent capabilities |
| LLM09 Overreliance | Catalogue entries include known limitations and recommended human review guidance | Mandatory "Limitations and Caveats" section in every catalogue entry |
| LLM10 Model Theft | Portal proxy does not expose model weights or architecture; only inference results | Proxy design: forward only inference requests; no model download capability |
9. Governance Considerations
Portal Governance
| Domain | Requirement | Owner | Cadence |
|---|---|---|---|
| Catalogue Currency | All catalogue entries reviewed for accuracy; deprecated capabilities removed | Platform Engineering | Quarterly |
| Access Request SLAs | Low-risk: auto-approve; medium: 24h; high: 48h | Platform Security | Per-request |
| Guardrail Policy | Guardrail configurations reviewed and updated as new attack vectors emerge | Platform Security + AI Governance | Quarterly + on incident |
| Budget Allocation | Team AI budgets reviewed and adjusted | Finance + BU heads | Quarterly |
| Audit Log Review | Audit logs reviewed for anomalous patterns; exported for compliance | Platform Security | Monthly |
| Golden Path Currency | Templates tested against current API versions; updated when breaking changes occur | Platform Engineering | On model version change |
Governance Artefacts
| Artefact | Description | Retention |
|---|---|---|
| AI API Catalogue Version History | Record of when capabilities were added, modified, or retired | Permanent |
| Access Request and Approval Records | All access requests with justification and approval decision | 7 years |
| Team API Key Issuance and Rotation Log | When keys were issued, to whom, rotated, or revoked | 7 years |
| Per-Call Audit Logs | Immutable record of every AI API call through the portal | 7 years |
| Monthly Cost Attribution Reports | Per-team, per-capability cost summaries for chargeback | 7 years |
| Guardrail Exception Requests | Approved exceptions to standard guardrail configuration | 5 years |
10. Operational Considerations
Monitoring and SLOs
| SLO | Target | Measurement | Breach Action |
|---|---|---|---|
| Portal Gateway Availability | 99.9% per month | Synthetic probes every 60 seconds | P1 incident; investigate; notify all teams |
| Access Request Processing | Low-risk: <15 minutes; medium: <24h; high: <48h | Request-to-approval duration | SLA breach alert to platform team; manual escalation |
| Guardrail Middleware Latency | <100ms added to 99th percentile API call | P99 latency of guardrail processing | Investigate; scale horizontally; async mode for non-blocking guardrails |
| Dashboard Data Freshness | <2 minutes from API call to dashboard | Data pipeline lag metric | Alert data engineering; manual cache refresh |
| Audit Log Integrity | 100% of API calls have corresponding audit log entry | Reconciliation: API call count vs log entry count | Immediately investigate; may be compliance-reportable gap |
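Two of these SLOs translate directly into small calculations. The sketch below shows the monthly downtime budget implied by an availability target and a reconciliation check for audit log integrity. Comparing call identifiers rather than raw counts is an assumption made here (a stricter variant of the count comparison in the table), and the function names are hypothetical.

```python
def downtime_budget_minutes(availability_target, days_in_month=30):
    """Minutes of allowed downtime per month for a given availability target."""
    return (1.0 - availability_target) * days_in_month * 24 * 60


def missing_audit_entries(gateway_call_ids, audit_entry_ids):
    """Call IDs seen by the gateway that have no matching audit log entry."""
    return set(gateway_call_ids) - set(audit_entry_ids)
```

A 99.9% target over a 30-day month leaves roughly 43 minutes of downtime budget, and any non-empty result from the reconciliation indicates a breach of the 100% audit log integrity target.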
Capacity Planning
The portal proxy adds 50–150ms to API call latency for the guardrail layer. This is acceptable for most AI use cases (human-facing chatbots: <300ms budget; batch processing: latency-insensitive). For latency-critical applications (<100ms budget), an accelerated path with lightweight guardrails may be needed. The portal proxy should be horizontally scalable behind a load balancer, with auto-scaling triggered at 60% CPU utilisation.
Disaster Recovery
| Scenario | Impact | Recovery |
|---|---|---|
| Portal Proxy Outage | All AI API calls fail through portal | Direct vendor access (break-glass credentials) available for P0 production incidents; restore portal within 4 hours |
| Catalogue Database Failure | Cannot browse or request new capabilities | Restore from backup within 1 hour; existing team credentials unaffected |
| Audit Log Store Unavailable | Compliance gap during outage | Block new API calls (fail-safe); restore audit store; reconcile calls during gap from proxy access logs |
11. Cost Considerations
Cost Drivers
| Cost Driver | Indicative Cost | Notes |
|---|---|---|
| Portal Proxy Infrastructure | USD 2,000–20,000/month | Scales with request volume; Kong / Apigee licensing or cloud-native API GW |
| Guardrail Middleware | USD 1,000–10,000/month | PII detection cost scales with text volume |
| Usage Dashboard Infrastructure | USD 500–5,000/month | Grafana + data store; or Retool licensing |
| Catalogue and Portal Application | USD 1,000–5,000/month | Hosting; Backstage or Port licensing |
| Platform Engineering FTE | USD 300,000–600,000/year | 1–2 FTE to build and maintain |
| Audit Log Storage | USD 200–2,000/month | Scales with call volume; Object Lock WORM storage |
AI Spend Governance Value (Cost Savings Through Portal)
| Benefit | Estimated Value |
|---|---|
| Volume discount through consolidated API keys | 10–30% reduction in per-token pricing |
| Elimination of shadow AI spend | 20–40% reduction in untracked spend |
| Model tier routing (routing simple tasks to cheaper models) | 30–60% reduction in per-task model cost |
| Duplicate infrastructure elimination | USD 50,000–200,000/year in avoided team-level AI infrastructure spend |
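The model tier routing estimate can be sanity-checked with simple arithmetic. If a fraction s of tasks is routed to a cheaper model whose per-task cost is a ratio r of the premium model's cost, the blended cost is (1 - s) + s * r of the original, so the saving is s * (1 - r). The sketch below encodes that formula; the example inputs are hypothetical, not measured values.

```python
def blended_saving(share_routed, cheap_price_ratio):
    """Fractional cost reduction from routing a share of tasks to a cheaper model.

    share_routed:      fraction of tasks sent to the cheaper model (0..1)
    cheap_price_ratio: cheaper model's per-task cost as a fraction of the premium model's
    """
    blended_cost = (1.0 - share_routed) + share_routed * cheap_price_ratio
    return 1.0 - blended_cost
```

For instance, routing half of all tasks to a model that costs one tenth as much gives a saving of about 45%, which sits inside the 30–60% range quoted in the table.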
Indicative Implementation Cost Range
| Organisation | Annual Portal Cost | Notes |
|---|---|---|
| Small (3–10 teams, <1M API calls/month) | USD 200,000–500,000 | Lightweight stack; Backstage + Kong |
| Mid-size (10–50 teams, 1M–50M calls/month) | USD 500,000–1,500,000 | Full stack; dedicated platform team |
| Large enterprise (50+ teams, >50M calls/month) | USD 1,500,000–5,000,000 | Enterprise licensing; large portal team |
12. Trade-Off Analysis
Architecture Options
| Option | Description | Pros | Cons | Recommended For |
|---|---|---|---|---|
| Option A: Build on Backstage + LiteLLM Proxy | Open-source Backstage for portal UI + LiteLLM Proxy for AI gateway | Lowest licensing cost; full customisation; strong community | Requires platform engineering investment; ongoing maintenance | Organisations with strong platform engineering capability |
| Option B: Commercial Internal Developer Portal (Port, Cortex) + Cloud API GW | Commercial IDP platform + cloud-native API gateway | Faster time to value; managed maintenance; enterprise support | Higher licensing cost; less flexibility | Organisations with limited platform engineering capacity |
| Option C: Cloud Provider AI Platform (Azure AI Studio, AWS Bedrock console) | Use cloud provider's native AI portal capabilities | Seamlessly integrated with provider ecosystem; no build cost | Vendor lock-in; limited customisation; single-cloud only | Organisations already deeply committed to a single cloud provider |
Architectural Tensions
| Tension | Trade-Off | Resolution |
|---|---|---|
| Governance vs Developer Velocity | Strong controls (approval workflows, guardrails) slow AI adoption | Auto-approve low-risk capabilities; guardrails are applied in the proxy and require no implementation effort from developers, while remaining visible in the Guardrails Panel |
| Centralised Platform vs Team Autonomy | Centralised portal removes team control over AI infrastructure | Teams retain control over prompts, use cases, and integration; portal controls only what crosses governance boundary |
| Completeness vs Time to Value | Comprehensive portal takes 6–12 months to build; teams need AI now | Phase 1 (8 weeks): catalogue + access request + proxy + audit log; Phase 2: sandbox + dashboard; Phase 3: golden paths + advanced features |
| Proxy Latency vs Control | Every additional middleware layer adds latency | Profile guardrail latency; async processing for non-blocking guardrails; hardware acceleration for high-volume paths |
13. Failure Modes
| Failure | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| Portal Becomes Shadow IT Bypass Target | High | High: teams route around portal to direct vendor APIs | API call origin monitoring; vendor invoice vs portal call count mismatch | Enforce portal use via network egress rules; no direct vendor access from production networks |
| Guardrail False Positives Block Legitimate Calls | Medium | Medium: developer productivity impact; trust in portal erodes | Developer feedback; error rate spike on guardrail decision | Tune guardrail sensitivity; add team-specific exception with audit |
| Catalogue Staleness | High | Medium: developers use outdated API specs; misconfigurations | Version mismatch alerts; developer-reported errors | Implement automated API spec refresh from vendor APIs; quarterly manual review |
| Audit Log Gap | Low | Critical: compliance exposure; cannot demonstrate what AI calls were made | Log pipeline monitoring; reconciliation check | Fail-safe: block API calls if audit log unavailable |
| Access Request Bottleneck | Medium | Medium: 48h SLA for high-risk capabilities delays AI adoption | Request backlog metric; SLA breach rate | Add reviewers; pre-approve common high-risk patterns; escalate to platform leadership |
| Portal Single Point of Failure | Low | Critical: all AI calls fail if proxy is down | Availability monitoring; synthetic probes | Multi-AZ deployment; auto-scaling; break-glass direct access for P0 production |
Cascading Failure Scenario
The AI Developer Portal is deployed with a proxy that adds PII redaction and audit logging but does not have multi-AZ redundancy. A database maintenance window causes the audit log store to be unavailable for 45 minutes. The proxy is configured to fail open (allow calls even when audit logging is unavailable) to prevent developer disruption. During the 45-minute window, 50,000 API calls are made without audit logging. A subsequent compliance audit finds the logging gap. Because the portal audit log was the only per-call record tied to team API keys, investigators cannot fully reconstruct which calls were made during the window. A data subject's Subject Access Request cannot be fully satisfied because some AI calls made during this period are unknown. The GDPR Article 30 record of processing is incomplete for this period. Remediation: configure the proxy to fail safe (block calls if audit logging is unavailable); add a multi-AZ audit log with a write-ahead buffer.
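The remediation can be sketched as an audit logger that buffers entries while the store is down and blocks calls once the buffer is full. This is a minimal illustration under stated assumptions: the buffer here is an in-memory list, whereas a real write-ahead buffer would need durable storage (for example local disk or a replicated queue), store failures are assumed to surface as `ConnectionError`, and all names are hypothetical.

```python
class AuditUnavailable(Exception):
    """Raised when an entry can be neither written nor buffered."""


class BufferedAuditLogger:
    def __init__(self, store_write, buffer_capacity=1000):
        self.store_write = store_write   # callable; raises ConnectionError when the store is down
        self.capacity = buffer_capacity
        self.buffer = []                 # stand-in for a durable write-ahead buffer

    def flush(self):
        """Write buffered entries to the store in their original order."""
        while self.buffer:
            self.store_write(self.buffer[0])
            self.buffer.pop(0)

    def record(self, entry):
        try:
            self.flush()
            self.store_write(entry)
        except ConnectionError:
            if len(self.buffer) >= self.capacity:
                raise AuditUnavailable("audit store down and buffer full")
            self.buffer.append(entry)


def guarded_call(logger, entry, call_vendor):
    """Fail safe: the vendor is called only after the audit entry is secured."""
    try:
        logger.record(entry)
    except AuditUnavailable:
        return 503, None
    return 200, call_vendor()
```

The buffer absorbs short outages without disrupting developers, while the capacity limit bounds how many calls can proceed before the proxy starts blocking.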
14. Regulatory Considerations
| Regulation | Portal Relevance | Portal Control | Reference |
|---|---|---|---|
| GDPR Article 30: Records of Processing | Every AI API call via portal creates a processing record | Audit logger generates records for Article 30 compliance | GDPR Article 30 |
| GDPR Article 25: Privacy by Design | Guardrails (PII redaction) embedded in portal proxy | Privacy-by-default: PII redaction on by default for all data classifications | GDPR Article 25 |
| Privacy Act APP 11: Security | Portal enforces security controls across all AI API usage | Auth, RBAC, audit logging, guardrails | APP 11 |
| APRA CPS234 ¶17: Controls | Portal implements preventive and detective controls for all AI API use | Auth + guardrails + audit = preventive + detective | CPS234 Paragraph 17 |
| EU AI Act: Transparency and Documentation | Catalogue entries include transparency information and limitations | Mandatory "Limitations and Caveats" section; data handling declaration | EU AI Act Article 13 |
| ISO 42001 Clause 7: Support | Portal provides the awareness and documentation support required by Clause 7 | Documentation Hub + Golden Paths serve ISO 42001 awareness obligation | ISO/IEC 42001 Clause 7.2–7.3 |
| SOX / Financial Controls | AI spend attribution for financial services companies; audit trail | Cost Attribution Engine + immutable audit logs | SOX Section 302 |
| NIST AI RMF GOVERN 1.3: Policies | Portal enforces AI usage policies across all teams | Guardrails applied uniformly; no bypasses; policy visible in catalogue | NIST AI RMF GOVERN 1.3 |
15. Reference Implementations
AWS
| Component | AWS Service / Tool |
| --- | --- |
| AI API Catalogue | AWS Service Catalog + Backstage (EC2/ECS hosted) |
| AI Gateway / Proxy | Amazon API Gateway + AWS Lambda (guardrail middleware) |
| Guardrail Middleware | Amazon Comprehend (PII) + custom Lambda |
| Rate Limiting | API Gateway usage plans + Lambda token counter in DynamoDB |
| Cost Tracker | Kinesis Data Streams + Lambda consumer + DynamoDB cost store |
| Audit Logger | CloudTrail + S3 Object Lock |
| Usage Dashboard | Amazon QuickSight; or Grafana on EC2 |
| Credential Manager | AWS Secrets Manager |
| Sandbox Playground | Custom React app on AWS Amplify + isolated API Gateway |
| Documentation | Backstage TechDocs on S3 + CloudFront |
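The rate-limiting row pairs gateway usage plans with a per-team token counter. The logic of such a counter might look like the sketch below, which is platform-neutral and uses an in-memory dictionary as a stand-in for the DynamoDB table. The class `TokenBudgetCounter`, its fields, and the fixed-window scheme are illustrative assumptions, not a description of how the reference implementation is built.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class TokenBudgetCounter:
    """Hypothetical fixed-window token budget, tracked per team."""

    tokens_per_window: int
    window_seconds: int = 60
    # Maps (team, window index) to tokens used; stands in for a shared store.
    _usage: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def try_consume(self, team: str, tokens: int, now: Optional[float] = None) -> bool:
        """Return True and record usage if the team stays within its budget."""
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        key = (team, window)
        used = self._usage.get(key, 0)
        if used + tokens > self.tokens_per_window:
            return False  # over budget for this window: reject the call
        self._usage[key] = used + tokens
        return True


counter = TokenBudgetCounter(tokens_per_window=100_000)
if not counter.try_consume("team-payments", tokens=1_200):
    raise RuntimeError("token budget exceeded for this window")
```

In a shared store the read-check-write step would need to be a single atomic conditional update, otherwise concurrent proxy instances could both pass the check and overshoot the budget.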
Azure
| Component | Azure Service / Tool |
| --- | --- |
| AI API Catalogue | Azure Developer Portal (APIM) + custom catalogue extension |
| AI Gateway / Proxy | Azure API Management (built-in gateway) |
| Guardrail Middleware | Azure AI Language (PII) + APIM policy |
| Rate Limiting | APIM built-in rate limiting policies |
| Cost Tracker | Azure Event Hubs + Azure Function + Cosmos DB |
| Audit Logger | Azure Monitor + Immutable Blob Storage |
| Usage Dashboard | Power BI + Azure Monitor |
| Credential Manager | Azure Key Vault + Managed Identities |
| Sandbox Playground | Custom app on Azure Static Web Apps + separate APIM instance |
| Documentation | Azure DevOps Wiki + APIM developer portal |
GCP
| Component | GCP Service / Tool |
| --- | --- |
| AI API Catalogue | Apigee Developer Portal; or Backstage on GKE |
| AI Gateway / Proxy | Apigee API Management |
| Guardrail Middleware | Cloud DLP + Cloud Endpoints |
| Rate Limiting | Apigee quota policies |
| Cost Tracker | Cloud Pub/Sub + Cloud Functions + BigQuery |
| Audit Logger | Cloud Audit Logs + Cloud Storage Bucket Lock |
| Usage Dashboard | Looker + BigQuery |
| Credential Manager | Secret Manager |
| Sandbox Playground | Custom app on Cloud Run + separate Apigee environment |
On-Premises / Self-Hosted
| Component | Technology |
| --- | --- |
| AI API Catalogue | Backstage (open source, self-hosted) |
| AI Gateway / Proxy | Kong Gateway (open source) + custom plugins |
| Guardrail Middleware | Microsoft Presidio (open source); NeMo Guardrails |
| Rate Limiting | Kong rate limiting plugin + Redis |
| Cost Tracker | Apache Kafka + Flink + PostgreSQL |
| Audit Logger | Splunk Enterprise + WORM storage |
| Usage Dashboard | Grafana + InfluxDB or PostgreSQL |
| Credential Manager | HashiCorp Vault |
| Sandbox Playground | Custom React app + isolated Kong environment |
| Documentation | Backstage TechDocs + GitBook |
16. Related Patterns
| Pattern ID | Pattern Name | Relationship | Notes |
| --- | --- | --- | --- |
| EAAPL-AGT010 | AI Agent Cost Governance | COMPLEMENTARY | Portal provides cost visibility and budget alerts; Agent Cost Governance provides per-execution controls for agentic workloads |
| EAAPL-CMP004 | Privacy-Preserving AI | COMPLEMENTARY | Portal's PII guardrail middleware operationalises privacy-preserving controls across all teams automatically |
| EAAPL-CMP007 | Data Residency for AI | COMPLEMENTARY | Portal proxy enforces data residency routing; teams see which residency rules apply to their approved capabilities |
| EAAPL-PLT007 | AI Observability Platform | PREREQUISITE | Portal's usage dashboards consume metrics from the AI observability platform |
| EAAPL-SEC001 | Zero-Trust Architecture | PREREQUISITE | Portal API keys are verified through zero-trust identity infrastructure |
| EAAPL-CMP002 | APRA CPS234 AI Security | COMPLEMENTARY | Portal's guardrails and audit logging satisfy CPS234 ¶17 detective and preventive control requirements for AI |
17. Maturity Assessment
Overall Maturity Label: Proven
| Dimension | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Current Level |
| --- | --- | --- | --- | --- | --- | --- |
| API Catalogue | No catalogue | Informal list in wiki | Searchable catalogue with AI-extended specs | Catalogue integrated with CMDB; auto-updated | Catalogue self-populating from AI vendor APIs | Level 3 |
| Access Governance | No process | Email request | Workflow with auto-approve / human review | SLA-tracked; exception management | AI-assisted request routing and risk assessment | Level 3 |
| Guardrails | No guardrails | Manual code reviews | Portal proxy applies guardrails to all calls | Guardrails configurable per team classification | Adaptive guardrails based on real-time threat intelligence | Level 3 |
| Usage Observability | No visibility | Monthly billing reports | Near-real-time per-team dashboards | Anomaly detection; proactive budget alerts | Predictive capacity and spend forecasting | Level 3 |
| Developer Experience | Direct vendor onboarding (weeks) | Basic portal (days) | Golden path templates; sandbox; <1 day onboarding | AI assistant for prompt development in portal | Conversational portal with AI-powered capability recommendation | Level 3 |
18. Revision History
| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0 | 2025-08-15 | EAAPL Working Group | Initial draft |
| 1.1 | 2026-06-12 | EAAPL Working Group | Added cascading failure scenario; expanded reference implementations; added regulatory considerations for ISO 42001 and NIST AI RMF alignment |