EAAPL: Enterprise AI Architecture Pattern Library


[EAAPL-CMP001] APRA CPS230 AI Compliance

Category: Compliance / APRA Prudential Standards
Sub-category: Operational Resilience for AI Systems
Version: 2.0
Maturity: Mature
Tags: APRA, CPS230, operational-resilience, business-continuity, third-party-AI, scenario-testing, critical-operations
Regulatory Relevance: APRA CPS230 (effective 1 July 2025), CPS231, CPS234, CPG 230


1. Executive Summary

APRA Prudential Standard CPS230 Operational Risk Management (effective 1 July 2025) imposes comprehensive operational resilience obligations on Australian authorised deposit-taking institutions (ADIs), general and life insurers, and registrable superannuation entity (RSE) licensees. AI systems that support critical operations are directly in scope: they must be identified, risk-assessed, and maintained within defined tolerance levels for disruption.

This pattern provides the architecture for meeting CPS230 obligations specifically for AI systems, an area where the standard creates obligations that generic operational risk management frameworks do not address. AI systems differ from standard technology in four ways: AI models can fail silently (producing wrong outputs while availability SLOs are still met); AI vendor dependencies may be concentrated (a single foundation model provider); AI systems may be impossible to operate manually during outages (no human fallback for complex ML-driven decisions); and AI incidents may require regulatory notification even when operational SLOs are technically met.

For CIOs, CTOs, and CROs at APRA-regulated entities, this pattern provides a defensible compliance architecture that addresses the CPS230 §17–§46 obligations for AI systems. It is a mandatory pattern for any APRA-regulated entity operating AI systems that support critical operations, and a recommended pattern for AI systems supporting important (non-critical) operations.


2. Problem Statement

Business Problem

APRA-regulated entities are deploying AI systems that support critical operations (credit decisioning, claims processing, fraud detection, superannuation administration) without applying the operational resilience disciplines required for traditional IT systems. CPS230 explicitly requires critical operation identification and resilience management, yet most existing BCM frameworks do not recognise AI system failures as operational disruptions.

Technical Problem

Traditional BCP/DR frameworks are designed for binary system availability (up/down). AI systems can fail in ways that preserve technical availability while delivering materially degraded outputs (silent performance degradation, bias emergence, hallucination). Standard monitoring cannot detect these AI-specific failure modes. CPS230 scenario testing requirements cannot be met with standard disaster recovery exercises.

Symptoms

  • AI systems supporting credit or claims decisions not identified as critical operation supporting technology
  • Business Continuity Plans lack manual fallback procedures for AI-supported decisions
  • Third-party AI provider (e.g., AWS Bedrock, Azure OpenAI) not assessed as a material service provider under CPS230 §28
  • No tolerance for disruption defined for AI-dependent processes
  • Scenario testing does not include AI system failure scenarios

Cost of Inaction

  • Regulatory: APRA enforcement action under CPS230; public letter to board; direction to remediate; financial penalty in extreme cases
  • Operational: AI failure cascading to critical operation disruption without pre-planned response
  • Financial: Disruption to credit, claims, or fund administration affecting customer outcomes and generating AFCA complaints

3. Context

When to Apply

  • Any APRA-regulated entity (ADI, general insurer, life insurer, RSE licensee) with AI systems supporting operations
  • AI systems that directly or indirectly support operations designated as Critical under CPS230 §17
  • AI systems forming part of material service provider relationships under CPS230 §28
  • From 1 July 2025 (the CPS230 effective date) for entities with existing AI systems; immediately for new deployments

When NOT to Apply

  • Non-APRA-regulated entities (but CPS230 alignment is good practice for any regulated financial service)
  • AI systems used exclusively for internal productivity with no customer-facing impact and no critical operation dependency

Prerequisites

  • Critical operation identification completed (CPS230 §17 obligation)
  • AI Model Register (GOV001) operational — provides AI system inventory for critical operation mapping
  • Third-party risk management framework capable of assessing AI vendors
  • Business Continuity Management framework that can be extended for AI-specific scenarios

Industry Applicability

| Entity Type | Effective Date | Critical AI Use Cases | Key CPS230 Sections |
|---|---|---|---|
| ADIs (banks) | 1 Jul 2025 | Credit decisioning, fraud detection, KYC | §17, §19, §28, §43 |
| General insurers | 1 Jul 2025 | Claims assessment, pricing, fraud | §17, §19, §28, §43 |
| Life insurers | 1 Jul 2025 | Underwriting, claims, customer service AI | §17, §19, §28 |
| RSE licensees | 1 Jul 2025 | Member administration, advice, investment | §17, §19, §28 |
| APRA-regulated fintech | 1 Jul 2025 (if ADI licence) | Core banking AI, lending AI | All above |

4. Architecture Overview

The CPS230 AI Compliance architecture addresses four specific obligations in the standard: critical operation identification and mapping (§17), operational risk assessment (§19), third-party arrangement management (§28), and incident notification (§43). A fifth architectural concern—tolerance for disruption—underpins all four.

Critical Operation AI Dependency Mapping. CPS230 §17 requires APRA-regulated entities to identify their critical operations and the resources that support them. For AI systems, this requires a systematic dependency mapping: which AI models support which business processes, which business processes qualify as critical operations, and therefore which AI models are critical operation supporting systems. The AI Model Register (GOV001) provides the inventory; this pattern adds the critical operation tagging field and the dependency mapping tool.

The mapping reveals a common finding in APRA-regulated entities: more operations are AI-dependent than risk teams realise. Fraud detection AI is obviously critical; but customer service chatbots that route complaints, document classification systems that process insurance claims, and model-driven pricing systems are also part of critical operation chains.
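The indirect dependencies described above can be found mechanically once the process chain is written down. The following is a minimal sketch, not the pattern's actual tooling: the model IDs, process names, and the `FEEDS` graph are hypothetical, and a real implementation would read them from the AI Model Register and the business process inventory.

```python
from collections import deque

# Hypothetical data: which processes feed which downstream processes.
FEEDS = {
    "complaint-triage": ["complaints-handling"],
    "document-intake": ["claims-processing"],
    "claims-processing": [],
    "complaints-handling": [],
    "internal-notes": [],
}
# Hypothetical set of processes designated as critical operations.
CRITICAL_OPERATIONS = {"claims-processing", "complaints-handling"}
# Hypothetical register extract: model ID -> processes it directly supports.
MODEL_SUPPORTS = {
    "complaint-router-bot": ["complaint-triage"],
    "claims-doc-classifier": ["document-intake"],
    "meeting-summariser": ["internal-notes"],
}


def reachable_critical_ops(process, feeds, critical_ops):
    """Breadth-first walk downstream from a process, collecting critical operations."""
    seen, queue, hits = {process}, deque([process]), set()
    while queue:
        current = queue.popleft()
        if current in critical_ops:
            hits.add(current)
        for downstream in feeds.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return hits


def critical_ai_dependency_map(model_supports, feeds, critical_ops):
    """Return {model_id: sorted critical operations it supports, directly or indirectly}."""
    mapping = {}
    for model_id, processes in model_supports.items():
        hits = set()
        for process in processes:
            hits |= reachable_critical_ops(process, feeds, critical_ops)
        if hits:
            mapping[model_id] = sorted(hits)
    return mapping


DEPENDENCY_MAP = critical_ai_dependency_map(MODEL_SUPPORTS, FEEDS, CRITICAL_OPERATIONS)
```

In this toy data the complaint-routing chatbot is flagged even though it never touches the critical operation directly, which is the kind of finding the mapping exercise is meant to surface.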

Tolerance for Disruption (TFD) for AI Systems. CPS230 §19 requires entities to define a tolerance for disruption—the maximum period each critical operation can be disrupted before material customer or financial impact occurs. For AI-supported operations, TFD has two components: (1) traditional availability TFD (how long can the AI system be completely unavailable before critical operation is impaired?) and (2) quality TFD (how long can the AI system be producing degraded-quality outputs before critical operation is impaired?). The quality TFD is novel to AI and not in standard BCP frameworks. A credit model producing decisions with accuracy degraded to 60% may still be "available" but is causing material harm after a few thousand decisions.
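To make the two-component TFD concrete, the sketch below evaluates an operation against both limits independently. The class and function names are illustrative assumptions; the 15-minute and 30-minute values are the indicative real-time credit figures from the SLO table in section 10.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ToleranceForDisruption:
    availability: timedelta  # maximum time the AI system may be completely unavailable
    quality: timedelta       # maximum time the AI system may produce degraded outputs


def tfd_breaches(tfd, unavailable_for, degraded_for):
    """Return which TFD components are breached; the two are checked separately."""
    breaches = []
    if unavailable_for > tfd.availability:
        breaches.append("availability")
    if degraded_for > tfd.quality:
        breaches.append("quality")
    return breaches


credit_tfd = ToleranceForDisruption(
    availability=timedelta(minutes=15),
    quality=timedelta(minutes=30),
)

# Silent failure: zero downtime, but outputs have been degraded for 45 minutes.
silent_failure = tfd_breaches(credit_tfd, timedelta(0), timedelta(minutes=45))
```

Here `silent_failure` is `["quality"]`: an availability-only monitor would report the operation as healthy while the quality TFD has already been exceeded.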

Manual Fallback Architecture. CPS230 requires business continuity plans for critical operations. For AI-supported critical operations, the BCP must specify: what is the manual fallback process when the AI is unavailable? Who can execute it? What decision tools (scoring tables, decision trees, expert system) replace the AI? What volume can the manual process handle (throughput capacity)? For many AI systems, the honest answer is "we cannot process this volume manually"—which means the TFD for that operation is effectively zero for the AI component, requiring extremely high availability targets and vendor redundancy.
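The throughput question can be answered with simple arithmetic before a BCP is written. The sketch below is illustrative only; the volumes and staffing figures are hypothetical and would come from the entity's own operational data.

```python
def manual_coverage_ratio(ai_decisions_per_hour, staff_available, decisions_per_staff_hour):
    """Fraction of the AI system's hourly volume that the manual process can absorb (capped at 1.0)."""
    if ai_decisions_per_hour <= 0:
        raise ValueError("ai_decisions_per_hour must be positive")
    manual_capacity = staff_available * decisions_per_staff_hour
    return min(1.0, manual_capacity / ai_decisions_per_hour)


def backlog_after_outage(outage_hours, ai_decisions_per_hour, staff_available, decisions_per_staff_hour):
    """Decisions left unprocessed after an outage, assuming constant arrival and manual rates."""
    manual_capacity = staff_available * decisions_per_staff_hour
    shortfall_per_hour = max(0, ai_decisions_per_hour - manual_capacity)
    return shortfall_per_hour * outage_hours


# Hypothetical figures: 2,000 AI decisions/hour, 10 trained staff, 6 manual decisions per staff-hour.
coverage = manual_coverage_ratio(2000, 10, 6)   # 0.03, i.e. 3% of normal volume
backlog = backlog_after_outage(4, 2000, 10, 6)  # 7760 decisions after a 4-hour outage
```

A coverage ratio this low is the quantitative form of "we cannot process this volume manually", and supports treating the availability TFD for the AI component as effectively zero.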

Third-Party AI Vendor Management (§28). Many APRA-regulated entities rely on cloud AI APIs (AWS Bedrock, Azure OpenAI, Google Vertex AI, Anthropic) as material service providers. CPS230 §28 requires: written agreement, risk assessment, exit strategy, access to performance data, right to audit, concentration risk management. For AI vendors specifically, §28 assessment requires: model version change notification requirements, output quality SLAs (not just availability SLAs), data processing agreement, geographic restrictions on training data, and the vendor's own BCP for the AI service. Concentration risk is particularly relevant: if the enterprise uses a single foundation model provider for multiple critical operations, failure of that provider creates correlated AI risk across operations.
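Concentration risk of this kind can be quantified from the vendor register. The following sketch assumes a hypothetical mapping of critical operations to foundation model providers; the operation and provider names are placeholders.

```python
from collections import defaultdict

# Hypothetical register extract: critical operation -> providers able to serve it.
OPERATION_PROVIDERS = {
    "credit-decisioning": {"provider-a"},
    "fraud-detection": {"provider-a", "provider-b"},
    "claims-assessment": {"provider-a"},
}


def provider_exposure(operation_providers):
    """Share of critical operations that depend on each provider."""
    ops_by_provider = defaultdict(set)
    for operation, providers in operation_providers.items():
        for provider in providers:
            ops_by_provider[provider].add(operation)
    total = len(operation_providers)
    return {provider: len(ops) / total for provider, ops in ops_by_provider.items()}


def operations_lost_if_provider_fails(operation_providers, failed_provider):
    """Operations with no remaining provider once failed_provider is unavailable."""
    return sorted(
        operation
        for operation, providers in operation_providers.items()
        if providers and providers <= {failed_provider}
    )


exposure = provider_exposure(OPERATION_PROVIDERS)
correlated_loss = operations_lost_if_provider_fails(OPERATION_PROVIDERS, "provider-a")
```

In this example every critical operation depends on `provider-a`, and two of the three would have no alternative if it failed, which is the correlated exposure the §28 assessment needs to make visible.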

AI Scenario Testing. CPS230 §22 requires scenario analysis for operational risks, including severe but plausible disruption scenarios. For AI systems, four scenario types must be tested: (1) AI system complete unavailability (standard DR scenario); (2) AI system silent performance degradation (no alerts, model quietly failing); (3) AI vendor outage affecting multiple AI systems simultaneously (concentration risk scenario); (4) Adversarial attack on AI system causing systematic wrong decisions. Each scenario requires a documented test plan, execution record, and findings report.
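One way to track the four scenario types and their three required artefacts is a simple coverage check. This is a sketch under assumed names (`Scenario`, `ScenarioTest`); a GRC tool would normally hold these records.

```python
from dataclasses import dataclass
from enum import Enum


class Scenario(Enum):
    COMPLETE_UNAVAILABILITY = "AI system complete unavailability"
    SILENT_DEGRADATION = "AI system silent performance degradation"
    VENDOR_OUTAGE = "AI vendor outage affecting multiple AI systems"
    ADVERSARIAL_ATTACK = "Adversarial attack causing systematic wrong decisions"


@dataclass(frozen=True)
class ScenarioTest:
    scenario: Scenario
    has_test_plan: bool
    has_execution_record: bool
    has_findings_report: bool

    def is_complete(self):
        """A test counts only when all three artefacts exist."""
        return self.has_test_plan and self.has_execution_record and self.has_findings_report


def outstanding_scenarios(tests):
    """Scenario types that still lack a fully documented test."""
    completed = {test.scenario for test in tests if test.is_complete()}
    return [scenario for scenario in Scenario if scenario not in completed]


tests_this_year = [
    ScenarioTest(Scenario.COMPLETE_UNAVAILABILITY, True, True, True),
    ScenarioTest(Scenario.SILENT_DEGRADATION, True, True, False),  # findings report missing
]
gaps = outstanding_scenarios(tests_this_year)
```

With the records above, `gaps` lists silent degradation, vendor outage, and adversarial attack: the second test was executed but does not count until its findings report exists.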


5. Architecture Diagram

flowchart TD
    subgraph Input["AI Inventory and Mapping"]
        A[AI Model Register]
        B[Business Process Inventory]
    end
    subgraph Core["CPS230 Compliance Controls"]
        C[Critical Operation Mapper]
        D[TFD Monitor]
        E[Vendor Risk Assessor]
    end
    subgraph Output["Evidence and Response"]
        F[(APRA Evidence Package)]
        G[Incident Notification]
        H[Manual Fallback BCP]
    end
    A --> C
    B --> C
    C -->|dependency map| D
    C -->|vendor list| E
    D -->|TFD breach| G
    D --> F
    E --> F
    C --> H
    H --> F
    style A fill:#dbeafe,stroke:#3b82f6
    style B fill:#dbeafe,stroke:#3b82f6
    style C fill:#f0fdf4,stroke:#22c55e
    style D fill:#f0fdf4,stroke:#22c55e
    style E fill:#f0fdf4,stroke:#22c55e
    style F fill:#fef9c3,stroke:#eab308
    style G fill:#fee2e2,stroke:#ef4444
    style H fill:#d1fae5,stroke:#10b981

6. Components

| Component | Type | Responsibility | Technology Options | Criticality |
|---|---|---|---|---|
| Critical Operation AI Dependency Mapper | Analysis Tool | Maps AI systems to business processes; identifies critical operation dependencies | Custom CMDB extension, ServiceNow, Archer | Critical |
| TFD Definition Tool | Governance Process | Structured process for defining availability and quality TFD per AI system | Workshop facilitation template + governance tool | Critical |
| AI Quality Monitor | Monitoring | Monitors AI output quality metrics against quality TFD thresholds | GOV006 bias pipeline + custom quality metrics | Critical |
| Manual Fallback Process Library | BCP Documentation | Documents manual fallback procedures for each AI-dependent critical operation | Confluence, SharePoint, BCP tool | High |
| AI Vendor Risk Assessment Template | Governance Process | AI-specific criteria for CPS230 §28 third-party assessment | Template in GRC system | Critical |
| AI Vendor Register | Data Store | Inventory of AI vendors with assessment status, contract details, concentration exposure | ServiceNow, Archer, or GOV001 extension | High |
| Scenario Test Planning Tool | Governance Process | Structures four AI scenario types; tracks test execution and findings | GRC system + test management tool | High |
| APRA Notification Workflow | Compliance Process | Manages 72-hour APRA notification window; drafts, approves, and submits notifications | ServiceNow workflow + document management | Critical |
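The 72-hour window managed by the APRA Notification Workflow reduces to a deadline calculation that the workflow's SLA monitor can run. The sketch below assumes the clock starts when the entity becomes aware of the incident; that trigger point is an assumption here and should be confirmed against the standard. Function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at):
    """Latest time the notification may be submitted (assumes the clock starts at awareness)."""
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware to avoid ambiguous deadlines")
    return aware_at + NOTIFICATION_WINDOW


def time_remaining(aware_at, now):
    """Time left before the deadline; a negative result means the window has been missed."""
    return notification_deadline(aware_at) - now


aware = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware)
remaining = time_remaining(aware, datetime(2026, 4, 3, 9, 0, tzinfo=timezone.utc))
```

Requiring timezone-aware timestamps is a deliberate choice: incident records are often created across regions, and a naive timestamp could shift the deadline by several hours.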

7. Data Flow

Critical Operation AI Mapping Flow

| Step | Actor | Action | Output |
|---|---|---|---|
| 1 | AI Governance + Business | Extract AI system inventory from GOV001 | AI system list with MRID and use case |
| 2 | Business Operations | Map AI systems to business processes | AI-to-process dependency map |
| 3 | Risk + Operations | Identify business processes qualifying as critical operations (CPS230 §17 criteria) | Critical operation list |
| 4 | Risk | Identify AI systems supporting critical operations | Critical operation AI dependency map |
| 5 | Risk | Define TFD (availability + quality) for each AI-dependent critical operation | Tolerance for disruption table per operation |
| 6 | Engineering | Configure monitoring to detect TFD breaches | Monitoring thresholds aligned to TFD |

8. Security Considerations

AI Security as CPS230 Operational Risk

CPS230 §19 requires assessment of operational risks including cyber and technology risks. AI-specific security risks (adversarial attacks, model theft, prompt injection) must be included in the operational risk assessment for AI-dependent critical operations.

Third-Party AI Vendor Security

§28 requires assessment of service providers' security arrangements. AI vendor security assessment must include: data centre security certifications, penetration testing of AI APIs, model weight security (preventing extraction), and security incident notification obligations.

OWASP LLM Mapping for CPS230

| OWASP LLM Risk | CPS230 Operational Risk Category | Required Control |
|---|---|---|
| LLM03 Training Data Poisoning | Technology risk: model integrity | Vendor training data provenance assessment |
| LLM05 Supply Chain | Third-party risk | §28 vendor assessment with supply chain scope |
| LLM08 Excessive Agency | Operational risk: autonomous AI | TFD-scoped human oversight requirement |

9. Governance Considerations

Board Obligations (CPS230 §7–§9)

The Board must approve the entity's risk management strategy, including operational risk. The Board must also receive quarterly reporting on AI system resilience status, vendor concentration risk, and TFD compliance for AI-dependent critical operations.

Governance Artefacts

| Artefact | Owner | Frequency | CPS230 Reference |
|---|---|---|---|
| Critical Operation AI Dependency Map | CRO | Annual + material change | §17, §19 |
| TFD Compliance Report | CRO | Quarterly | §19 |
| AI Vendor Risk Assessment Reports | Procurement + Risk | Annual per vendor | §28 |
| AI Scenario Test Reports | CRO | Annual | §22 |
| Material Incident Notification Log | CISO + CRO | Per event | §43 |
| Board Operational Risk Report (AI section) | CRO | Quarterly | §7 |

10. Operational Considerations

SLOs Aligned to TFD

| AI System Category | Availability TFD | Quality TFD | Availability SLO | Quality SLO |
|---|---|---|---|---|
| Critical: real-time credit | 15 minutes | 30 minutes | 99.99% | Quality monitor: daily |
| Critical: fraud detection | 5 minutes | 1 hour | 99.999% | Quality monitor: hourly |
| Important: claims processing | 4 hours | 24 hours | 99.9% | Quality monitor: daily |
| Standard: customer service AI | 24 hours | 72 hours | 99.5% | Quality monitor: weekly |
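Two sanity checks help when aligning SLOs to TFD: how much downtime an availability SLO permits over a year, and whether the quality monitoring cadence is short enough to surface a breach within the quality TFD. The sketch below applies both to the indicative values in the table above; the function names and the rule that the monitoring interval should not exceed the quality TFD are assumptions made for illustration.

```python
from datetime import timedelta

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def annual_downtime_budget_minutes(availability_slo_percent):
    """Total downtime per year permitted by an availability SLO."""
    return (100.0 - availability_slo_percent) / 100.0 * MINUTES_PER_YEAR


def cadence_fits_quality_tfd(monitor_interval, quality_tfd):
    """True if the worst-case detection delay (one full interval) stays within the quality TFD."""
    return monitor_interval <= quality_tfd


# (availability SLO %, quality TFD, quality monitoring interval) from the table above
INDICATIVE_ROWS = {
    "real-time credit": (99.99, timedelta(minutes=30), timedelta(days=1)),
    "fraud detection": (99.999, timedelta(hours=1), timedelta(hours=1)),
    "claims processing": (99.9, timedelta(hours=24), timedelta(days=1)),
    "customer service AI": (99.5, timedelta(hours=72), timedelta(weeks=1)),
}

REPORT = {
    name: (
        round(annual_downtime_budget_minutes(slo), 2),
        cadence_fits_quality_tfd(interval, quality_tfd),
    )
    for name, (slo, quality_tfd, interval) in INDICATIVE_ROWS.items()
}
```

On these figures, a 99.99% SLO allows roughly 52.56 minutes of downtime per year. The cadence check returns `False` for real-time credit (daily monitoring against a 30-minute quality TFD) and for customer service AI (weekly against 72 hours), which suggests those rows would need a tighter cadence or a continuous monitor for a quality TFD breach to be detectable in time.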

Disaster Recovery

| Scenario | RTO Target | Recovery Method |
|---|---|---|
| Primary AI endpoint failure | Per TFD above | Failover to secondary endpoint / region |
| AI vendor outage | Per TFD + manual fallback threshold | Manual fallback activation; vendor SLA claim |
| Silent quality degradation | Per quality TFD | Automatic rollback to prior model version |
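The automatic rollback listed for silent quality degradation can be sketched as a version history with a quality floor. Everything here is hypothetical: the registry class, the version labels, and the 0.85 floor are illustrative and do not describe any particular model-serving platform.

```python
class ModelVersionRegistry:
    """Keeps deployment history so the previous version can be restored."""

    def __init__(self):
        self._deployed = []

    def deploy(self, version):
        self._deployed.append(version)

    @property
    def active(self):
        if not self._deployed:
            raise LookupError("no model version deployed")
        return self._deployed[-1]

    def rollback(self):
        """Retire the active version and return the one that becomes active."""
        if len(self._deployed) < 2:
            raise LookupError("no prior version to roll back to")
        self._deployed.pop()
        return self.active


def enforce_quality_floor(registry, observed_quality, quality_floor):
    """Roll back when the observed quality metric falls below the floor; otherwise keep the active version."""
    if observed_quality < quality_floor:
        return registry.rollback()
    return registry.active


registry = ModelVersionRegistry()
registry.deploy("credit-model-v1")
registry.deploy("credit-model-v2")
active_after_check = enforce_quality_floor(registry, observed_quality=0.60, quality_floor=0.85)
```

Rollback only helps when the degradation was introduced by the newer version. If the cause is a shift in input data, the prior version may degrade in the same way, and the manual fallback path remains the recovery method.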

11. Cost Considerations

Indicative Compliance Implementation Cost

| Activity | One-Time Cost | Ongoing Annual Cost |
|---|---|---|
| Critical operation AI dependency mapping | AUD $50,000–$100,000 | AUD $20,000 (annual review) |
| TFD definition and BCP update | AUD $80,000–$150,000 | AUD $30,000 (annual) |
| AI vendor risk assessments (per vendor) | AUD $15,000–$30,000 | AUD $10,000 (annual refresh) |
| Scenario testing programme | AUD $40,000–$80,000 | AUD $40,000 (annual) |
| APRA notification capability | AUD $20,000 | AUD $10,000 (ongoing) |
| Total (one AI vendor) | AUD $205,000–$380,000 | ~AUD $110,000/yr |
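Because the vendor assessment line is priced per vendor, the total scales with the number of AI vendors in scope. The sketch below sums the indicative line items for a given vendor count; the figures are those listed above, and the function names are illustrative.

```python
# Indicative one-time cost ranges (low, high) and annual costs in AUD, per the line items above.
FIXED_ONE_TIME = {
    "dependency_mapping": (50_000, 100_000),
    "tfd_and_bcp": (80_000, 150_000),
    "scenario_testing": (40_000, 80_000),
    "apra_notification": (20_000, 20_000),
}
FIXED_ANNUAL = {
    "dependency_mapping": 20_000,
    "tfd_and_bcp": 30_000,
    "scenario_testing": 40_000,
    "apra_notification": 10_000,
}
PER_VENDOR_ONE_TIME = (15_000, 30_000)
PER_VENDOR_ANNUAL = 10_000


def one_time_total(vendor_count=1):
    """(low, high) one-time cost for the programme with vendor_count AI vendors."""
    low = sum(lo for lo, _ in FIXED_ONE_TIME.values()) + vendor_count * PER_VENDOR_ONE_TIME[0]
    high = sum(hi for _, hi in FIXED_ONE_TIME.values()) + vendor_count * PER_VENDOR_ONE_TIME[1]
    return low, high


def annual_total(vendor_count=1):
    """Ongoing annual cost with vendor_count AI vendors."""
    return sum(FIXED_ANNUAL.values()) + vendor_count * PER_VENDOR_ANNUAL


single_vendor = one_time_total(1)   # (205000, 380000)
three_vendors = one_time_total(3)   # (235000, 440000)
```

With one vendor the line items sum to AUD $205,000 to $380,000 one-time and AUD $110,000 per year; each additional vendor adds AUD $15,000 to $30,000 one-time and AUD $10,000 per year.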

12. Trade-Off Analysis

Option Comparison

| Option | Description | Pros | Cons | Recommended For |
|---|---|---|---|---|
| A: Full CPS230 AI compliance architecture (this pattern) | Comprehensive implementation of §17, §19, §22, §28, §43 obligations | Full regulatory compliance; defensible under examination | Significant implementation cost and effort | All APRA-regulated entities |
| B: Minimum viable compliance | Critical operation mapping + §43 notification only | Lower cost; faster to implement | Residual regulatory risk for §19, §22, §28 | Temporary position during transition; not sustainable |
| C: Standard ITSM extension | Apply existing BCP and vendor management to AI with minor updates | Low incremental cost | Standard ITSM misses AI-specific failure modes; quality TFD not addressed | Not acceptable for AI supporting critical operations |

13. Failure Modes

| Failure | Likelihood | Impact | Detection | Recovery |
|---|---|---|---|---|
| AI system supporting critical operation not identified in dependency map | High (initially) | Critical: unmanaged risk | Discovery via incident or examination | Comprehensive mapping exercise; GOV001 integration |
| Quality TFD breach not detected (no quality monitoring) | Medium | High: harm accruing before detection | Quality monitor gap | Implement quality monitoring per quality TFD |
| APRA notification window missed (72h) | Low | Critical: enforcement risk | Notification workflow SLA monitor | Voluntary disclosure with explanation; legal counsel |
| AI vendor fails without exit strategy | Low | Critical: critical operation disrupted | Vendor health monitoring; contractual notification rights | Pre-positioned fallback vendor; tested exit procedure |

14. Regulatory Considerations

CPS230 Specific Obligations

| Section | Obligation | Architecture Implementation |
|---|---|---|
| §17 | Identify critical operations | Critical operation AI dependency map |
| §19 | Operational risk management | TFD definition; quality monitoring; BCP |
| §22 | Scenario analysis | Four AI scenario tests annually |
| §23 | Business continuity plan | Manual fallback procedures documented |
| §28 | Third-party arrangements | AI vendor risk assessment; written agreements |
| §43 | Material incident notification | 72-hour APRA notification workflow |
| §44 | Significant changes | AI deployments supporting critical operations assessed as significant changes |
| §46 | Post-incident review | AI incident PIR per GOV008 |

CPG 230 (Guidance)

APRA's Prudential Practice Guide CPG 230 provides additional guidance on CPS230 implementation. Key AI implications: entities are expected to be able to demonstrate operational resilience of critical systems including AI; scenario testing should be "severe but plausible"; TFD should be set based on customer and financial impact, not technical convenience.


15. Reference Implementations

All reference implementations involve organisational and governance work more than technology. The architecture reflects standard enterprise tools configured for CPS230 AI compliance.

| Component | Technology |
|---|---|
| Critical Operation AI Mapping | ServiceNow CSDM or Archer with custom AI dependency fields |
| TFD and BCP Documentation | Fusion Framework, ServiceNow BCM, or Confluence with template |
| AI Vendor Risk Assessment | Prevalent, ProcessUnity, or ServiceNow VRM with AI-specific questionnaire |
| Scenario Testing Management | ServiceNow GRC or Archer with scenario testing module |
| APRA Notification | ServiceNow workflow + document management |

16. Related Patterns

| Pattern | Relationship | Dependency Direction |
|---|---|---|
| EAAPL-GOV001 AI Model Register | Input: AI inventory for critical operation mapping | GOV001 → CMP001 |
| EAAPL-GOV008 AI Incident Management | Implements: §43 notification via incident process | GOV008 → CMP001 |
| EAAPL-CMP002 APRA CPS234 | Sibling: companion prudential standard | CMP001 ↔ CMP002 |
| EAAPL-GOV007 AI Audit Trail | Evidence source: retention for §32 record-keeping | GOV007 → CMP001 |

17. Maturity Assessment

Overall Maturity: Mature (Level 4)

| Dimension | Score (1–5) | Evidence |
|---|---|---|
| Regulatory mapping completeness | 5 | All key CPS230 sections mapped to architecture |
| Critical operation AI mapping | 4 | Methodology defined; completeness depends on GOV001 maturity |
| Quality TFD innovation | 4 | Novel concept well-defined; limited industry reference implementations exist |
| Third-party AI vendor management | 4 | §28 requirements mapped; AI-specific assessment criteria defined |
| Scenario testing programme | 3 | Four scenario types defined; execution playbooks still developing |

18. Revision History

| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2024-10-01 | EAAPL Working Group | Initial publication aligned to CPS230 exposure draft |
| 2.0 | 2025-07-01 | EAAPL Working Group | Updated to CPS230 final standard (effective 1 July 2025); quality TFD concept introduced |