Operational Resilience Frameworks for Financial Services

Frequently Asked Questions

What is operational resilience?

Operational resilience is the ability of a financial institution to continue delivering critical business services through disruption. Unlike traditional operational risk management, which focuses on preventing failures, operational resilience assumes disruptions will occur and focuses on maintaining service delivery within defined impact tolerances.

How is operational resilience different from business continuity planning?

Business continuity planning (BCP) focuses on recovering internal processes after disruption. Operational resilience takes a broader, outcome-focused view: it starts with the critical services delivered to end users, maps all dependencies, sets maximum tolerable disruption levels, and tests against severe scenarios. It encompasses traditional BCP but goes beyond it.

What is DORA?

The Digital Operational Resilience Act (DORA) is an EU regulation establishing requirements for ICT risk management, incident reporting, resilience testing, third-party risk management, and information sharing for financial entities. It also establishes direct regulatory oversight of critical ICT third-party providers, such as cloud service providers.

What is an impact tolerance?

An impact tolerance is the maximum tolerable level of disruption to an important business service, beyond which disruption would cause intolerable harm to consumers, market integrity, or financial stability. It is typically expressed in terms of time (maximum outage duration), data integrity, and minimum throughput.

Operational Resilience in Financial Services

Operational resilience has emerged as a critical priority for financial regulators and institutions worldwide. Unlike traditional operational risk management — which focuses on preventing and measuring losses — operational resilience focuses on an institution's ability to continue delivering critical services through disruption.

From Operational Risk to Operational Resilience

The shift in perspective is fundamental:

Operational Risk	Operational Resilience
Prevent failures from happening	Assume failures will happen
Measure losses after events	Maintain services during events
Focus on individual risks	Focus on end-to-end services
Backward-looking (loss data)	Forward-looking (scenario testing)
Risk transfer (insurance)	Service continuity (redundancy)

The COVID-19 pandemic and escalating cyber threats demonstrated that even well-managed firms face disruptions. The question is not whether disruption will occur, but how quickly critical services can be restored.

Regulatory Frameworks

UK Framework (PRA/FCA): The UK pioneered operational resilience regulation with PRA PS6/21 and FCA PS21/3, requiring firms to:

Identify Important Business Services (IBS)
Set impact tolerances for maximum tolerable disruption
Map the resources supporting each IBS
Test ability to remain within impact tolerances through severe but plausible scenarios
Demonstrate full compliance by the end of the transition period (31 March 2025)

EU Framework (DORA): The Digital Operational Resilience Act (DORA), which has applied since 17 January 2025, focuses on ICT resilience and covers:

ICT risk management frameworks
ICT incident reporting
Digital operational resilience testing (including threat-led penetration testing)
Third-party ICT risk management
Information sharing

US Framework: US regulators (Fed, OCC, FDIC) issued joint guidance in October 2020, Sound Practices to Strengthen Operational Resilience, emphasizing:

Critical operations identification
Governance and risk management
Scenario testing and business continuity planning
Third-party dependency management

Building an Operational Resilience Framework

Step 1: Identify Important Business Services

Map the services your institution delivers to external end-users (customers, market participants, counterparties). These are business services, not internal processes. Examples:

Processing payments
Settling securities trades
Providing market liquidity
Administering deposits and lending

Step 2: Set Impact Tolerances

For each important business service, define the maximum tolerable disruption — the point at which disruption would cause intolerable harm to consumers, market integrity, or financial stability. Impact tolerances are expressed in terms of:

Time: Maximum duration of service unavailability
Data: Maximum acceptable data loss or corruption
Volume: Minimum transaction throughput during disruption
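
One way to make these three dimensions concrete is to record each tolerance as a small data structure and compare an observed or simulated disruption against it. The sketch below is illustrative only: the class names, field names, and threshold values are assumptions chosen for the example, not regulatory definitions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    """Hypothetical tolerance for one important business service."""
    service: str
    max_outage_minutes: float      # Time: longest acceptable unavailability
    max_data_loss_minutes: float   # Data: largest acceptable window of lost data
    min_throughput_pct: float      # Volume: minimum share of normal throughput


@dataclass(frozen=True)
class Disruption:
    """Observed or simulated disruption to a service."""
    outage_minutes: float
    data_loss_minutes: float
    throughput_pct: float


def breaches(tolerance: ImpactTolerance, event: Disruption) -> list[str]:
    """Return the tolerance dimensions that the disruption exceeds."""
    exceeded = []
    if event.outage_minutes > tolerance.max_outage_minutes:
        exceeded.append("time")
    if event.data_loss_minutes > tolerance.max_data_loss_minutes:
        exceeded.append("data")
    if event.throughput_pct < tolerance.min_throughput_pct:
        exceeded.append("volume")
    return exceeded


# Assumed example values: 2-hour outage limit, 5 minutes of data loss,
# and at least 60% of normal throughput.
payments = ImpactTolerance("Processing payments", 120, 5, 60)
event = Disruption(outage_minutes=180, data_loss_minutes=0, throughput_pct=40)
print(breaches(payments, event))
```

With these assumed values the disruption breaches the time and volume dimensions but not the data dimension. Keeping the three dimensions separate, rather than collapsing them into one score, shows which kind of remediation a breach calls for.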

Step 3: Map Dependencies

For each important business service, map all resources required for delivery:

People — Key personnel and skills
Technology — Systems, applications, infrastructure
Data — Critical data stores and flows
Facilities — Physical locations and equipment
Third parties — Vendors, cloud providers, market infrastructure

This mapping reveals single points of failure and concentration risks.
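
A dependency map can be queried directly for both findings. The following is a minimal sketch, assuming a simplified map in which each service lists its resources along with a flag for whether a tested fallback exists; the resource names and the fallback flag are hypothetical.

```python
from collections import defaultdict

# Hypothetical map: service -> {resource name: has a tested fallback?}
DEPENDENCIES = {
    "Processing payments": {
        "payments-gateway": False,
        "cloud-provider-a": True,
    },
    "Settling securities trades": {
        "settlement-engine": True,
        "cloud-provider-a": False,
    },
    "Administering deposits and lending": {
        "core-banking": True,
        "cloud-provider-a": True,
    },
}


def single_points_of_failure(deps):
    """List (service, resource) pairs where the resource has no fallback."""
    return sorted(
        (service, resource)
        for service, resources in deps.items()
        for resource, has_fallback in resources.items()
        if not has_fallback
    )


def shared_resources(deps, min_services=2):
    """Map each resource used by at least min_services services to those services."""
    users = defaultdict(set)
    for service, resources in deps.items():
        for resource in resources:
            users[resource].add(service)
    return {
        resource: sorted(services)
        for resource, services in users.items()
        if len(services) >= min_services
    }


print(single_points_of_failure(DEPENDENCIES))
print(shared_resources(DEPENDENCIES))
```

In this example the first query flags two dependencies with no fallback, and the second shows that all three services rely on the same cloud provider. A real mapping would also cover people, data, and facilities, and would usually be stored in a configuration management database or graph store rather than a dictionary.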

Step 4: Scenario Testing

Test the ability to remain within impact tolerances under severe but plausible scenarios:

Major cyber attack (ransomware, DDoS)
Cloud provider outage
Key vendor failure
Pandemic/workforce unavailability
Natural disaster affecting data centers
Regulatory intervention or sanctions event
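
A scenario test can be approximated as a comparison between estimated recovery times and the tolerances set in Step 2. The sketch below assumes a deliberately simple model in which a service stays down until its slowest affected dependency is restored and no fallback is available. All service names, resources, tolerances, and recovery estimates are hypothetical.

```python
# Hypothetical tolerances: maximum outage in hours per service.
TOLERANCE_HOURS = {
    "Processing payments": 2,
    "Settling securities trades": 24,
}

# Hypothetical resources each service depends on.
SERVICE_RESOURCES = {
    "Processing payments": {"payments-gateway", "cloud-provider-a"},
    "Settling securities trades": {"settlement-engine", "cloud-provider-a"},
}

# Hypothetical scenarios: resource knocked out -> estimated hours to restore.
SCENARIOS = {
    "Cloud provider outage": {"cloud-provider-a": 6},
    "Ransomware on settlement engine": {"settlement-engine": 48},
}


def run_scenario(outages):
    """Return {service: (estimated outage hours, within tolerance?)}."""
    results = {}
    for service, resources in SERVICE_RESOURCES.items():
        # The service is down until its slowest affected dependency is back.
        hours = max((outages[r] for r in resources if r in outages), default=0)
        results[service] = (hours, hours <= TOLERANCE_HOURS[service])
    return results


for name, outages in SCENARIOS.items():
    print(name, run_scenario(outages))
```

Under these assumptions, a six-hour cloud outage breaches the payments tolerance but not the settlement tolerance, while the ransomware scenario does the reverse. In practice the recovery estimates would come from exercises and incident data rather than fixed numbers.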

Step 5: Remediation and Investment

Identify vulnerabilities where impact tolerances would be breached and invest in:

System redundancy and failover capability
Alternative processing arrangements
Enhanced stress testing procedures
Improved incident management and communication protocols

Third-Party Risk and Concentration

Financial institutions increasingly depend on a small number of critical third parties — particularly cloud service providers (AWS, Azure, Google Cloud). This creates concentration risk that individual firms cannot fully mitigate.

Regulators are responding with:

Direct oversight powers over critical third parties (EU DORA)
Multi-cloud and exit strategy requirements
Enhanced due diligence and contractual protections
Regular testing of third-party failure scenarios
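
A firm can track its own exposure to provider concentration with a simple metric. The sketch below uses a Herfindahl-Hirschman index (the sum of squared shares) over the number of workloads hosted with each provider. Applying this index to cloud workloads is an assumption made for illustration, not a measure prescribed by any of the regulators above, and the provider names and counts are hypothetical.

```python
def provider_shares(workloads):
    """Convert workload counts per provider into shares that sum to 1."""
    total = sum(workloads.values())
    if total <= 0:
        raise ValueError("workload counts must sum to a positive number")
    return {provider: count / total for provider, count in workloads.items()}


def concentration_index(workloads):
    """Herfindahl-Hirschman index: 1.0 means everything sits with one provider."""
    return sum(share ** 2 for share in provider_shares(workloads).values())


# Hypothetical count of critical workloads hosted with each provider.
workloads = {"provider-a": 70, "provider-b": 20, "provider-c": 10}
print(round(concentration_index(workloads), 2))
```

The index falls toward 1/n as workloads are spread evenly across n providers, so a value close to 1.0 signals heavy reliance on a single provider. It says nothing about how substitutable the providers are, so it complements exit planning and failure testing rather than replacing them.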

Connection to ERM

Operational resilience sits within the broader enterprise risk management framework but requires distinct governance:

Board-level ownership of important business services
Cross-functional coordination spanning IT, operations, compliance, and business units
Investment decisions driven by service criticality, not just risk appetite
Regular reporting on resilience posture and testing outcomes

FRM Exam Perspective

While operational resilience is evolving rapidly, FRM candidates should understand:

The distinction between operational risk and operational resilience
Important business service identification and impact tolerance setting
The role of scenario testing in resilience
Key regulatory frameworks (UK, EU DORA, US guidance)
Third-party and concentration risk considerations
Integration with Basel III operational risk capital requirements
