Operational Resilience in Financial Services

Operational resilience has emerged as a critical priority for financial regulators and institutions worldwide. Unlike traditional operational risk management — which focuses on preventing and measuring losses — operational resilience focuses on an institution's ability to continue delivering critical services through disruption.

From Operational Risk to Operational Resilience

The shift in perspective is fundamental:

Operational Risk                  | Operational Resilience
Prevent failures from happening   | Assume failures will happen
Measure losses after events       | Maintain services during events
Focus on individual risks         | Focus on end-to-end services
Backward-looking (loss data)      | Forward-looking (scenario testing)
Risk transfer (insurance)         | Service continuity (redundancy)

The COVID-19 pandemic and escalating cyber threats demonstrated that even well-managed firms face disruptions. The question is not whether disruption will occur, but how quickly critical services can be restored.

Regulatory Frameworks

UK Framework (PRA/FCA): The UK pioneered operational resilience regulation with PRA PS6/21 and FCA PS21/3, requiring firms to:

  • Identify Important Business Services (IBS)
  • Set impact tolerances for maximum tolerable disruption
  • Map the resources supporting each IBS
  • Test ability to remain within impact tolerances through severe but plausible scenarios
  • Be able to remain within impact tolerances by the end of the transition period (31 March 2025)

EU Framework (DORA): The Digital Operational Resilience Act (DORA), which has applied since 17 January 2025, focuses on ICT resilience across five areas:

  • ICT risk management frameworks
  • ICT incident reporting
  • Digital operational resilience testing (including threat-led penetration testing)
  • Third-party ICT risk management
  • Information sharing

US Framework: US regulators (Fed, OCC, FDIC) issued joint guidance in 2020, "Sound Practices to Strengthen Operational Resilience", emphasizing:

  • Critical operations identification
  • Governance and risk management
  • Scenario testing and business continuity planning
  • Third-party dependency management

Building an Operational Resilience Framework

Step 1: Identify Important Business Services

Map the services your institution delivers to external end-users (customers, market participants, counterparties). These are business services, not internal processes. Examples:

  • Processing payments
  • Settling securities trades
  • Providing market liquidity
  • Administering deposits and lending

Step 2: Set Impact Tolerances

For each important business service, define the maximum tolerable disruption — the point at which disruption would cause intolerable harm to consumers, market integrity, or financial stability. Impact tolerances are expressed in terms of:

  • Time: Maximum duration of service unavailability
  • Data: Maximum acceptable data loss or corruption
  • Volume: Minimum transaction throughput during disruption
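
The three dimensions above can be captured in a small data structure and checked against an observed or simulated disruption. The following is a minimal sketch, not a regulatory template: the class names, field names, units (minutes and percent of normal throughput), and the sample figures are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    """Hypothetical tolerance for one important business service."""
    service: str
    max_outage_minutes: float      # Time: longest tolerable unavailability
    max_data_loss_minutes: float   # Data: longest tolerable window of lost data
    min_throughput_pct: float      # Volume: minimum share of normal throughput


@dataclass(frozen=True)
class Disruption:
    """Observed or simulated disruption, in the same units as the tolerance."""
    outage_minutes: float
    data_loss_minutes: float
    throughput_pct: float


def breached_dimensions(tolerance: ImpactTolerance, event: Disruption) -> list[str]:
    """Return the tolerance dimensions that the disruption exceeds."""
    breaches = []
    if event.outage_minutes > tolerance.max_outage_minutes:
        breaches.append("time")
    if event.data_loss_minutes > tolerance.max_data_loss_minutes:
        breaches.append("data")
    if event.throughput_pct < tolerance.min_throughput_pct:
        breaches.append("volume")
    return breaches


# Illustrative figures only; real tolerances are set by the firm's board.
payments = ImpactTolerance("Processing payments", 240, 15, 60)
event = Disruption(outage_minutes=300, data_loss_minutes=5, throughput_pct=40)
print(breached_dimensions(payments, event))  # ['time', 'volume']
```

Keeping each dimension as a separate field makes it explicit which limit was breached, which matters because a service can be restored on time and still breach its data or volume tolerance.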

Step 3: Map Dependencies

For each important business service, map all resources required for delivery:

  • People — Key personnel and skills
  • Technology — Systems, applications, infrastructure
  • Data — Critical data stores and flows
  • Facilities — Physical locations and equipment
  • Third parties — Vendors, cloud providers, market infrastructure

This mapping reveals single points of failure and concentration risks.
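
One way to see how a dependency map surfaces these weaknesses is to hold it as a nested dictionary and query it. This is a sketch under stated assumptions: the service and resource names are invented, and resources listed together in one category are assumed to be substitutes for each other, which a real mapping would need to verify.

```python
from collections import defaultdict

# Hypothetical mapping: service -> resource category -> resources in that category.
# Several entries in a category are treated as substitutes for one another.
DEPENDENCIES = {
    "Processing payments": {
        "technology": ["payments-engine"],
        "facilities": ["dc-london", "dc-dublin"],
        "third_parties": ["cloud-a"],
    },
    "Settling securities trades": {
        "technology": ["settlement-core", "settlement-standby"],
        "facilities": ["dc-london"],
        "third_parties": ["cloud-a"],
    },
}


def single_points_of_failure(deps):
    """Resources that are the only option in their category for a service."""
    result = {}
    for service, categories in deps.items():
        result[service] = sorted(
            resources[0] for resources in categories.values() if len(resources) == 1
        )
    return result


def shared_resources(deps):
    """Resources used by more than one service (a simple concentration signal)."""
    users = defaultdict(set)
    for service, categories in deps.items():
        for resources in categories.values():
            for resource in resources:
                users[resource].add(service)
    return {r: sorted(s) for r, s in users.items() if len(s) > 1}


print(single_points_of_failure(DEPENDENCIES))
print(shared_resources(DEPENDENCIES))
```

In this invented example, cloud-a is both a single point of failure for each service and a resource shared across services, which is the kind of finding the mapping exercise is meant to expose.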

Step 4: Scenario Testing

Test the ability to remain within impact tolerances under severe but plausible scenarios:

  • Major cyber attack (ransomware, DDoS)
  • Cloud provider outage
  • Key vendor failure
  • Pandemic/workforce unavailability
  • Natural disaster affecting data centers
  • Regulatory intervention or sanctions event
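
A scenario test can be approximated on paper before any live exercise by estimating how long each service would be down and comparing that with its time tolerance. The sketch below makes strong simplifying assumptions: every name and number is hypothetical, a service is assumed to be fully unavailable until its slowest failed dependency recovers, and workarounds or partial service are ignored.

```python
# Hypothetical inputs: time tolerances in minutes and the resources each service needs.
TOLERANCE_MINUTES = {"Processing payments": 240, "Settling securities trades": 1440}
REQUIRES = {
    "Processing payments": {"payments-engine", "cloud-a"},
    "Settling securities trades": {"settlement-core", "cloud-a"},
}

# Each scenario maps a failed resource to an assumed recovery time in minutes.
SCENARIOS = {
    "Cloud provider outage": {"cloud-a": 480},
    "Ransomware on payments engine": {"payments-engine": 2880},
}


def run_scenario(failed):
    """Return {service: (estimated_outage_minutes, within_tolerance)}."""
    results = {}
    for service, resources in REQUIRES.items():
        # The service is down until its slowest failed dependency recovers.
        outage = max((failed[r] for r in resources if r in failed), default=0)
        results[service] = (outage, outage <= TOLERANCE_MINUTES[service])
    return results


for name, failed in SCENARIOS.items():
    for service, (outage, ok) in run_scenario(failed).items():
        status = "within tolerance" if ok else "BREACH"
        print(f"{name} | {service}: {outage} min, {status}")
```

The recovery times are the weakest input here; in practice they would come from recovery exercises and vendor commitments rather than from a table of assumed values.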

Step 5: Remediation and Investment

Identify vulnerabilities where impact tolerances would be breached and invest in:

  • System redundancy and failover capability
  • Alternative processing arrangements
  • Enhanced stress testing procedures
  • Improved incident management and communication protocols

Third-Party Risk and Concentration

Financial institutions increasingly depend on a small number of critical third parties — particularly cloud service providers (AWS, Azure, Google Cloud). This creates concentration risk that individual firms cannot fully mitigate.

Regulators are responding with:

  • Direct oversight powers over critical third parties (EU DORA)
  • Multi-cloud and exit strategy requirements
  • Enhanced due diligence and contractual protections
  • Regular testing of third-party failure scenarios
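
The concentration risk described above can be given a rough number by borrowing the Herfindahl-Hirschman Index (HHI) from competition analysis and applying it to a firm's provider mix. This is an illustrative metric chosen for the sketch, not one prescribed by the frameworks discussed here, and the workload inventory is hypothetical.

```python
from collections import Counter

# Hypothetical inventory: one entry per critical workload, naming its provider.
WORKLOAD_PROVIDERS = ["cloud-a", "cloud-a", "cloud-a", "cloud-b", "on-prem"]


def provider_shares(providers):
    """Fraction of workloads hosted by each provider."""
    counts = Counter(providers)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def hhi(shares):
    """Sum of squared shares: 1/n for an even split, 1.0 for a single provider."""
    return sum(s * s for s in shares.values())


shares = provider_shares(WORKLOAD_PROVIDERS)
print(shares)
print(round(hhi(shares), 2))  # 0.44
```

Counting workloads equally is a simplification; weighting each workload by the criticality of the service it supports would give a figure closer to the firm's actual exposure.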

Connection to ERM

Operational resilience sits within the broader enterprise risk management framework but requires distinct governance:

  • Board-level ownership of important business services
  • Cross-functional coordination spanning IT, operations, compliance, and business units
  • Investment decisions driven by service criticality, not just risk appetite
  • Regular reporting on resilience posture and testing outcomes

FRM Exam Perspective

While operational resilience is evolving rapidly, FRM candidates should understand:

  • The distinction between operational risk and operational resilience
  • Important business service identification and impact tolerance setting
  • The role of scenario testing in resilience
  • Key regulatory frameworks (UK, EU DORA, US guidance)
  • Third-party and concentration risk considerations
  • Integration with Basel III operational risk capital requirements