Ensuring compliance with the FCA’s new operational resilience regulations should be a top priority for financial institutions

By Guy Warren, CEO, ITRS Group

 

Earlier this year, the Financial Conduct Authority’s (FCA) long-awaited regulatory framework on operational resilience for financial institutions came into force. Since 31st March, firms have had to ensure that their operational resilience strategies are robust – or face backlash from the regulator.

While the lead time for firms to prepare for this regulatory deadline was generous (the FCA announced the plans for the regulation over a year ago), factors like COVID-induced acceleration in digital transformation and online activity, as well as increased market volatility resulting from Russia invading Ukraine, made it more challenging for firms to make meaningful progress towards operational resilience.

As a result, since the FCA set the timer for this deadline last year, businesses’ IT estates have only grown larger, more complex and unwieldy. It’s clear that many firms still have a long way to go before they can feel confident they have met their compliance objectives.

But it’s not too late. Although the regulation came into force this March, a three-year transitional period means firms actually have until 2025 before the regulator expects them to be operating consistently within the impact tolerances they have set out as part of their operational resilience guidelines.

So what can firms do to ensure they are on the right path?

 

Identify transaction flows

To achieve operational resilience, firms must identify the paths their key services use, target and remove any points of weakness, and build on modern, up-to-date software that can run across multiple machines so that if one fails, the rest can pick up the slack.

Of course, this is not a one-and-done process. As firms inevitably continue in their pursuit of digital transformation, they must seek to replace or update the outdated elements. After all, it’s digital transformation – not digital expansion.

That said, they must take care not to rush. Over 60% of outages occur as a result of poor change management and could be avoided with more careful planning and a system to fall back on if things aren’t up and running in time.

 

Understand performance and uptime

Businesses will soon be expected to declare the level of performance and uptime they are prepared to commit to, and then stick to it. This is something firms should start thinking about today, as calculating it accurately will require significant historical data.

Google has popularised Site Reliability Engineering (SRE), the gold standard of uptime monitoring and performance delivery for internet giants and, increasingly, any firm with digital transformation ambitions. The SRE approach involves tracking data and trends over a long period to identify and quickly fix degrading performance levels. It uses Service Level Indicators (SLIs) and Service Level Objectives (SLOs) as a two-phase early warning system, so that firms are never close to breaching their Service Level Agreement (SLA).
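
As a rough illustration of how SLIs and SLOs can act as an early warning ahead of the SLA, the sketch below computes an availability SLI from request counts and compares it against an internal SLO that is deliberately stricter than the external SLA. The class, function names and target values are assumptions made for this example, not figures from the FCA or from Google.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTargets:
    """Hypothetical targets for one business service."""
    slo: float  # internal objective, deliberately stricter than the SLA
    sla: float  # level committed to externally


def availability_sli(successful: int, total: int) -> float:
    """SLI: fraction of requests served successfully in the window."""
    if total == 0:
        return 1.0  # no traffic in the window, so nothing failed
    return successful / total


def error_budget_consumed(successful: int, total: int, slo: float) -> float:
    """Share of the failures permitted by the SLO that has been used (1.0 = all)."""
    allowed_failures = total * (1.0 - slo)
    if allowed_failures == 0:
        return 0.0 if successful == total else float("inf")
    return (total - successful) / allowed_failures


def classify(sli: float, targets: ServiceTargets) -> str:
    """First alarm when the SLO is missed, second when the SLA itself is breached."""
    if sli < targets.sla:
        return "sla_breach"
    if sli < targets.slo:
        return "slo_warning"
    return "ok"


# Assumed example: 9,985 of 10,000 requests succeeded in the window.
targets = ServiceTargets(slo=0.999, sla=0.995)
sli = availability_sli(successful=9_985, total=10_000)
print(classify(sli, targets), error_budget_consumed(9_985, 10_000, targets.slo))
```

Because the SLO sits above the SLA, a service that slips below its objective raises a warning while there is still room to act before the external commitment is at risk.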

Less digitally native sectors like banking should follow Google’s lead and pursue an SRE approach to operations. While Google has the benefit of massive resources and an incredibly experienced team dedicated to monitoring this data, third-party providers can support smaller businesses with remote specialists and purpose-built software.

 

Optimise Cloud usage

A comprehensive stocktake of the demand profile of business workloads is a critical first step. Firms must begin by right-sizing their estate and developing a thorough understanding of workload behaviour and demand profiles via detailed analytics.

Once a company gathers all this information, it can optimise its environment for the right workload configuration and accurately plan its monthly cloud spend based on a right-sized environment. This means more accurate instance sizes and, in the majority of cases, lower costs.
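
To make the idea of right-sizing concrete, here is a minimal sketch that picks the smallest instance able to cover a workload's 95th-percentile demand plus some headroom, rather than sizing for the absolute peak. The instance catalogue, prices, headroom and sample data are all invented for illustration and do not reflect any real cloud provider.

```python
# Hypothetical catalogue: (name, vCPUs, monthly cost), ordered smallest first.
INSTANCE_CATALOGUE = [
    ("small", 2, 40.0),
    ("medium", 4, 80.0),
    ("large", 8, 160.0),
    ("xlarge", 16, 320.0),
]


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile, using integer arithmetic for the rank."""
    if not samples:
        raise ValueError("no demand samples supplied")
    ordered = sorted(samples)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]


def right_size(vcpu_demand: list[float], pct: int = 95,
               headroom: float = 0.2) -> tuple[str, float]:
    """Return the cheapest catalogue entry covering the chosen percentile plus headroom."""
    required = percentile(vcpu_demand, pct) * (1.0 + headroom)
    for name, vcpus, monthly_cost in INSTANCE_CATALOGUE:
        if vcpus >= required:
            return name, monthly_cost
    raise ValueError("demand exceeds the largest instance in the catalogue")


# Assumed demand profile: mostly quiet, with occasional short spikes.
samples = [1.0] * 90 + [2.5] * 8 + [6.0] * 2
print(right_size(samples))           # sized for typical demand
print(right_size(samples, pct=100))  # sized for the absolute peak
```

Whether the 95th percentile is the right basis depends on the workload; a service with strict impact tolerances may need to be sized closer to its peak.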

 

Pre-test limits

To know for sure that the production environment will run properly at peak demand, pre-testing is essential to gauge what it can withstand. Firms need to identify not only the overall capacity ceiling of their systems, but also the specific bottlenecks and pinch points that can affect overall performance.

The right software will enable firms to model certain levels of demand on their systems. Load testing can simulate the number of users on a platform to see at what point the system will fail, so that firms can provision for it precisely.
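
The stepped approach to load testing can be sketched in a few lines: increase the simulated user count until response times exceed an acceptable limit, and record the last level that passed. In this example the system under test is replaced by a toy latency model, and the capacity, base latency and threshold are assumed values; a real test would drive an actual environment with a dedicated load-testing tool.

```python
from typing import Callable


def simulated_latency_ms(users: int, base_ms: float = 50.0,
                         capacity: int = 1000) -> float:
    """Toy stand-in for a real system: latency climbs sharply near capacity."""
    if users >= capacity:
        return float("inf")  # saturated: requests never complete
    return base_ms * capacity / (capacity - users)


def find_capacity_ceiling(measure: Callable[[int], float],
                          max_latency_ms: float,
                          step: int = 100, limit: int = 2000) -> int:
    """Raise the load in steps; return the highest load that stayed within the limit."""
    last_ok = 0
    for users in range(step, limit + 1, step):
        if measure(users) > max_latency_ms:
            break  # first failing load level found
        last_ok = users
    return last_ok


# Assumed tolerance: responses must stay at or under 400 ms.
print(find_capacity_ceiling(simulated_latency_ms, max_latency_ms=400.0))
```

A smaller step gives a more precise ceiling at the cost of a longer test run, and the same loop can be pointed at individual components to locate specific bottlenecks.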

Underpinning this is the pressing need for monitoring. With disparate data and flashing alerts all flooding in at the same time, manual processing is inadequate and the right technology is crucial. By onboarding a proactive monitoring system that encompasses physical, cloud and third-party estates, firms can suppress the white noise and home in on what’s valuable in real time, helping them predict and mitigate IT failures before they occur.

 

Integrate security into operations

Rather than treating security as separate from operations, as was traditional, firms must begin to integrate it into their operations and operational mindset from the get-go. Everyone involved in production should be trained to understand the critical importance of cybersecurity, so that not a single person in the business lets in that Trojan horse. This is particularly important in a COVID-normal world where remote working is increasingly the modus operandi for many.

The new best practice approach involves Zero Trust Networks – challenging firms to provide proof for each transaction made, even inside their own data centre.

 

Nominate a Chief Resilience Officer

Finally, businesses that want to get on the front foot of new senior management requirements – namely SMF24 in the UK – should look to designate a senior leader to focus solely on operational resilience so that the C-suite’s slate is clean by the time they come under scrutiny. The fact that SMF24 will backdate past indiscretions makes it all the more important to get on top of this today.

 

 
