USING ROBUST MACHINE LEARNING MODELS TO COUNTER FRAUD IN A PANDEMIC

By: Andy Renshaw, SVP Product Management at Feedzai

Recently, a Bank of England report found that over a third of banks reported a negative impact on the performance of their machine learning models as a result of the pandemic. In essence, many models failed to perform because the economic downturn could not be forecast from economic data or historical predictors alone.

“We will continue to monitor these developments closely, along with other regulators like the Financial Conduct Authority, and take necessary steps to support the safe adoption of ML and DS in financial services,” the Bank stated. “As Covid has resulted in changes in model performance, more continuous monitoring and validation is required to mitigate this risk, compared to static validation and testing methods.”

This raises the question: if machine learning models are cracking under the sudden behavioural changes caused by the coronavirus pandemic, what can be done to ensure this doesn’t remain a limiting factor in the future?

First, let’s look at the characteristics of robust machine learning models:

Reliance on individual behaviours, not generalised cohorts. It’s good practice to teach the model to characterise fraud vs. non-fraud against each individual’s behaviour, rather than against groupings of behaviours.

Avoidance of ‘overfitting’. Pay attention to model degradation over time, and run simulations and production scenarios months into the future; this enables FIs to avoid overfitting and ensures the model is generalised enough to handle changes in behaviour and unseen patterns.

Inclusion of peak volumes. Training datasets should include peak volumes (e.g., Black Friday, promotions, etc.), so even the stockpiling of toilet paper or hand sanitiser can be read as “normal” by the model.

Usage of historical and real-time data. FIs should build and train models using several months of historical data. Once these models are deployed, update their features and profiles with real-time information.

Dynamic entity risk profiling. Where possible, avoid hard-coding risky data entities, such as merchant categories, emails, IPs, payment types, ATMs, locations, etc. Instead, compute those risk scores dynamically and in real time, which is useful in detecting shifts in fraud strategies. It’s also key to adjusting to new macro-trends in consumer behaviour, such as those witnessed during the pandemic (a minimal sketch of this idea follows this list).
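To make the last point concrete, here is a minimal sketch of dynamic entity risk profiling, assuming a simple exponential-decay fraud rate per entity (e.g., per merchant category code). The class name, half-life, and smoothing prior are illustrative assumptions, not a description of any particular vendor’s implementation:

```python
import time
from collections import defaultdict

HALF_LIFE_SECONDS = 7 * 24 * 3600  # assumed one-week half-life for recency

class DecayedRiskProfile:
    """Tracks a decayed fraud rate per entity instead of a hard-coded risk list."""

    def __init__(self):
        # Per-entity decayed counts of fraud and total transactions.
        self.stats = defaultdict(lambda: {"fraud": 0.0, "total": 0.0, "last_ts": None})

    def _decay(self, entry, now):
        # Shrink old counts so recent activity dominates the risk estimate.
        if entry["last_ts"] is not None:
            factor = 0.5 ** ((now - entry["last_ts"]) / HALF_LIFE_SECONDS)
            entry["fraud"] *= factor
            entry["total"] *= factor
        entry["last_ts"] = now

    def update(self, entity, is_fraud, now=None):
        now = time.time() if now is None else now
        entry = self.stats[entity]
        self._decay(entry, now)
        entry["total"] += 1.0
        entry["fraud"] += 1.0 if is_fraud else 0.0

    def risk(self, entity, now=None, prior=0.001, prior_weight=50.0):
        # Smoothed fraud rate: sparse entities fall back towards a global prior.
        now = time.time() if now is None else now
        entry = self.stats[entity]
        self._decay(entry, now)
        return (entry["fraud"] + prior * prior_weight) / (entry["total"] + prior_weight)

profile = DecayedRiskProfile()
profile.update("MCC:7995", is_fraud=True)  # e.g., a confirmed-fraud gambling transaction
print(profile.risk("MCC:7995"))            # risk rises immediately, with no code change
```

Because the risk score is recomputed on every lookup, a merchant category that suddenly attracts fraud becomes “risky” within hours, rather than waiting for a rules release.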

To truly understand how robust machine learning models work in practical terms, let’s examine a credit card use case.

To feed the model granular data from thousands of cards based on information from the payment gateway, it’s necessary to start with the raw data. This includes each card’s behaviour throughout the year: where it is used, at what time of day, what the transaction amounts are, what types of vendors it frequents, and so on. It can seem like an endless list of options, but doing this for every card with the help of automation helps machine learning models adjust accordingly (a small profiling sketch follows).
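As a small illustration of that profiling step, the sketch below aggregates raw transactions into per-card behavioural features with pandas. The column names (“card_id”, “timestamp”, “amount”, “mcc”) and the particular features chosen are assumptions for the example:

```python
import pandas as pd

def build_card_profiles(tx: pd.DataFrame) -> pd.DataFrame:
    """Collapse raw transactions into one behavioural profile row per card."""
    tx = tx.copy()
    tx["hour"] = tx["timestamp"].dt.hour  # when is this card normally used?
    profiles = tx.groupby("card_id").agg(
        n_tx=("amount", "size"),           # how often the card transacts
        avg_amount=("amount", "mean"),     # typical transaction size
        std_amount=("amount", "std"),      # how variable spending is
        median_hour=("hour", "median"),    # typical time of day
        top_mcc=("mcc", lambda s: s.mode().iloc[0]),  # most frequented vendor type
    )
    return profiles.fillna(0.0)
```

In production this kind of aggregation runs continuously and over many more dimensions, but the principle is the same: every card gets its own behavioural fingerprint.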

FIs could potentially have billions of transactions to process, and while technically the data from each could be used, the expense related to machine time and cluster usage makes that an inefficient option. Instead, most big data machine learning applications use a technique called data sampling. Data sampling, as its name implies, uses a subset of data points to identify patterns and trends in the larger data set. But sampling data is not without its challenges: it must be done in a way that maintains the integrity of the underlying data while allowing for the computation of the right features (one common approach is sketched below).
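One common approach in fraud detection, sketched below under the assumption of a pandas DataFrame with “timestamp” and “is_fraud” columns, is to keep every (rare) fraud transaction and downsample the legitimate class, stratified by month so that seasonal peaks stay represented. This is an illustrative technique, not the specific smart-sampling method referenced in the next paragraph:

```python
import pandas as pd

def stratified_downsample(df: pd.DataFrame, legit_rate: float = 0.05,
                          seed: int = 42) -> pd.DataFrame:
    """Keep all fraud rows; sample the legitimate class per calendar month."""
    fraud = df[df["is_fraud"] == 1]   # fraud is rare, so keep every example
    legit = df[df["is_fraud"] == 0]
    month = legit["timestamp"].dt.to_period("M")  # stratification key
    sampled_legit = (
        legit.groupby(month, group_keys=False)
             .apply(lambda g: g.sample(frac=legit_rate, random_state=seed))
    )
    return pd.concat([fraud, sampled_legit]).sort_values("timestamp")
```

Stratifying by time is what preserves the “integrity of the underlying data”: a naive uniform sample could easily under-represent Black Friday or a lockdown month and skew the features computed from it.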

Robust machine learning models will implement smart sampling techniques that satisfy these constraints and give a 3% to 6% lift in fraud detection performance when compared with traditional sampling approaches.

It almost goes without saying that detecting fraud means detecting changes in customer behaviour. Fortunately, criminal transactions look quite different from legitimate transactions. Criminals typically want to spend money quickly before customers or organisations report the card lost or stolen. This usually results in a short-term spike in spending.

Measuring behaviour therefore allows FIs to compute a specific card’s six-week or eight-week average spending and compare it with the average spending from the previous week, or even a single day. Digging deeper, they can also look at more complex measures, such as the probability of a specific person spending a certain amount of money during a particular time frame. In this way, they can differentiate between criminal and authentic behaviour (a minimal sketch of the comparison follows).
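Here is a minimal sketch of that baseline comparison, assuming per-transaction rows with “card_id”, “timestamp”, and “amount” columns; the six-week window and the ratio-based spike signal are illustrative choices:

```python
import pandas as pd

def spend_spike(tx: pd.DataFrame, now: pd.Timestamp) -> pd.DataFrame:
    """Compare each card's last-week spend to its own six-week weekly average."""
    six_weeks = tx[tx["timestamp"] >= now - pd.Timedelta(weeks=6)]
    last_week = tx[tx["timestamp"] >= now - pd.Timedelta(weeks=1)]

    baseline = six_weeks.groupby("card_id")["amount"].sum() / 6.0  # avg weekly spend
    recent = last_week.groupby("card_id")["amount"].sum()

    out = pd.DataFrame({"baseline_weekly": baseline, "last_week": recent}).fillna(0.0)
    # A ratio well above 1 flags a short-term spike against the card's own history.
    out["spike_ratio"] = out["last_week"] / out["baseline_weekly"].clip(lower=1.0)
    return out
```

The key design choice is that each card is compared against its own history, not against a population average, which is what lets the model tolerate two customers with very different “normal” spending levels.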

Every single transaction has a fraud/non-fraud label associated with it. Labels allow the model to develop an intimate understanding of fraud and its early indicators. In that way, when a card shows a change in behaviour (e.g., an increase in spending), the model won’t necessarily raise an alert. It knows that particular card from the previous data inputs used to train the model, and it understands what fraud looks like in the context of each specific behaviour.

The model also learns base risk factors. These are risk behaviours independent of individual consumer behaviour because they are common fraud vectors, such as high-velocity spending, transactions in risky merchant types (e.g., gambling websites), or late-night ATM withdrawals.

It’s the combination of individual behaviour and base risk factors that allows the model to build a precise assessment of the true risk of a particular transaction. It takes everything into account, along with the context in which the transaction occurs: spending habits, recent activity, the merchant involved (sometimes even the item purchased), and even the hour at which the transaction occurs. The machine learning model doesn’t just consider one factor; it looks at hundreds of elements simultaneously to determine whether a transaction is suspicious (a simplified training sketch follows).
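As a simplified illustration of how those labelled behavioural and base-risk features come together, the sketch below trains a standard classifier with scikit-learn. The four feature names and the model choice are assumptions for the example; real systems use hundreds of features and purpose-built tooling:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Illustrative feature set mixing individual behaviour with base risk factors.
FEATURES = [
    "spike_ratio",     # individual behaviour: spend vs the card's own baseline
    "merchant_risk",   # dynamic entity risk, as sketched earlier
    "tx_per_hour",     # velocity: recent transaction rate for this card
    "hour_of_day",     # context: late-night activity is a base risk factor
]

def train_fraud_model(df: pd.DataFrame) -> GradientBoostingClassifier:
    # Every row carries a fraud/non-fraud label, so the model learns what
    # fraud looks like in the context of each behaviour.
    model = GradientBoostingClassifier()
    model.fit(df[FEATURES], df["is_fraud"])
    return model

def fraud_score(model: GradientBoostingClassifier, df: pd.DataFrame) -> pd.Series:
    # Probability of fraud, combining all factors simultaneously.
    return pd.Series(model.predict_proba(df[FEATURES])[:, 1], index=df.index)
```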

Putting it all together

For example, Allan and Bob are both thirty-five-year-old single men living in London, UK. Allan typically spends £1,000/week, and Bob typically spends £500/week. The model does not trigger an alert when Allan spends his £1,000. However, when Bob spends £1,000, the model triggers multiple alerts. Still, merely spending an additional £500 a week is not the definitive factor (maybe his dishwasher broke down). It is the combination of multiple factors that raises the alarm. These could include: higher spending (a change in Bob’s normal behaviour); multiple quick, similar-amount transactions (a common fraud indicator); spending with a merchant known for high fraud rates (a higher likelihood that the transaction is fraudulent); and shopping during hours when Bob is usually asleep (another change in Bob’s normal behaviour). A toy numeric illustration follows.
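The toy scoring function below illustrates that stacking of weak signals; every threshold and weight is made up for the example, and a real model would learn them from labelled data rather than hard-coding them:

```python
def alert_score(spike_ratio: float, rapid_similar_txs: bool,
                merchant_risk: float, odd_hours: bool) -> float:
    """Toy illustration: no single factor is decisive; alerts need several signals."""
    score = 0.0
    score += 0.3 if spike_ratio > 1.5 else 0.0     # spend well above own baseline
    score += 0.3 if rapid_similar_txs else 0.0     # common fraud indicator
    score += 0.2 if merchant_risk > 0.05 else 0.0  # high-fraud merchant
    score += 0.2 if odd_hours else 0.0             # outside the customer's usual hours
    return score

# Allan: £1,000 is his normal weekly spend, so spike_ratio ~1.0 and no other flags.
print(alert_score(1.0, False, 0.01, False))  # 0.0 -> no alert
# Bob: £1,000 against a £500 baseline, plus three corroborating signals.
print(alert_score(2.0, True, 0.08, True))    # 1.0 -> multiple alerts
```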

At the start of the pandemic, there was a drastic decrease in the number of transactions, which likely threw some machine learning models off their baseline “norms” because short-term behavioural shifts had not been accounted for. Models built with robust characteristics, relying on individual behaviour, avoiding overfitting, including peak volumes, and using historical and real-time data to update risk profiles dynamically, should hold up well to these social and economic changes. As global economies slowly come out of lockdown, it’s likely to mean a gradual increase in transactions. From a machine learning model’s point of view, this is the perfect scenario: the models can easily adjust to these incremental changes. A drastic uptick in transactions could affect the models, but even so, that impact would be automatically contained and quickly mitigated, all thanks to built-in model robustness.
