Why generic AI is losing the AML battle

By Jane Smith, Field Chief Data & AI Officer, ThoughtSpot

The UK government has recently published its new Fraud Strategy as financial crime cases hit record levels. Where are financial institutions currently falling short on detection?

The core issue is a context gap between the AI tools being deployed and the complexity of the threat. Businesses across sectors have continued to invest heavily in AI, but many are still struggling to turn that investment into reliable, high-impact decisions. In financial services, that gap has real consequences.

Generic AI can process volume, but financial crime detection requires something more specific: a system that understands the language of the industry. That means the particular regulatory frameworks, the behavioural patterns that define suspicious activity, and the KPIs that matter in a financial services context. Without that domain grounding, institutions risk generating poor insights at best, and significant miscalculations at worst. The government’s strategy is a welcome framework, but the institutions best positioned to act on it will be those that have already closed the context gap in their analytical infrastructure.

What specifically does “domain knowledge” mean in an AML context, and why does its absence matter?

Money laundering is deliberately designed to look ordinary. Sophisticated schemes layer transactions to obscure origins, exploit product lines with lower scrutiny thresholds, and move quickly across jurisdictions to stay ahead of manual review cycles. Identifying that activity requires an analytical system that understands not just what happened, but what it means in context.

A generic model might flag a breach in transaction volume. But it may not understand the customer risk profile that should contextualise that breach, the specific regulatory obligations tied to that product category, or the known typologies that are currently in focus for regulators.
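
As a rough illustration of the difference, the sketch below compares a context-free volume rule with one that reads the customer’s risk profile. Every name, field, and threshold in it (CustomerProfile, ESCALATION_MULTIPLIER, the 50,000 limit) is a hypothetical assumption made for the example, not a regulatory value or any vendor’s implementation.

```python
from dataclasses import dataclass

# All names, fields, and thresholds below are hypothetical, for illustration only.


@dataclass(frozen=True)
class CustomerProfile:
    customer_id: str
    risk_rating: str  # "low", "medium", or "high"
    expected_monthly_volume: float


# Illustrative headroom above expected volume before an alert is raised.
# Higher-risk customers get less headroom.
ESCALATION_MULTIPLIER = {"low": 3.0, "medium": 2.0, "high": 1.2}


def generic_flag(monthly_volume: float, global_threshold: float = 50_000.0) -> bool:
    """Context-free rule: flag any volume above one global threshold."""
    return monthly_volume > global_threshold


def contextual_flag(profile: CustomerProfile, monthly_volume: float) -> bool:
    """Context-aware rule: compare volume with what is expected for this customer."""
    limit = profile.expected_monthly_volume * ESCALATION_MULTIPLIER[profile.risk_rating]
    return monthly_volume > limit


corporate = CustomerProfile("C-001", "low", expected_monthly_volume=400_000.0)
student = CustomerProfile("C-002", "high", expected_monthly_volume=2_000.0)

# Routine corporate flow: the generic rule raises noise, the contextual rule does not.
print(generic_flag(450_000.0), contextual_flag(corporate, 450_000.0))  # True False
# Unusual activity on a small, high-risk account: only the contextual rule sees it.
print(generic_flag(30_000.0), contextual_flag(student, 30_000.0))  # False True
```

In this toy case the generic rule produces one false alarm and one miss, while the rule that reads the profile gets both right.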

This is not a technical failure, but it is a classic case of model failure. Without real-time contextual grounding, a compliance team is just chasing noise.

This highlights the importance of both strong discipline in semantic layer development and the ongoing management of the contextual data encoded in it. A semantic layer is effectively the Rosetta Stone of a business. It should include the logic, rules and specifics of your business and the wider financial sector. Part of this is encoding the expert knowledge of a company’s AML and data specialists into a form that agentic models can both access and understand.

Without this, there is a gap in what a generic agent can detect. It won’t reflect the knowledge of an experienced financial crime analyst, which reduces its ability to pinpoint where sophisticated laundering schemes reside.

Fragmented data is frequently cited as a barrier to effective detection. Is that framing too simple?

It’s both accurate and slightly simplistic, because technical fragmentation usually reflects a more fundamental fragmentation within the semantic layer. At a global bank, for example, the term ‘customer risk’ could mean one thing in London and something else in Munich. If the underlying logic applied to those separate definitions is fragmented, the same customer can produce different results depending on where the question is asked. Each of these avoidable variants adds noise to a bank’s AML efforts, making it harder to spot the genuinely fraudulent transactions.
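
The London and Munich example can be sketched in a few lines. The two local functions, the governed function, and the placeholder country code "XX" are all hypothetical. They stand in for whatever rules a real institution applies, and only show how one shared definition removes the variant result.

```python
# Hypothetical sketch: two local definitions of "customer risk" versus one governed one.

HIGH_RISK_COUNTRIES = frozenset({"XX"})  # placeholder code, not a real list


def london_customer_risk(customer: dict) -> str:
    # Local logic (assumed): only politically exposed persons are high risk.
    return "high" if customer["is_pep"] else "low"


def munich_customer_risk(customer: dict) -> str:
    # Local logic (assumed): only high-risk jurisdictions are high risk.
    return "high" if customer["country"] in HIGH_RISK_COUNTRIES else "low"


def governed_customer_risk(customer: dict) -> str:
    """Single definition held in the semantic layer and used by every office."""
    if customer["is_pep"] or customer["country"] in HIGH_RISK_COUNTRIES:
        return "high"
    return "low"


customer = {"id": "C-100", "is_pep": True, "country": "GB"}

# Fragmented logic: the same customer gets two different answers.
print(london_customer_risk(customer), munich_customer_risk(customer))  # high low
# Governed logic: one answer, wherever the question is asked.
print(governed_customer_risk(customer))  # high
```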

That means the institutions making genuine progress aren’t just investing in better connectivity between systems. They’re building analytical architectures with strong semantic definitions, where data from across the organisation, including structured transaction data, unstructured communications, and external risk signals, feeds into a single governed layer that produces consistent, verifiable outputs.
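
One way to picture "consistent, verifiable outputs" is an aggregate that records which source rows produced it, plus a fingerprint that stays the same whenever the inputs are the same. The sketch below is a minimal illustration under that assumption. The function names and record fields are invented for the example.

```python
import hashlib
import json


def total_by_customer(transactions: list[dict]) -> dict:
    """Sum amounts per customer and record which transactions fed each total."""
    insights: dict = {}
    # Process rows in a fixed order so the output does not depend on arrival order.
    for tx in sorted(transactions, key=lambda t: t["tx_id"]):
        entry = insights.setdefault(
            tx["customer_id"], {"total_pence": 0, "source_tx_ids": []}
        )
        entry["total_pence"] += tx["amount_pence"]  # integer pence avoids float drift
        entry["source_tx_ids"].append(tx["tx_id"])
    return insights


def fingerprint(insights: dict) -> str:
    """Hash a canonical JSON form so identical results give identical fingerprints."""
    canonical = json.dumps(insights, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


transactions = [
    {"tx_id": "T-1", "customer_id": "C-1", "amount_pence": 120_000},
    {"tx_id": "T-2", "customer_id": "C-2", "amount_pence": 5_000},
    {"tx_id": "T-3", "customer_id": "C-1", "amount_pence": 80_000},
]

first = total_by_customer(transactions)
second = total_by_customer(list(reversed(transactions)))

print(first["C-1"])  # {'total_pence': 200000, 'source_tx_ids': ['T-1', 'T-3']}
print(fingerprint(first) == fingerprint(second))  # True
```

Each total can be traced back to named transactions, and two runs over the same data can be compared by fingerprint instead of by eye.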

What should effective AI-driven financial crime detection actually look like in 2026?

The baseline requirement is determinism: the ability to produce consistent, repeatable results from the same inputs, and to trace every generated insight back to its source. That might sound straightforward, but it rules out a significant proportion of the AI tooling currently being evaluated in financial services. And a key reason for this is the probabilistic nature of many LLMs.

Most tools translate text directly into SQL, a process which is both probabilistic and hard to unpick. This presents two problems. First, it allows inconsistent results when similar questions are asked differently. Your CISO in London may phrase a question differently from an AML expert in Prague, but both need to receive the same information and insights. Otherwise your AI infrastructure is sending your teams down different paths.

Second, freeform SQL is hard to unpick, which adds to the opacity of results that we are trying to move away from as a sector. One solution is a more tokenized approach to query generation, in which prompts are mapped to specific search tokens instead of a freeform, probabilistic SQL script. This produces more deterministic and consistent answers, so global teams can move faster on critical insights because they are working from the same intelligence.
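
A minimal sketch of the idea follows. It is not any product’s implementation: the vocabulary, token names, table, and column names are all hypothetical, and simple phrase matching stands in for whatever language understanding a real tool uses. The point is only that once a question has been reduced to governed tokens, the SQL comes from a fixed lookup, not from free generation.

```python
import re

# Hypothetical governed vocabulary: known phrases map to search tokens.
PHRASE_TO_TOKEN = {
    "flagged transactions": "measure:flagged_tx_count",
    "alerts": "measure:flagged_tx_count",
    "by country": "group:country",
    "per country": "group:country",
    "last month": "filter:last_month",
    "previous month": "filter:last_month",
}

# Each token has exactly one SQL fragment (table and column names are assumed).
TOKEN_TO_SQL = {
    "measure:flagged_tx_count": "COUNT(*) AS flagged_tx_count",
    "group:country": "country",
    "filter:last_month": "booking_month = :last_month",
}


def to_tokens(question: str) -> tuple[str, ...]:
    """Reduce a question to a sorted, de-duplicated tuple of governed tokens."""
    text = re.sub(r"\s+", " ", question.lower())
    found = {token for phrase, token in PHRASE_TO_TOKEN.items() if phrase in text}
    return tuple(sorted(found))


def to_sql(tokens: tuple[str, ...]) -> str:
    """Assemble SQL from tokens with a fixed template, so equal tokens give equal SQL."""
    measures = [TOKEN_TO_SQL[t] for t in tokens if t.startswith("measure:")]
    groups = [TOKEN_TO_SQL[t] for t in tokens if t.startswith("group:")]
    filters = [TOKEN_TO_SQL[t] for t in tokens if t.startswith("filter:")]
    if not measures:
        raise ValueError("question did not map to any known measure")
    sql = f"SELECT {', '.join(groups + measures)} FROM flagged_transactions"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    if groups:
        sql += " GROUP BY " + ", ".join(groups)
    return sql


london = to_tokens("Show me flagged transactions by country for last month")
prague = to_tokens("How many alerts per country in the previous month?")

print(london == prague)  # True
print(to_sql(london))
# SELECT country, COUNT(*) AS flagged_tx_count FROM flagged_transactions WHERE booking_month = :last_month GROUP BY country
```

Two differently worded questions resolve to the same tokens and therefore to the same query, and each token can be inspected to see exactly which definition it applied.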

Beyond that, the distinction between industry-specific and generic AI will increasingly define competitive and compliance positioning. The organisations getting ahead of financial crime are not just deploying more AI, but are deploying AI that is genuinely literate in the domain. That means understanding the specific KPIs, regulatory standards, and behavioural patterns of financial services, not just processing high volumes of transactions quickly.

The government’s Fraud Strategy is a useful framework. But the institutions that will actually move the needle on financial crime are the ones building the analytical infrastructure to act on it: deterministic, traceable, and domain-specific enough to surface what generic tools consistently miss.