
Addressing the generalist limitations of LLMs


Authored by Justin Hwang, COO at RNA Analytics

Artificial intelligence is becoming increasingly embedded across a wide range of industries, revolutionising organisational systems in the process. For any tool to be truly transformational, however, it must be designed specifically for the task at hand. The deployment of LLMs in the actuarial context is no exception.

The advancement of artificial intelligence tools represents a notable shift in the way business and industry approach complex challenges. From early rule-based systems to today’s highly sophisticated Large Language Models (LLMs) such as GPT and Gemini, AI has been propelled by technological and data advances.

Large language models are arguably the standout innovation: deep neural network models specifically designed to understand and generate human language by learning from enormous volumes of text data, ranging from billions to trillions of words. With their exceptional natural language processing capabilities, LLMs are widely used in document analysis and customer service, with unmatched flexibility and scalability.

At the same time, LLMs have their limitations, and these pose serious risks when the models are applied to actuarial science, where accuracy and reliability are paramount. The propensity of LLMs to hallucinate; a lack of in-depth domain expertise and advanced reasoning capabilities; and an inability to work with complex Excel spreadsheets make generalist LLMs a source of risk rather than opportunity for actuaries. Addressing these limitations is fundamental to the successful deployment of LLMs in the actuarial setting.

Building domain- or organisation-specific expertise is key to addressing the risks associated with overly broad public data sources, such as Google or Wikipedia, which often fail to fully grasp the complex contract structures of insurance products or detailed regulatory frameworks like IFRS 17 and capital requirements; this is a significant barrier to practical application in the field. Enhancing actuarial expertise through additional model training, or fine-tuning with domain-specific data, terminology and task patterns, transforms a general-purpose AI into an ‘in-house expert’, tailored to understand and perform tasks within a specialised professional context.

Fine-tuning becomes both a way of enhancing performance and a strategy for ensuring trust, providing the foundation for consistent judgments and explainable results in the insurance industry. Fine-tuning involves four steps: collecting domain-specific documents, such as manuals; creating instruction datasets with question-answer pairs; retraining the base model by updating its parameters; and validating performance against real business questions. This approach works best for highly specialised domains where prompt-based methods fall short, when consistent standards are needed for repeated queries, or when models must operate independently in isolated, on-premise environments without API access.
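By way of illustration, the instruction dataset in the second step can be as simple as a file of question-and-answer pairs in JSONL format. The short Python sketch below is a hypothetical example only: the field names and sample records are illustrative assumptions, not a prescribed schema or actual training data.

```python
import json

# Hypothetical instruction records pairing actuarial questions with reviewed
# answers; in practice these would be drawn from internal manuals,
# methodology papers and past analyses.
instruction_pairs = [
    {
        "instruction": "Explain what the contractual service margin represents under IFRS 17.",
        "response": "The contractual service margin is the unearned profit on a group of "
                    "insurance contracts, recognised in profit or loss as services are provided.",
    },
    {
        "instruction": "What is the risk adjustment for non-financial risk?",
        "response": "The compensation the entity requires for bearing uncertainty in the amount "
                    "and timing of cash flows arising from non-financial risk.",
    },
]

# Write one JSON object per line (JSONL), the format most supervised
# fine-tuning pipelines accept for instruction tuning.
with open("actuarial_instructions.jsonl", "w", encoding="utf-8") as f:
    for pair in instruction_pairs:
        f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```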

Another way of overcoming the limitations of traditional language models is to improve accuracy by integrating external knowledge sources through Retrieval-Augmented Generation (RAG). RAG enables an LLM to retrieve information from external knowledge sources in real time (retrieval), supplement its response with that information (augmentation), and then generate the final answer (generation). Unlike conventional LLMs that rely solely on pre-trained parameters, RAG significantly improves accuracy and reliability by referencing trusted external data, making it a practical and powerful framework for structurally controlling hallucinations. Because it goes beyond merely compensating for the weaknesses of LLMs to build trustworthy AI response systems, RAG is particularly well suited to the high-accuracy domain of actuarial work.
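To make the retrieve-augment-generate loop concrete, the minimal Python sketch below retrieves the most relevant passage from a small in-memory knowledge base and builds an augmented prompt. The sample passages, the naive keyword-overlap scoring and the absence of a real model call are all simplifying assumptions; a production system would use embeddings, a vector store and a governed document corpus.

```python
def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Retrieval step: rank documents by keyword overlap with the query."""
    query_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def build_augmented_prompt(query: str, context: list[str]) -> str:
    """Augmentation step: ground the model's answer in the retrieved text."""
    sources = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the sources below. "
        "If the sources are insufficient, say so.\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


knowledge_base = [
    "IFRS 17 requires insurance contracts to be measured using current fulfilment cash flows.",
    "The contractual service margin represents unearned profit recognised over the coverage period.",
]

question = "How is unearned profit treated under IFRS 17?"
context = retrieve(question, knowledge_base)
prompt = build_augmented_prompt(question, context)
print(prompt)  # The augmented prompt would then be sent to the LLM (generation step).
```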

The decisive factor for the success of any AI project is arguably the dataset: the quality of data directly impacts the performance of AI. While some insurance entities have invested large sums in AI systems, many have failed to achieve the expected results because of document format. Many documents containing insurance data are written in formats that AI struggles to read. This is not just about typos or errors; often, documents are not structured in a way that AI can systematically understand.

Addressing this requires a shift away from PDF-centric document creation. PDFs were designed primarily for printing; while they look good to the human eye, their structure is ambiguous to a machine. Recent attempts have been made to analyse PDFs with AI technologies such as Optical Character Recognition (OCR) and Vision Transformers, but these methods are limited in accuracy and require significant time and cost for pre- and post-processing. By contrast, .docx, .tex, .html and .md (Markdown) are text-based, globally recognised formats that AI can parse accurately. Notably, Microsoft’s open-source project for converting documents to Markdown has active participation from developers worldwide, making Markdown a stable and reliable format for numerous insurance business documents.

Keeping abreast of the AI transformation therefore demands that companies either develop dedicated tools to convert documents into AI-readable formats, or accelerate company-wide efforts to transition to standardised document formats.
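As a sketch of what such a conversion tool might look like, the snippet below uses Microsoft’s open-source MarkItDown library, on the assumption that this is the project referred to above, to turn an Office document into Markdown. The file names are placeholders and the exact API may vary between versions.

```python
from markitdown import MarkItDown  # pip install markitdown

# Convert a print-oriented document into AI-readable Markdown.
# "policy_terms.docx" is a placeholder file name.
converter = MarkItDown()
result = converter.convert("policy_terms.docx")

with open("policy_terms.md", "w", encoding="utf-8") as out:
    out.write(result.text_content)  # Markdown text preserving headings, lists and tables
```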

Further, complex actuarial formulas are often inserted as images, which AI cannot read effectively. While OCR provides some recognition, accuracy is low and costs are high. Instead, formulas should be written in TeX syntax. KaTeX is recommended for its quick browser rendering and ease of learning, enabling organisation-wide adoption.
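As a generic textbook illustration, a standard actuarial quantity such as the expected present value of a whole-life annuity-due takes only a line of TeX that both KaTeX and an LLM can parse unambiguously:

```latex
% Expected present value of a whole-life annuity-due for a life aged x,
% where v is the annual discount factor and {}_k p_x the k-year survival probability.
\ddot{a}_x = \sum_{k=0}^{\infty} v^{k} \, {}_{k}p_{x}
```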

The next piece of the dataset puzzle is the need to avoid arranging an entire document in table format, which from an AI’s perspective makes a document appear almost encrypted – obscuring the semantic structure. This is where semantic formatting features like ‘heading styles,’ ‘paragraphs,’ and ‘lists’ come to the fore. Companies in the insurance ecosystem possess vast amounts of data, but if these data do not exist in a form that AI can read and write, their value cannot be realised.
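To see why heading styles matter, the short sketch below, which assumes the python-docx library and a placeholder file name, recovers a document’s outline purely from its built-in ‘Heading’ styles; the same content flattened into one large table would expose no such structure to a machine.

```python
from docx import Document  # pip install python-docx

# "valuation_report.docx" is a placeholder; any .docx that uses the built-in
# heading styles exposes its outline to a machine this way.
doc = Document("valuation_report.docx")

for paragraph in doc.paragraphs:
    style = paragraph.style.name
    if style.startswith("Heading") and paragraph.text.strip():
        level = int(style.split()[-1])  # e.g. "Heading 2" -> 2
        print(f"{'  ' * (level - 1)}- {paragraph.text}")
```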

Finally, ontology-based databases are crucial to helping AI understand meaning in the actuarial context, not only enhancing data interoperability and structural understanding, but also enabling more precise and faster decision-making.
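As a minimal illustration of what ontology-based storage means in practice, the Python sketch below uses the rdflib library to record a few facts about a hypothetical product as subject-predicate-object triples. The namespace, class and property names are illustrative assumptions, not an established insurance ontology.

```python
from rdflib import Graph, Literal, Namespace, RDF  # pip install rdflib

# Hypothetical namespace for an in-house actuarial ontology.
ACT = Namespace("http://example.com/actuarial#")

g = Graph()
g.bind("act", ACT)

# Facts expressed as subject-predicate-object triples, so that "WholeLifePlanA"
# is machine-readably linked to its product class and reporting basis.
g.add((ACT.WholeLifePlanA, RDF.type, ACT.WholeLifeProduct))
g.add((ACT.WholeLifePlanA, ACT.reportingBasis, Literal("IFRS 17")))
g.add((ACT.WholeLifePlanA, ACT.valuationInterestRate, Literal(0.03)))

print(g.serialize(format="turtle"))
```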

The adoption of AI in actuarial work has moved beyond the experimental phase, and is now evolving towards genuine operational realisation for improved automation and enhanced accuracy. Avoiding a ‘garbage in, garbage out’ scenario requires certainty from the outset when it comes to knowledge sources, structures and system design – a task that demands collaboration among actuaries, data scientists and AI engineers, as they work together on a phased strategy for building future-proof tools based on trusted, accessible data and ontologies. This transformation represents more than technological advancement; it signals the emergence of a new actuarial paradigm where human expertise is amplified by AI precision. Actuaries and insurers that master this integration will not only enhance their operational efficiency but will also unlock previously unimaginable possibilities for risk assessment, product innovation and regulatory compliance.
