Compliance Testing – Fairness Assessment using R

May 16, 2026

Retrieval-Augmented Generation (RAG) combined with machine learning (ML) can support proactive risk identification: predictive analysis helps flag potential issues with unbalanced customer selection.

Open code in R on GitHub

The problem here is to test whether a VC firm’s data-driven process for discovering and selecting startup dossiers is unbiased and compliant with a declared principle of “fairness”. Apart from the usual financial assessment, the data-driven selection relies on the descriptions provided, such as the value proposition, the customers’ pain points and a list of top benefits for customers.

We propose to replace the traditional, cumbersome manual process of sourcing and screening startup companies with a Machine Learning (ML) process built on semantic tagging from the LSEG-PermID (Open Calais) service. Our open code on GitHub offers a step-by-step implementation in R of the internal rating models approach presented in:

How it works in practice:

Characterize each startup’s activity using a Natural Language Processing (NLP) tagging system.
Group the startups by activity using a K-means clustering algorithm.
Test whether the selection or dismissal of their dossiers is a “fair” process.
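
The three steps above can be sketched in R as follows. This is a minimal illustration under stated assumptions, not the code from the GitHub repository. The tag matrix, the tag names and the `selected` outcome vector are hypothetical stand-ins for the output of the tagging service and the firm's screening decisions. Reading “fair” as statistical independence between activity cluster and selection outcome is also an assumption made only for this example.

```r
# Hypothetical data: one row per startup, one column per semantic tag
# (1 = tag assigned to the startup's description, 0 = not assigned).
# In practice this matrix would be built from the NLP tagging output.
set.seed(42)  # k-means uses random starts; fix the seed for reproducibility

tags <- matrix(
  c(1, 1, 0, 0,
    1, 1, 0, 0,
    1, 0, 0, 0,
    0, 0, 1, 1,
    0, 0, 1, 1,
    0, 0, 0, 1,
    1, 1, 0, 0,
    0, 0, 1, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("startup_", 1:8),
                  c("fintech", "payments", "health", "biotech"))
)

# Hypothetical screening outcome for the same eight dossiers.
selected <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)

# Group startups by activity with k-means (stats::kmeans).
# The number of clusters (2) is an assumption chosen for this toy data.
km <- kmeans(tags, centers = 2, nstart = 10)

# Cross-tabulate cluster membership against the selection outcome
# and compute the share of selected dossiers in each cluster.
outcome <- table(cluster = km$cluster, selected = selected)
selection_rate <- prop.table(outcome, margin = 1)[, "TRUE"]

# Fisher's exact test checks whether selection is independent of cluster.
# It is used here because the counts are small; a small p-value would
# suggest that some activity clusters are selected disproportionately.
fairness_test <- fisher.test(outcome)

print(outcome)
print(selection_rate)
print(fairness_test$p.value)
```

With only eight dossiers the test has very little power, so the sketch shows the mechanics rather than a realistic verdict. On real data, the number of clusters and the choice of fairness criterion would both need to be justified.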
