Compliance Testing - Fairness Assessment using R
- Preprocessing - Semantic tagging using the LSEG-PermID (Open Calais) service
- Step 1 - Load data and libraries
- Step 2 - Perform Principal Component Analysis (PCA) and evaluate clustering potential
- Step 3 - Perform K-means on a factor scores sub-space and evaluate performance with a minimum number of clusters
- Step 4 - Display statistics for the retained clusters
- Step 5 - Evaluate fairness of Basing Hall-BIC selection process
projdir <- "C:/RExercise/"
setwd(projdir)
The data table OCC_wStatus-IDSorted.csv should be downloaded into your R project directory from ( https://github.com/MoiraCorp/Compliance-Testing-Fairness-Assessment-using-R/blob/main/permid-preprocess/OCC_wStatus-IDSorted.csv )
OCC_wStatus <- read.table("OCC_wStatus-IDSorted.csv", header=TRUE, sep=",")
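A quick check around the load step helps catch a wrong working directory or an incomplete download early. The sketch below is one possible way to do this; the helper name `read_checked` and the demo file with its two columns are hypothetical and are not part of the exercise data.

```r
# Hypothetical helper: stop with a readable message if the file is missing,
# otherwise read it with the same read.table() arguments as above.
read_checked <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory: ", getwd(), ")")
  }
  read.table(path, header = TRUE, sep = ",")
}

# Demo on a throwaway CSV with made-up columns, so the sketch runs on its own.
demo_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, status = c("a", "b", "a")),
          demo_path, row.names = FALSE)

demo <- read_checked(demo_path)
dim(demo)   # number of rows and columns
str(demo)   # column names and types
```

In the exercise itself, the equivalent call would be `OCC_wStatus <- read_checked("OCC_wStatus-IDSorted.csv")`, followed by `str(OCC_wStatus)` to inspect the columns.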
IMPORTANT Note: File paths in R use the forward slash "/" as the separator, as on Linux,
so each "\" character in a Windows file path needs to be changed to "/" (or escaped as "\\")
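As a small illustration of the note above, the following sketch converts a Windows-style path to the forward-slash form. The path and variable names are made up for the example.

```r
# A Windows-style path; each backslash must be doubled inside an R string literal.
win_path <- "C:\\RExercise\\data"

# Replace every backslash with a forward slash (fixed = TRUE: no regex involved).
r_path <- gsub("\\", "/", win_path, fixed = TRUE)
print(r_path)

# file.path() joins path components with "/" on every platform.
data_file <- file.path("C:/RExercise", "OCC_wStatus-IDSorted.csv")
print(data_file)
```

Building paths with `file.path()` avoids typing separators by hand.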
This training exercise uses several specialized R libraries:
- ggplot2
  on CRAN: ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics (https://cran.r-project.org/web/packages/ggplot2/index.html)
  see also: ggplot2 R Documentation (https://www.rdocumentation.org/packages/ggplot2/versions/3.5.0)
  with examples in: Data visualization (https://r4ds.hadley.nz/data-visualize)
- plotly
  on CRAN: plotly: Create Interactive Web Graphics via 'plotly.js' (https://cran.r-project.org/web/packages/plotly/index.html)
  see also: Interactive web-based data visualization with R, plotly, and shiny (https://plotly-r.com/)
- FactoMineR
  on CRAN: FactoMineR: Multivariate Exploratory Data Analysis and Data Mining (https://cran.r-project.org/web/packages/FactoMineR/index.html)
- factoextra
  on CRAN: factoextra: Extract and Visualize the Results of Multivariate Data Analyses (https://cran.r-project.org/web/packages/factoextra/index.html)
IMPORTANT Note: Before calling the "library" function in R, check that the corresponding packages have been installed from CRAN
See the R or RStudio documentation for how to install CRAN packages
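One common way to perform that check from within R is sketched below. It uses base R only. The variable names are illustrative, and restricting the install step to interactive sessions is a cautious assumption of this sketch, so that a non-interactive script never triggers a download.

```r
pkgs <- c("ggplot2", "plotly", "FactoMineR", "factoextra")

# requireNamespace() returns FALSE (instead of raising an error)
# when a package is not installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
to_install <- pkgs[!installed]

if (length(to_install) == 0) {
  message("All required packages are installed.")
} else if (interactive()) {
  # Installs from the CRAN mirror configured for the session.
  install.packages(to_install)
} else {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```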
library(ggplot2)
library(plotly)
library(FactoMineR)
library(factoextra)