Financial Crime World

Are Three or Four Classes Better Than Two?

A recent study examined whether classifying non-reported transactions into three or four categories is more effective than combining them into a single class. The research team used machine learning algorithms, including XGBoost, to compare the predictive performance of the different classification models.

Experiment Design

In their experiment, the researchers merged the various types of non-reported transactions into a single class to form a two-class baseline model, and then compared that baseline with two multiclass models, one with three classes and one with four. The three-class model kept (A) and (D) as separate categories and combined (B) and (C) into one group. The four-class model kept (A), (B), (C), and (D) as distinct classes.
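
The three labeling schemes can be sketched as simple lookup tables. This is only an illustration, not the researchers' code: the article does not say which letter denotes reported transactions, so the sketch assumes that (D) is the reported class, and the names `TWO_CLASS`, `THREE_CLASS`, `FOUR_CLASS`, and `relabel` are hypothetical.

```python
# Hypothetical label schemes for the three models described above.
# ASSUMPTION: (D) is the reported class; the article does not state this.
FOUR_CLASS = {"A": "A", "B": "B", "C": "C", "D": "D"}
THREE_CLASS = {"A": "A", "B": "BC", "C": "BC", "D": "D"}
TWO_CLASS = {
    "A": "non-reported",
    "B": "non-reported",
    "C": "non-reported",
    "D": "reported",
}


def relabel(labels, scheme):
    """Map raw class letters to the training labels of one scheme."""
    return [scheme[label] for label in labels]


raw = ["A", "B", "C", "D", "B"]
for name, scheme in [("two", TWO_CLASS), ("three", THREE_CLASS), ("four", FOUR_CLASS)]:
    print(name, relabel(raw, scheme))
```

Only the training labels differ between the models. The same transactions and features can be reused, which is what makes the comparison between the schemes meaningful.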

Evaluation Criteria

The team evaluated the performance of each model using three different criteria:

  • Brier Score: measures the mean squared error between predicted probabilities and true responses
  • Area Under the Receiver Operating Characteristic Curve (AUC): assesses the ranking of predictions
  • Proportion of Positive Predictions (PPP): calculates the proportion of positive predictions at a specific true positive rate
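
To make the three criteria concrete, here is a minimal pure-Python sketch of how each could be computed from binary outcomes (1 = reported, 0 = non-reported) and predicted probabilities. The function names and the toy data are assumptions for illustration. The PPP function in particular reflects one plausible reading of the definition above (the share of all transactions that must be flagged to reach a target true positive rate) and may differ in detail from the study's implementation.

```python
import math


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random positive is ranked above a random negative.

    Ties count as half a win. Pairwise version, fine for small samples.
    """
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def ppp_at_tpr(y_true, p_pred, target_tpr):
    """Share of all cases flagged when the threshold is set so that at
    least `target_tpr` of the true positives are caught (lower is better)."""
    pos = sorted((p for y, p in zip(y_true, p_pred) if y == 1), reverse=True)
    k = max(1, math.ceil(target_tpr * len(pos)))  # positives that must be caught
    threshold = pos[k - 1]
    return sum(p >= threshold for p in p_pred) / len(p_pred)


# Toy data, invented purely for illustration.
y = [1, 0, 1, 0, 0, 1]
p = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
print(brier_score(y, p), auc(y, p), ppp_at_tpr(y, p, 1.0))
```

For a multiclass model, `p_pred` would be the predicted probability of the reported class, so all models can be scored on the same binary question.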

Results

The results showed that the four-class model outperformed the two-class model, in which all non-reported transactions were merged into one class, particularly when evaluated on a realistic test set that included normal transactions. The team also found that including both levels of legitimate transactions in the training process improved the model’s ability to distinguish between reported and non-reported transactions.

Limitations and Recommendations

However, the researchers cautioned that the PPP measure should be used with care because it is not a proper scoring rule. They also emphasized the importance of considering multiple performance metrics to get a comprehensive understanding of a predictive model’s effectiveness.

Implications for Fraud Detection

The study’s findings have significant implications for organizations seeking to improve their fraud detection capabilities. By classifying non-reported transactions into three or four categories, models can better capture subtle differences in transaction patterns and behaviors, leading to more accurate predictions and improved decision-making.