Predictive Modeling for Detecting Reported Financial Transactions: A Research Overview

Introduction

This research aims to develop a predictive model that distinguishes reported from non-reported financial transactions. The model is evaluated with three performance metrics: the Brier score, AUC (Area Under the Curve), and PPP (Proportion of Positive Predictions).

Understanding the Problem Context

Distinguishing between Reported and Non-Reported Transactions

The research focuses on distinguishing between two classes:

Reported transactions (D): This class includes the financial transactions that have been reported.
Non-reported transactions (A + B + C): This class combines the transactions that were not reported, including alerts/cases that did not lead to reporting.
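
The two-class setup can be sketched as a mapping from the class codes to a binary label. This is only an illustration: it assumes that reported transactions (D) are the positive class, and the function name to_binary_label is hypothetical.

```python
# Class codes as used in the text: D = reported, A/B/C = non-reported.
# Assumption: reported transactions are treated as the positive class (label 1).
REPORTED = {"D"}
NON_REPORTED = {"A", "B", "C"}


def to_binary_label(transaction_class: str) -> int:
    """Return 1 for a reported transaction (D) and 0 for a non-reported one (A, B, C)."""
    if transaction_class in REPORTED:
        return 1
    if transaction_class in NON_REPORTED:
        return 0
    raise ValueError(f"unknown transaction class: {transaction_class!r}")


# Small illustrative sample of class codes (made up for this sketch).
labels = [to_binary_label(c) for c in ["A", "D", "C", "B", "D"]]
```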

Performance Metrics

The research utilizes three key performance metrics:

1. Brier Score

A proper scoring rule that measures the mean squared error between the predicted probabilities and the observed binary outcomes. Lower values are better, and a score of 0 corresponds to perfect predictions.
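
A minimal sketch of the Brier score computation follows. It assumes binary outcomes coded as 1 (reported) and 0 (non-reported), and the toy values are made up for illustration.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example: four transactions with their predicted probability of being reported.
y_true = [1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.1, 0.6]
score = brier_score(y_true, y_prob)  # (0.01 + 0.04 + 0.01 + 0.16) / 4 = 0.055
```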

2. AUC (Area Under the Curve)

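The area under the ROC curve, which summarizes how well the model ranks positive cases above negative ones. Values lie between 0 and 1, where 0.5 corresponds to random ranking and higher values indicate better performance.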
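
One way to compute AUC, assuming it refers to the area under the ROC curve, is the pairwise formulation: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The sketch below uses made-up scores and a quadratic loop that is only suitable for small examples.

```python
def roc_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: two positives and three negatives.
y_true = [1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.6, 0.1]
auc = roc_auc(y_true, y_score)  # 5 of the 6 pairs are ordered correctly
```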

3. PPP (Proportion of Positive Predictions)

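The proportion of transactions that must be predicted positive in order to reach a given TPR (True Positive Rate) level. Lower values are better, since fewer transactions need to be flagged to capture the same share of true positives.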
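
The PPP idea can be sketched as follows: rank transactions by predicted score, flag them from the top down, and stop as soon as the target TPR is reached. The function name ppp_at_tpr and the data are hypothetical, and the sketch assumes distinct scores (tied scores are simply taken in sorted order).

```python
def ppp_at_tpr(y_true, y_score, target_tpr):
    """Smallest share of top-scored transactions needed to reach the target TPR."""
    if not 0 < target_tpr <= 1:
        raise ValueError("target_tpr must be in (0, 1]")
    n = len(y_true)
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive case is required")
    # Highest scores first; each entry is (score, label).
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    found = 0
    for k, (_, y) in enumerate(ranked, start=1):
        found += y
        if found / n_pos >= target_tpr:
            return k / n
    return 1.0  # not reached for valid input; kept as a safe fallback


# Toy example: ten transactions, three of them positive.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_score = [0.95, 0.80, 0.10, 0.70, 0.30, 0.20, 0.05, 0.40, 0.15, 0.25]
ppp_60 = ppp_at_tpr(y_true, y_score, 0.6)   # top 3 of 10 capture 2 of 3 positives
ppp_100 = ppp_at_tpr(y_true, y_score, 1.0)  # top 4 of 10 capture all positives
```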

Experimental Design

Types of Legitimate Transactions

The research compares models trained on different combinations of legitimate transaction classes, including:

Normal transactions: This class includes typical financial transactions.
Alerts/Cases not leading to reporting: This class includes transactions that triggered an alert or case but did not result in reporting.
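
The comparison of training setups might be organized as in the sketch below. Everything here is an assumption made for illustration: the group names, the row layout, and the three setups are hypothetical and do not describe the actual experimental configuration.

```python
# Hypothetical training setups: which non-reported groups are included
# alongside the reported transactions when fitting a model.
TRAINING_SETUPS = {
    "normal_only": {"normal"},
    "alerts_cases_only": {"alert_or_case"},
    "normal_and_alerts_cases": {"normal", "alert_or_case"},
}


def select_training_rows(rows, non_reported_groups):
    """Keep reported rows plus the rows from the chosen non-reported groups."""
    keep = set(non_reported_groups) | {"reported"}
    return [row for row in rows if row["group"] in keep]


# Made-up rows; a real data set would also carry the model features.
rows = [
    {"id": 1, "group": "normal"},
    {"id": 2, "group": "alert_or_case"},
    {"id": 3, "group": "reported"},
    {"id": 4, "group": "normal"},
]
training_sets = {
    name: select_training_rows(rows, groups)
    for name, groups in TRAINING_SETUPS.items()
}
sizes = {name: len(subset) for name, subset in training_sets.items()}
```

Each entry of training_sets would then be used to fit one model, and all models would be scored on the same test set so that the metrics are comparable.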

Performance Evaluation

The models are compared using the Brier score, AUC, and PPP. Some results are presented twice: once for all transactions in the test set and once for the alerted transactions only.
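
Scoring the same model on the full test set and on the alerted subset could look like the sketch below. The alerted flag, the row layout, and the helper names are hypothetical, and the Brier score stands in for any of the three metrics.

```python
def brier(rows):
    """Brier score over rows with an outcome 'y' (0/1) and a predicted probability 'p'."""
    return sum((r["p"] - r["y"]) ** 2 for r in rows) / len(rows)


def evaluate_all_and_alerted(rows, metric):
    """Apply a metric to all test rows and to the alerted rows only."""
    alerted = [r for r in rows if r["alerted"]]
    return {"all": metric(rows), "alerted": metric(alerted)}


# Made-up test rows for illustration.
test_rows = [
    {"y": 1, "p": 0.8, "alerted": True},
    {"y": 0, "p": 0.1, "alerted": False},
    {"y": 0, "p": 0.4, "alerted": True},
    {"y": 0, "p": 0.0, "alerted": False},
]
results = evaluate_all_and_alerted(test_rows, brier)
```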

Conclusion

This research aims to develop a predictive model that distinguishes reported from non-reported financial transactions, evaluated with the Brier score, AUC, and PPP. The experimental design compares models trained on different combinations of legitimate transaction classes, which shows how including each class affects model performance.