Fraud Detection System for Banking Transactions
2026-04-09 • Machine Learning
AI summary
The authors study how to better detect fraudulent transactions in online payment systems, which are hard to spot because fraud strategies keep changing and fraudulent transactions are much rarer than genuine ones. They use a synthetic dataset called PaySim and follow a step-by-step data science process (CRISP-DM) to explore the data and refine its features. The authors compare machine learning methods such as Logistic Regression and Random Forest, and handle the rarity of fraud cases with SMOTE, a technique that balances the data by generating synthetic minority-class examples. They also tune model hyperparameters to improve results. Their approach aims to help financial technology systems catch fraud more effectively and reliably.
fraud detection, digital payments, imbalanced data, SMOTE, Logistic Regression, Random Forest, XGBoost, hyperparameter tuning, GridSearchCV, CRISP-DM
Authors
Ranya Batsyas, Ritesh Yaduwanshi
Abstract
The expansion of digital payment systems has increased both the scale and the complexity of online financial transactions, and with them the exposure to fraud. Effective fraud detection is complicated by evolving attack strategies and by severe class imbalance: fraudulent transactions are far rarer than genuine ones. This research introduces a machine learning-based fraud detection framework built on the PaySim synthetic financial transaction dataset. Following the CRISP-DM methodology, the study covers hypothesis-driven exploratory analysis, feature refinement, and a comparative assessment of baseline models such as Logistic Regression against tree-based classifiers such as Decision Tree, Random Forest, and XGBoost. Class imbalance is addressed with SMOTE, and model performance is improved through hyperparameter tuning with GridSearchCV. The proposed framework aims to provide a robust and scalable solution for fraud prevention in FinTech transaction systems.
Keywords: fraud detection, imbalanced data, HPO, SMOTE
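To make the SMOTE step concrete, the following is a minimal, standard-library sketch of the core idea: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its k nearest minority neighbours. It is an illustration only, not the authors' implementation. The function name `smote_oversample`, its parameters, and the toy feature values are assumptions introduced here. A real pipeline would more likely use a library implementation such as the `SMOTE` class in imbalanced-learn.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric feature vectors (fraud class only).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (Euclidean distance)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data: three fraud rows with two made-up numeric features
fraud = [[1.0, 2.0], [1.5, 2.5], [3.0, 1.0]]
new_rows = smote_oversample(fraud, n_new=4, k=2)
```

Because every synthetic row is a convex combination of two real fraud rows, it stays inside the range spanned by the observed minority samples. Oversampling of this kind is normally applied to the training split only, so that validation and test data keep the original class ratio.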