An Interpretable and Drift-Aware AI Framework for Real-Time Financial Fraud Detection in Large-Scale Transaction Systems
DOI: https://doi.org/10.66280/ijair.v1i1.7

Abstract
Real-time fraud detection in payment and banking infrastructures is constrained as much by operating conditions as by model capacity. Effective systems must separate rare fraudulent activity from a dominant legitimate population, remain reliable as adversaries adapt (concept drift), and deliver decisions within strict latency budgets. This paper presents a deployable fraud detection framework for large-scale transaction streams that couples a low-latency gradient-boosted decision tree (GBDT) scorer with graph-derived relational signals, and embeds the resulting model within an explainability and governance layer designed for auditability.
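The abstract does not specify how the graph-derived relational signals are computed. As a purely illustrative sketch of one such signal that a low-latency GBDT scorer could consume, the following maintains a distinct-counterparty count per account over a sliding time window; the class, field names, and one-hour window are assumptions, not the paper's implementation:

```python
from collections import defaultdict, deque

class CounterpartyDegree:
    """Distinct-counterparty count per source account over a sliding
    time window -- an illustrative graph-derived feature, not the
    paper's actual design."""

    def __init__(self, window_seconds=3600):
        self.window = window_seconds
        self.recent = defaultdict(deque)  # account -> deque of (ts, counterparty)

    def update_and_score(self, ts, src, dst):
        q = self.recent[src]
        q.append((ts, dst))
        # Evict edges older than the window (assumes non-decreasing ts).
        while q and ts - q[0][0] > self.window:
            q.popleft()
        # Feature value: number of distinct counterparties in the window.
        return len({cp for _, cp in q})
```

A feature defined this way can be computed incrementally in the stream and replayed identically from transaction logs, which is the kind of online/offline parity the pipeline description calls for.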
We describe the end-to-end pipeline (stream ingestion, feature computation with online/offline parity, model training and calibration, online serving, and continuous monitoring) and evaluate the approach on anonymized, benchmark-style transaction data using time-sliced splits to approximate production drift. The empirical results show consistent, incremental gains over representative baselines in AUC and in false positive rate at fixed recall, while preserving decision evidence suitable for operational review. Practical implications for transaction trust, loss mitigation, and the resilience of digital financial infrastructure are discussed.
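The evaluation protocol sketched above (time-sliced splits, false positive rate at fixed recall) can be made concrete in a few lines. This is a generic sketch of those two ideas, not code from the paper; the `ts` field name and the 0.9 recall target are illustrative:

```python
def time_sliced_split(transactions, cutoff_ts):
    """Time-ordered split: train strictly before the cutoff, test at or
    after it, so held-out data reflects later (possibly drifted) traffic."""
    train = [t for t in transactions if t["ts"] < cutoff_ts]
    test = [t for t in transactions if t["ts"] >= cutoff_ts]
    return train, test

def fpr_at_fixed_recall(scores, labels, target_recall=0.9):
    """False positive rate at the loosest threshold that reaches the
    target recall (labels: 1 = fraud, 0 = legitimate)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    for _, y in sorted(zip(scores, labels), reverse=True):
        tp += y
        fp += 1 - y
        if tp / n_pos >= target_recall:
            return fp / n_neg
    return 1.0  # target recall unreachable at any threshold
```

Reporting FPR at a fixed recall, rather than AUC alone, ties the metric directly to operational cost: recall fixes the fraction of fraud caught, and FPR then measures how many legitimate transactions are flagged to achieve it.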
Published: 2026-02-04

Versions
- 2026-03-02 (3)
- 2026-02-04 (2)
- 2026-02-04 (1)
License
Copyright (c) 2026 International Journal of Artificial Intelligence Research

This work is licensed under a Creative Commons Attribution 4.0 International License.