An Interpretable and Drift-Aware AI Framework for Real-Time Financial Fraud Detection in Large-Scale Transaction Systems
DOI: https://doi.org/10.66280/ijair.v1i1.7

Abstract
Real-time fraud detection in payment and banking infrastructures is constrained as much by operating conditions as by model capacity. Effective systems must separate rare fraudulent activity from a dominant legitimate population, remain reliable as adversaries adapt (concept drift), and deliver decisions within strict latency budgets. This paper presents a deployable fraud detection framework for large-scale transaction streams that couples a low-latency gradient-boosted decision tree (GBDT) scorer with graph-derived relational signals, and embeds the resulting model within an explainability and governance layer designed for auditability.
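The abstract does not specify how the graph-derived relational signals are computed. As a purely illustrative sketch of one such signal that a low-latency GBDT scorer could consume, the following maintains a distinct-counterparty count per account over a sliding time window; the class, field names, and one-hour window are assumptions, not the paper's implementation:

```python
from collections import defaultdict, deque

class CounterpartyDegree:
    """Distinct-counterparty count per source account over a sliding
    time window -- an illustrative graph-derived feature, not the
    paper's actual design."""

    def __init__(self, window_seconds=3600):
        self.window = window_seconds
        self.recent = defaultdict(deque)  # account -> deque of (ts, counterparty)

    def update_and_score(self, ts, src, dst):
        q = self.recent[src]
        q.append((ts, dst))
        # Evict edges older than the window (assumes non-decreasing ts).
        while q and ts - q[0][0] > self.window:
            q.popleft()
        # Feature value: number of distinct counterparties in the window.
        return len({cp for _, cp in q})
```

A feature defined this way can be computed incrementally in the stream and replayed identically from transaction logs, which is the kind of online/offline parity the pipeline description calls for.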
We describe the end-to-end pipeline (stream ingestion, feature computation with online/offline parity, model training and calibration, online serving, and continuous monitoring) and evaluate the approach on anonymized, benchmark-style transaction data using time-sliced splits to approximate production drift. The empirical results show consistent, incremental gains over representative baselines in AUC and in false positive rate at fixed recall, while preserving decision evidence suitable for operational review. Practical implications for transaction trust, loss mitigation, and the resilience of digital financial infrastructure are discussed.
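The evaluation protocol sketched above (time-sliced splits, false positive rate at fixed recall) can be made concrete in a few lines. This is a generic sketch of those two ideas, not code from the paper; the `ts` field name and the 0.9 recall target are illustrative:

```python
def time_sliced_split(transactions, cutoff_ts):
    """Time-ordered split: train strictly before the cutoff, test at or
    after it, so held-out data reflects later (possibly drifted) traffic."""
    train = [t for t in transactions if t["ts"] < cutoff_ts]
    test = [t for t in transactions if t["ts"] >= cutoff_ts]
    return train, test

def fpr_at_fixed_recall(scores, labels, target_recall=0.9):
    """False positive rate at the loosest threshold that reaches the
    target recall (labels: 1 = fraud, 0 = legitimate)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    for _, y in sorted(zip(scores, labels), reverse=True):
        tp += y
        fp += 1 - y
        if tp / n_pos >= target_recall:
            return fp / n_neg
    return 1.0  # target recall unreachable at any threshold
```

Reporting FPR at a fixed recall, rather than AUC alone, ties the metric directly to operational cost: recall fixes the fraction of fraud caught, and FPR then measures how many legitimate transactions are flagged to achieve it.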
Published: 2026-02-04

Versions
- 2026-03-02 (3)
- 2026-02-04 (2)
- 2026-02-04 (1)
License
Copyright (c) 2026 International Journal of Artificial Intelligence Research

This work is licensed under a Creative Commons Attribution 4.0 International License.