Reinforcement Learning-Based Portfolio Management with Financial Time-Series Forecasting

Authors

  • Bryan Brooks, School of Systems Engineering, Stevens Institute of Technology
  • Paul Blackwood, Department of Finance and Quantitative Methods, University of Kentucky

Keywords:

Reinforcement Learning, Portfolio Management, Financial Time-Series, Systems Engineering, Algorithmic Governance, Socio-Technical Infrastructure, Sustainability.

Abstract

The integration of Reinforcement Learning (RL) into portfolio management signifies a shift from passive predictive modeling to autonomous, goal-oriented decision-making within complex financial ecosystems. Traditional portfolio optimization strategies often separate the forecasting of financial time-series from the act of capital allocation, leading to structural inefficiencies and a failure to account for transaction costs, market impact, and temporal dependencies. This paper presents a systemic investigation into a unified Reinforcement Learning framework that treats portfolio management as a continuous control problem. We examine the architectural trade-offs between value-based and policy-gradient methods, emphasizing the importance of state-space representation in capturing non-stationary market dynamics. The research explores the socio-technical infrastructures required for large-scale deployment, addressing the physical requirements of low-latency compute and the data governance frameworks necessary for institutional adoption. Furthermore, we analyze the critical dimensions of environmental sustainability in high-compute financial AI, the ethical imperatives of fairness in automated asset allocation, and the policy implications of algorithmic market convergence. By synthesizing perspectives from systems engineering, behavioral finance, and public policy, this work provides a comprehensive roadmap for developing resilient and socially responsible RL-based investment systems. We conclude that while Reinforcement Learning offers a superior capacity for navigating volatile markets, its success is contingent upon a holistic approach to system robustness and a commitment to transparent algorithmic governance.

Published

2026-03-19 — Updated on 2026-03-26

How to Cite

Bryan Brooks, & Paul Blackwood. (2026). Reinforcement Learning-Based Portfolio Management with Financial Time-Series Forecasting. International Journal of Artificial Intelligence Research, 1(1). Retrieved from https://www.isipress.org/index.php/IJAIR/article/view/93