Stock portfolio optimization using Deep Q Reinforcement Learning strategy based on State-Action matrix
This paper optimizes a stock portfolio using a Deep Q reinforcement learning strategy based on the state-action matrix. To that end, the performance of the Deep Q reinforcement learning strategy was compared with that of the passive buy-and-hold strategy under bullish and bearish market conditions over the period 2017-2021. The statistical population comprised 672 companies listed on the Tehran Stock Exchange, of which 7 were selected as the statistical sample. The comparison shows that, in both bullish and bearish markets, the reinforcement learning strategy has high profit potential in the Iranian stock market, whereas the buy-and-hold strategy led to losses. Based on these results, it is suggested that brokers, stock exchange companies, and analysts use the reinforcement learning strategy for profitability and stock portfolio optimization. The comparison also indicates that reinforcement learning is better suited to investors who lack the high risk tolerance that the buy-and-hold approach requires.
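As a rough illustration of the two strategies being compared, the sketch below shows a single Q-learning update on a small state-action matrix next to a buy-and-hold return calculation. It is not the paper's implementation: the action set, the two market states, the reward, and the values of `alpha` and `gamma` are all assumptions chosen for the example, and a Deep Q approach would approximate the matrix with a neural network rather than store it as a table.

```python
from typing import List

# Hypothetical action set; the abstract does not specify the actions used.
ACTIONS = ("sell", "hold", "buy")


def buy_and_hold_return(prices: List[float]) -> float:
    """Return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


def q_update(
    q: List[List[float]],
    state: int,
    action: int,
    reward: float,
    next_state: int,
    alpha: float = 0.1,   # learning rate (assumed value)
    gamma: float = 0.9,   # discount factor (assumed value)
) -> float:
    """Apply one Q-learning step to the state-action matrix q[state][action]."""
    best_next = max(q[next_state])              # greedy value of the next state
    td_target = reward + gamma * best_next      # bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Two hypothetical states (0 = bearish, 1 = bullish) and three actions.
q = [[0.0] * len(ACTIONS) for _ in range(2)]

# Illustrative transition: "buy" in a bearish state earns reward 1.0
# and moves the market to the bullish state.
new_value = q_update(q, state=0, action=2, reward=1.0, next_state=1)

# Illustrative price path for the passive benchmark.
passive_return = buy_and_hold_return([100.0, 90.0])
```

With an all-zero starting matrix, the update moves the chosen entry a fraction `alpha` of the way toward the reward, while the passive benchmark depends only on the first and last prices.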