Open Access Paper
Price movement prediction using deep learning: a case study of the China futures market
28 December 2022
Zelin Wang, Ya Li
Proceedings Volume 12506, Third International Conference on Computer Science and Communication Technology (ICCSCT 2022); 125066Q (2022) https://doi.org/10.1117/12.2662524
Event: International Conference on Computer Science and Communication Technology (ICCSCT 2022), 2022, Beijing, China
Abstract
Financial time series prediction has long been a difficult problem because of market uncertainty, and it has attracted attention from both industry and academia. In recent years, deep learning has performed well in many fields, and a growing number of researchers are applying it to financial markets. This paper presents the complete modeling process for price movement prediction. Using high-frequency limit order book (LOB) data, we propose an improved deep learning model that combines the local feature extraction ability of a Convolutional Neural Network (CNN) with the sequential feature extraction ability of Long Short-Term Memory (LSTM), and we evaluate it on rebar (RB) dominant contracts in the China futures market. The experimental results show that the model predicts better than standalone CNN and LSTM models. In backtesting, trading on the predictions of the proposed model yields significantly higher returns than trading on those of the other models.

1. INTRODUCTION

Predicting the price trend of financial assets is a meaningful and challenging research problem that has attracted increasing attention in academia in recent years [1]. Accurate trend predictions are valuable to investors, who can base profitable decisions on them.

Financial markets are subject to many uncertain factors, such as unexpected events and policy adjustments, so financial time series are non-stationary and have a low signal-to-noise ratio. Some researchers model and forecast the future prices of financial assets with traditional time series models such as ARIMA [2] and GARCH [3]. However, the prediction performance of these methods leaves much room for improvement.

With the introduction of electronic trading, trading volume in financial assets has increased greatly, and exchanges now hold huge amounts of historical data. In recent years, as computing power has improved, deep learning has solved many problems in areas such as speech recognition [4] and image captioning [5], demonstrating a strong capacity for processing large datasets. Researchers apply deep learning models to large volumes of historical data to predict future prices and generate trading signals [1], but processing so much data is not easy, especially in the China futures market. Fully mining the useful information in this data can reveal profitable opportunities and help hedge against volatile markets.

In this paper, we propose a new deep learning model that combines CNN and LSTM networks to predict mid-price movements in the China futures market more accurately from limit order book data. In backtesting, trading on the predictions of the proposed model yields significantly higher returns than trading on those of the other models.

The paper is structured as follows. Section 2 briefly reviews related work on deep learning for financial time series forecasting. Section 3 describes the limit order book data used. Section 4 introduces our new deep learning model. Section 5 analyses the experiments and results. Finally, Section 6 gives our conclusions and future work.

2. RELATED WORK

Deep learning can capture key features from massive amounts of data, and more and more researchers are applying it to forecasting the price movements of financial assets, which include stocks, futures and options. Stock price forecasting in particular has attracted much attention and produced many results in academia [6].

LSTM and CNN are classical deep learning models and are the ones most often applied to price forecasting [7-13]. Chen et al. use LSTM to predict stock price movements in China. Although the accuracy of their model is not high, this initial effort demonstrates the power of LSTM in the China stock market and suggests that deep learning for stock prices has much untapped potential [7]. Fischer et al. deploy LSTM networks on S&P 500 stocks to predict out-of-sample directional movements and find that LSTM performs much better than logistic regression and deep neural networks [9]. Gong et al. construct a CNN-based three-class prediction model to forecast AAPL, showing that technical indicators can improve the model's performance [14]. Recently, several hybrid models combining CNN and LSTM have been proposed [15-18]; these capture key features to improve forecasting accuracy.

With the introduction of electronic trading, huge amounts of transaction data are generated. Some researchers analyse future trends in financial asset prices using limit order book (LOB) data [11,13,19-24], currently the most detailed data available in the market and the richest record of trade information. Kercheval et al. use a support vector machine to predict the short-term price trend of stocks by mining historical information in the limit order book [19]. Tsantekidis et al. propose a deep learning methodology for predicting short-term stock price movements and show that it performs better than a support vector machine and a multilayer perceptron [11]. Zhang et al. develop a large-scale deep learning model called DeepLOB that takes LOB data as input [13] and achieves higher accuracy on the benchmark LOB dataset [25]. Tran et al. propose the Temporal Attention-Augmented Bilinear Network (TABL), based on the Bilinear Network (BL) [26], and show that it performs much better than all existing advanced models while requiring less computation [22].

As far as we know, there is relatively little academic research on deep learning models for predicting price trends in the China futures market. To our knowledge, this is the first work to propose a deep learning model that combines CNN and LSTM, operates on raw LOB data, and forecasts the direction of price movements in the China futures market for the purpose of trading. This article can serve as a reference for traders who want to conduct quantitative trading with artificial intelligence.

3. DATASET

We choose rebar (RB) futures contracts, which are very popular with investors, as the research object. RB contracts are characterized by high price volatility, large trading volume, and high investment risk. Our dataset consists of LOB data for the RB dominant contracts, which have the highest trading volume among the listed contracts; trading dominant contracts therefore reduces slippage effectively.

3.1 Limit order book data

We first briefly introduce some basic concepts of a limit order book (LOB). With the development of information technology, more and more exchanges use limit order books to match orders automatically. A LOB holds two kinds of limit orders: ask orders and bid orders. Posting an ask (bid) order means that a trader wants to sell (buy) a specific amount of a financial asset at a certain price or higher (lower). If a submitted limit order is not executed immediately, it is recorded in the LOB, where it waits for a suitable order on the other side to match.

The LOB prioritizes the orders on each side by price and groups them into levels, so that the best ask and bid orders sit at the top level of their respective sides with the highest transaction priority. The lower (higher) the ask (bid) price, the higher the transaction priority. When a new limit order arrives, the LOB reorders all orders. Among orders with the same price, the one submitted earlier has the higher priority. The ask and bid orders at the different levels reflect the supply of and demand for the asset at any time, so a limit order book contains abundant market microstructure information that correlates strongly with future price movements and awaits suitable models to exploit it fully. For more details on LOBs, please refer to [27].
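The price-time priority rule described above can be sketched in a few lines of Python. This is only an illustration, not an exchange's matching engine: the `Order` class, the `sort_book` helper, the arrival sequence number `seq`, and the prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    side: str    # "ask" or "bid"
    price: float
    volume: int
    seq: int     # hypothetical arrival sequence number; smaller means earlier


def sort_book(orders, side):
    """Return the resting orders of one side in matching-priority order."""
    same_side = [o for o in orders if o.side == side]
    if side == "ask":
        # Lower ask price first, then earlier arrival.
        return sorted(same_side, key=lambda o: (o.price, o.seq))
    # Higher bid price first, then earlier arrival.
    return sorted(same_side, key=lambda o: (-o.price, o.seq))


# Made-up orders for illustration only.
orders = [
    Order("ask", 3702.0, 5, seq=1),
    Order("ask", 3701.0, 2, seq=2),
    Order("ask", 3701.0, 7, seq=3),
    Order("bid", 3699.0, 4, seq=4),
    Order("bid", 3700.0, 1, seq=5),
]
best_ask = sort_book(orders, "ask")[0]
best_bid = sort_book(orders, "bid")[0]
```

Here the two asks at 3701.0 outrank the ask at 3702.0, and between them the one that arrived first comes first.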

We select RB dominant contracts from 2019-07-05 to 2019-11-01, a period of about four months. The limit order book data provided by China futures exchanges is snapshot data sampled every 500 milliseconds and contains 5 price levels. Only daytime trading data (RB contracts open at 9 a.m. and close at 3 p.m.) is used to construct the dataset, giving about 27,000 snapshots per day and more than 1.7 million samples in total over the four months. To keep the dataset stationary, the first 2 minutes after the open and the last 2 minutes before the close are removed. We take the raw LOB data directly as the model input, so at time t there are 20 features, as follows:

$$x_t = \left[\, p_a^{(i)}(t),\; v_a^{(i)}(t),\; p_b^{(i)}(t),\; v_b^{(i)}(t) \,\right]_{i=1}^{5}$$

where i denotes the i-th level of the limit order book, and $p_a^{(i)}(t)$ ($p_b^{(i)}(t)$) and $v_a^{(i)}(t)$ ($v_b^{(i)}(t)$) represent the price and volume of the ask (bid) order at the i-th level, respectively.

3.2 Normalization and labelling

The prices of RB dominant contracts can vary enormously, so prices at different times may not be on the same numerical scale, which makes it harder for deep learning models to learn features common to the training and test datasets. A proper normalization method is therefore required to make the dataset numerically stable. Global standardization does not change the relative size of data values at different times, so we instead standardize the data of each day with z-score normalization, using the mean and variance of the corresponding features over the previous 5 trading days. After this process the numerical range is reduced and the dataset is more numerically stable.
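A minimal sketch of this per-day normalization, assuming each trading day is stored as a NumPy array of shape (snapshots, features). The function name `normalize_day`, the pooling of all snapshots from the 5 previous days into one mean and standard deviation per feature, and the guard for constant features are assumptions for illustration, not details given in the paper.

```python
import numpy as np


def normalize_day(days, d, lookback=5):
    """Z-score day `d` using statistics of the `lookback` days before it.

    days: list of arrays, one per trading day, each (n_snapshots, n_features).
    """
    if d < lookback:
        raise ValueError("not enough history to normalize this day")
    # Pool every snapshot of the previous `lookback` days (assumption).
    history = np.concatenate(days[d - lookback:d], axis=0)
    mean = history.mean(axis=0)
    std = history.std(axis=0)
    # Avoid division by zero for a feature that never changed (assumption).
    std = np.where(std == 0, 1.0, std)
    return (days[d] - mean) / std


# Synthetic data: 7 days, 50 snapshots per day, 3 features.
rng = np.random.default_rng(0)
days = [rng.normal(100.0 + i, 2.0, size=(50, 3)) for i in range(7)]
normalized = normalize_day(days, d=5)
```

Because the statistics come only from earlier days, the normalization of a given day never uses information from that day or from the future.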

We use the mid-price to create labels that represent the direction of price movements. The mid-price at time t is defined as the mean of the best ask price and the best bid price:

$$p_t = \frac{p_a^{(1)}(t) + p_b^{(1)}(t)}{2}$$

where $p_a^{(1)}(t)$ and $p_b^{(1)}(t)$ are the best (level-1) ask and bid prices at time t.

The mid-price is a virtual price, since traders cannot trade at it. Because it lies between the best ask price and the best bid price, its movement reflects the trend of the market accurately. Our task is therefore to predict mid-price movements.

Simply comparing $p_t$ with $p_{t+k}$ to obtain the direction of the price movement produces very noisy labels, because financial data is itself very noisy. To filter this noise from the labels, we use the following smoothing method. We denote the mean of the previous k mid-prices by $m_-(t)$ and the mean of the next k mid-prices by $m_+(t)$:

$$m_-(t) = \frac{1}{k}\sum_{i=0}^{k-1} p_{t-i}$$
$$m_+(t) = \frac{1}{k}\sum_{i=1}^{k} p_{t+i}$$

Then the label at time t, denoted by $l_t$, is defined as:

[Equation image not recovered: definition of $l_t$ from the means of the previous and next k mid-prices.]

After this smoothing, a small number of stationary labels ($l_t = 0$) remain scattered among the others. We use the K-nearest neighbor method to reassign each stationary label to the upward or downward class, which makes the labels contiguous. Consequently, the dataset has only two labels, upward and downward, in a ratio of nearly 1:1.
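The labelling steps can be sketched as follows. Several choices here are assumptions, since the exact formulas are not recoverable from the text: the "previous k" window is taken to end at time t, the "next k" window to start at t+1, the label is the plain sign of the difference between the two means with no threshold, and the K-nearest neighbor step is replaced by a simple stand-in that copies the nearest non-stationary label in time (K = 1). The function names are hypothetical.

```python
import numpy as np


def smoothed_labels(mid, k):
    """Label time t by comparing the mean of k prices up to t with the next k."""
    mid = np.asarray(mid, dtype=float)
    n = len(mid)
    labels = np.zeros(n, dtype=int)
    valid = np.zeros(n, dtype=bool)   # True where both windows are complete
    for t in range(k - 1, n - k):
        m_minus = mid[t - k + 1:t + 1].mean()   # k prices ending at t
        m_plus = mid[t + 1:t + k + 1].mean()    # k prices starting at t + 1
        labels[t] = int(np.sign(m_plus - m_minus))  # +1 up, -1 down, 0 stationary
        valid[t] = True
    return labels, valid


def fill_stationary(labels):
    """Replace each 0 label with the nearest non-zero label in time (1-NN stand-in)."""
    labels = np.asarray(labels).copy()
    nonzero = np.flatnonzero(labels)
    if nonzero.size == 0:
        return labels
    for t in np.flatnonzero(labels == 0):
        nearest = nonzero[np.argmin(np.abs(nonzero - t))]  # ties go to the earlier index
        labels[t] = labels[nearest]
    return labels


# Made-up mid-prices: a rise followed by a fall.
mid = [10, 10, 10, 11, 12, 12, 12, 11, 10, 10]
labels, valid = smoothed_labels(mid, k=2)
```

Only the positions where `valid` is true have both windows available; in practice the first and last samples of each day would be dropped rather than labelled.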

4. PROPOSED METHOD

4.1 Overview

Our model comprises two building modules, as shown in Figure 1: a spatial feature extraction module built from CNN layers and a temporal feature extraction module built from an LSTM. We use the 300 most recent states of the LOB as the input of our model to forecast the direction of the price movement 10 minutes ahead. An input sample is therefore a 2-D panel with a time dimension and a feature dimension. To fully exploit the features in the raw data, each dimension needs an appropriate model to capture its dependencies. We combine CNN and LSTM into a novel model specialized for LOB data: CNNs are suited to extracting local features and finding spatial relationships between different types of features, while LSTMs are good at capturing long-term temporal correlations.

Figure 1. The architecture of the proposed model combining CNN and LSTM.

4.2 Details of building modules

The spatial feature extraction module consists of 6 convolutional layers: 3 horizontal convolution layers and 3 vertical convolution layers, interleaved with each other. The horizontal layers find dependencies between different features, while a vertical layer after each horizontal convolution captures short-term temporal correlations. The first horizontal layer finds the correlations between the price and the volume at the same level and side, the second captures the dependencies between the ask side and the bid side at the same level, and the last finds the relationships among the 5 levels of the LOB.

The temporal feature extraction module further processes the features extracted by the spatial feature extraction module. The features in the different channels of the last convolutional layer at the same time are assembled into the feature vector of one LSTM time step. The output layer is a fully connected layer with a softmax activation function, so the model outputs the probability of each class.

5. EXPERIMENTS AND RESULTS

We train the model using cross-validation, as illustrated in Figure 2, and measure performance as the mean recall, precision, and F1 score over all folds. To explore how the depth of the limit order book influences model performance, we use 1, 2, 3, 4, and 5 levels of the LOB in turn as the model input. Finally, we examine the trading results of the proposed model and compare it with simpler models: MLP, Bilinear Networks (BL), CNN, and LSTM.

Figure 2. Multiple tasks are generated through cross-validation.
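One plausible way to generate such tasks for time-ordered data is a walk-forward split over trading days, sketched below with the 15-day training and 5-day test windows used in the experiments. Whether the folds roll forward in this way, and the 5-day step between folds, are assumptions, since the figure is not reproduced here; `rolling_splits` is a hypothetical helper.

```python
def rolling_splits(n_days, train_days=15, test_days=5, step=5):
    """Yield (train, test) lists of day indices for walk-forward evaluation."""
    start = 0
    while start + train_days + test_days <= n_days:
        train = list(range(start, start + train_days))
        test = list(range(start + train_days, start + train_days + test_days))
        yield train, test
        start += step   # assumed shift between consecutive folds


# Example with 30 trading days.
splits = list(rolling_splits(n_days=30))
for train, test in splits:
    print(f"train days {train[0]}-{train[-1]}, test days {test[0]}-{test[-1]}")
```

Each test window lies strictly after its training window, so no fold is evaluated on data that precedes what it was trained on.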

5.1 Experiment settings

During cross-validation, we use 15 days of data as the training set and 5 days of data as the test set. We use the most recent 300 states as one input sample to predict the price movement 10 minutes ahead, so the size of the input is 300 × 20. The kernel sizes of the 3 horizontal convolutional layers are 1 × 2, 1 × 2 and 1 × 5, with corresponding strides of 1 × 2, 1 × 2 and 1 × 1. The kernel size of every vertical convolutional layer is 4 × 1. All convolutional layers have 8 channels. A Leaky ReLU activation function follows every convolutional layer. The LSTM network has 16 hidden units and is followed by a fully connected layer with 2 neurons and a softmax activation function. We use the Adam optimizer with a learning rate of 1 × 10^-5.
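To see how these kernel sizes and strides reduce the 300 × 20 input, the sketch below traces the tensor shape through the six convolutional layers. It assumes no padding, a stride of 1 for the vertical layers, and that each vertical layer directly follows a horizontal one; the paper does not state the padding or the vertical stride, so the time dimension in particular may differ from the authors' implementation.

```python
def conv_out(size, kernel, stride=1):
    """Output length of an unpadded convolution along one axis."""
    return (size - kernel) // stride + 1


# (name, (kernel_h, kernel_w), (stride_h, stride_w)); h = time, w = features.
LAYERS = [
    ("horizontal 1", (1, 2), (1, 2)),  # price with volume, same level and side
    ("vertical 1",   (4, 1), (1, 1)),
    ("horizontal 2", (1, 2), (1, 2)),  # ask side with bid side, same level
    ("vertical 2",   (4, 1), (1, 1)),
    ("horizontal 3", (1, 5), (1, 1)),  # across the 5 levels
    ("vertical 3",   (4, 1), (1, 1)),
]


def trace_shapes(height=300, width=20, channels=8):
    """Return (name, channels, height, width) after each layer."""
    shapes = []
    h, w = height, width
    for name, (kh, kw), (sh, sw) in LAYERS:
        h, w = conv_out(h, kh, sh), conv_out(w, kw, sw)
        shapes.append((name, channels, h, w))
    return shapes


shapes = trace_shapes()
for name, c, h, w in shapes:
    print(f"{name:13s} -> {c} x {h} x {w}")
```

Under these assumptions the feature axis shrinks from 20 to 10 to 5 to 1, so each remaining time step carries 8 values (one per channel), which is what the LSTM would receive as its per-step input.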

5.2 Results on different levels

The performance of the model for different numbers of LOB levels is shown in Table 1. The results show that performance improves as the number of levels increases. Prediction based on 1 level performs worst, and using 3 or more levels improves performance significantly.

Table 1. Results for different levels.

Level(s)   Up_F1    Down_F1
1          46.57%   60.55%
2          49.37%   62.45%
3          62.85%   68.70%
4          65.62%   68.78%
5          66.30%   69.13%
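The Up_F1 and Down_F1 columns are read here as per-class F1 scores, with the upward and downward class in turn treated as the positive class. The sketch below shows that computation on made-up labels; the helper name `class_f1` and the toy data are hypothetical, and the paper does not state how its scores were computed in code.

```python
def class_f1(y_true, y_pred, positive):
    """Precision, recall and F1 with `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels: +1 = upward, -1 = downward.
y_true = [1, 1, 1, -1, -1, -1]
y_pred = [1, 1, -1, -1, -1, 1]
up_precision, up_recall, up_f1 = class_f1(y_true, y_pred, positive=1)
down_precision, down_recall, down_f1 = class_f1(y_true, y_pred, positive=-1)
```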

5.3 Forecasting and trading results

The forecasting results are shown in Table 2. Our model performs better than the standalone CNN and LSTM models. Although MLP also predicts well, its trading result is much worse than that of our model, as discussed below.

Table 2. Results for different models.

Model       Precision   Recall   Up_F1    Down_F1
MLP         68.49%      68.33%   66.71%   69.60%
BL          68.06%      67.41%   64.43%   68.99%
CNN         65.08%      66.93%   62.54%   68.10%
LSTM        68.24%      67.93%   66.30%   69.13%
Our model   69.01%      68.63%   66.60%   70.42%

We design our trading policy based on the predictions of the model. At each time step, the model output generates a signal: if the model predicts +1 (-1), we go long (short). Simply executing this policy can cause serious problems, however, because the predictions are noisy, which leads to frequent trading and high fees. We therefore extend the simple policy with a minimum holding time T and a threshold α for reversing the position. We do not consider new prediction signals until the holding time exceeds T, and we reverse the position only when the opposite signal has appeared continuously for longer than α. The new strategy (T = 5 min, α = 20 min) filters out a considerable amount of noise. The trading results under the new policy are shown in Figure 3. With a position size of one share, the proposed model yields significantly higher returns than the other models.
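The filtered policy can be sketched as a small state machine over the stream of predictions. This is one reading of the rules, not the authors' implementation: time is measured in signal steps rather than minutes, signals that arrive before the holding time exceeds T are discarded and do not count toward the run of opposite signals, and the function and parameter names (`run_policy`, `min_hold`, `reverse_after`) are hypothetical.

```python
def run_policy(signals, min_hold, reverse_after):
    """Turn per-step predictions (+1 / -1) into the position held at each step.

    min_hold:      steps a position is held before new signals count (T).
    reverse_after: run of opposite signals that must be exceeded to reverse (alpha).
    """
    positions = []
    position = 0       # 0 = flat, +1 = long, -1 = short
    held = 0           # steps since the current position was opened
    opposite_run = 0   # current run length of opposite signals
    for s in signals:
        if position == 0:
            position, held, opposite_run = s, 0, 0   # open the first position
        else:
            held += 1
            if held <= min_hold:
                opposite_run = 0                     # signals ignored while held <= T
            elif s == -position:
                opposite_run += 1
                if opposite_run > reverse_after:
                    position, held, opposite_run = s, 0, 0   # reverse
            else:
                opposite_run = 0                     # run broken by a same-side signal
        positions.append(position)
    return positions


# A persistent opposite signal eventually reverses the position...
persistent = run_policy([1, -1, -1, -1, -1, -1, -1, 1], min_hold=2, reverse_after=2)
# ...while a flickering signal does not.
flicker = run_policy([1, -1, 1, -1, 1, -1], min_hold=0, reverse_after=1)
```

If one signal were produced per 500 ms snapshot, T = 5 min and α = 20 min would correspond to 600 and 2,400 steps respectively; that mapping is an assumption about how the predictions are sampled.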

Figure 3. Trading results of different models.

6. CONCLUSION

In this paper, we proposed a hybrid deep learning model that combines CNN and LSTM to predict price movements in the China futures market from large-scale LOB data. The spatial feature extraction module captures both the spatial correlations between different types of features and short-term temporal dependencies, while the temporal feature extraction module extracts long-term dependencies. Combining the two modules improves the prediction performance of the model. Trading results demonstrate that our model performs significantly better than the models it was compared with. We also showed that the depth of the LOB matters for model performance: using all 5 levels, which carry the richest information, gives the best results.

Our future research will aim to reduce noisy predictions so that the prediction results can be used more fully to improve trading returns. From the perspective of the model and its input, predictions of the same class should be generated as continuously as possible; from the perspective of strategy, it is important to design a policy that filters out noise effectively.

REFERENCES

[1] Sezer, O. B., Gudelek, M. U. and Ozbayoglu, A. M., “Financial time series forecasting with deep learning: A systematic literature review: 2005-2019,” Applied Soft Computing, 90, 106181 (2020). https://doi.org/10.1016/j.asoc.2020.106181

[2] Adebiyi, A. A., Adewumi, A. O. and Ayo, C. K., “Stock price prediction using the ARIMA model,” in UKSim-AMSS 16th Inter. Conf. on Computer Modelling and Simulation, 106-112 (2014).

[3] Awartani, B. M. A. and Corradi, V., “Predicting the volatility of the S&P-500 stock index via GARCH models: The role of asymmetries,” International Journal of Forecasting, 167-183 (2005). https://doi.org/10.1016/j.ijforecast.2004.08.003

[4] Graves, A., Mohamed, A. R. and Hinton, G., “Speech recognition with deep recurrent neural networks,” in IEEE Inter. Conf. on Acoustics, Speech and Signal Processing, 6645-6649 (2013).

[5] Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., Zemel, R. and Bengio, Y., “Show, attend and tell: Neural image caption generation with visual attention,” in Inter. Conf. on Machine Learning, 2048-2057 (2015).

[6] Jiang, W., “Applications of deep learning in stock market prediction: Recent progress,” Expert Systems with Applications, 184, 115537 (2021). https://doi.org/10.1016/j.eswa.2021.115537

[7] Chen, K., Zhou, Y. and Dai, F., “A LSTM-based method for stock returns prediction: A case study of China stock market,” in 2015 IEEE Inter. Conf. on Big Data, 2823-2824 (2015).

[8] Nelson, D. M. Q., Pereira, A. C. M. and Oliveira, R. A. D., “Stock market’s price movement prediction with LSTM neural networks,” in 2017 International Joint Conference on Neural Networks, 1419-1426 (2017).

[9] Fischer, T. and Krauss, C., “Deep learning with long short-term memory networks for financial market predictions,” European Journal of Operational Research, 270 (2), 654-669 (2018). https://doi.org/10.1016/j.ejor.2017.11.054

[10] Qiu, J., Wang, B. and Zhou, C., “Forecasting stock prices with long-short term memory neural network based on attention mechanism,” PLoS ONE, 15 (1), e0227222 (2020). https://doi.org/10.1371/journal.pone.0227222

[11] Tsantekidis, A., Passalis, N., Tefas, A., Kanniainen, J., Gabbouj, M. and Iosifidis, A., “Forecasting stock prices from the limit order book using convolutional neural networks,” in 2017 IEEE 19th Conf. on Business Informatics, 7-12 (2017).

[12] Barra, S., Carta, S. M., Corriga, A., Podda, A. S. and Recupero, D. R., “Deep learning and time series-to-image encoding for financial forecasting,” IEEE/CAA Journal of Automatica Sinica, 683-692 (2020).

[13] Zhang, Z., Zohren, S. and Roberts, S., “DeepLOB: Deep convolutional neural networks for limit order books,” IEEE Transactions on Signal Processing, 67 (11), 3001-3012 (2019). https://doi.org/10.1109/TSP.78

[14] Gong, Y., Wu, J. M. W., Li, Z., Liu, S., Sun, L. Y. and Chen, C. M., “A CNN-based method for AAPL stock price trend prediction using historical data and technical indicators,” Advances in Intelligent Systems and Computing, 268, 25-33 (2022). https://doi.org/10.1007/978-981-16-8048-9

[15] Jaiswal, R. and Singh, B., “A hybrid convolutional recurrent (CNN-GRU) model for stock price prediction,” in 2022 IEEE 11th Inter. Conf. on Communication Systems and Network Technologies, 299-304 (2022).

[16] Lu, W., Li, J., Wang, J. and Qin, L., “A CNN-BiLSTM-AM method for stock price prediction,” Neural Computing and Applications, 33, 4741-4753 (2021). https://doi.org/10.1007/s00521-020-05532-z

[17] Sun, L., Xu, W. and Liu, J., “Two-channel attention mechanism fusion model of stock price prediction based on CNN-LSTM,” Transactions on Asian and Low-Resource Language Information Processing, 20 (5), 1-12 (2021). https://doi.org/10.1145/3453693

[18] Zhou, X., “Stock price prediction using combined LSTM-CNN model,” in 3rd International Conference on Machine Learning, Big Data and Business Intelligence, 67-71 (2021).

[19] Kercheval, A. N. and Zhang, Y., “Modelling high-frequency limit order book dynamics with support vector machines,” Quantitative Finance, 15 (8), 1315-1329 (2015).

[20] Tsantekidis, A., Passalis, N., Tefas, A., Kanniainen, J., Gabbouj, M. and Iosifidis, A., “Using deep learning to detect price change indications in financial markets,” in 2017 25th European Signal Processing Conf., 2511-2515 (2017).

[21] Zhang, Z., Zohren, S. and Roberts, S., “BDLOB: Bayesian deep convolutional neural networks for limit order books,” arXiv preprint arXiv:1811.10041 (2018).

[22] Tran, D. T., Iosifidis, A., Kanniainen, J. and Gabbouj, M., “Temporal attention-augmented bilinear network for financial time-series data analysis,” IEEE Transactions on Neural Networks and Learning Systems, 30 (5), 1407-1418 (2018).

[23] Tsantekidis, A., Passalis, N., Tefas, A., Kanniainen, J., Gabbouj, M. and Iosifidis, A., “Using deep learning for price prediction by exploiting stationary limit order book features,” Applied Soft Computing, 93 (2020). https://doi.org/10.1016/j.asoc.2020.106401

[24] Lv, X. and Zhang, L., “Feature fusion learning based on LSTM and CNN networks for trend analysis of limit order books,” in Inter. Conf. on Neural Information Processing, 125-137 (2021).

[25] Ntakaris, A., Magris, M., Kanniainen, J., Gabbouj, M. and Iosifidis, A., “Benchmark dataset for mid-price forecasting of limit order book data with machine learning methods,” Journal of Forecasting, 37 (4), 852-866 (2018). https://doi.org/10.1002/for.v37.8

[26] Tran, D. T., Magris, M., Kanniainen, J., Gabbouj, M. and Iosifidis, A., “Tensor representation in high-frequency financial data for price change prediction,” in 2017 IEEE Symp. Series on Computational Intelligence, 1-7 (2017).

[27] Parlour, C. A. and Seppi, D. J., “Limit order markets: A survey,” in Handbook of Financial Intermediation and Banking, 63-95 (2008).
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zelin Wang and Ya Li "Price movement prediction using deep learning: a case study of the China futures market", Proc. SPIE 12506, Third International Conference on Computer Science and Communication Technology (ICCSCT 2022), 125066Q (28 December 2022); https://doi.org/10.1117/12.2662524
KEYWORDS
Data modeling

Performance modeling

Feature extraction

Convolution

Artificial intelligence

Data processing

Mining
