Neural networks for algorithmic trading. Simple time series forecasting (2024)

Alex Honchar

Jun 18, 2016

I want to implement a trading system from scratch based only on deep learning approaches, so for every problem we face here (price prediction, trading strategy, risk management) we are going to use different variations of artificial neural networks (ANNs) and check how well they handle it.

I plan to work through the following sections:

Simple time series forecasting (and mistakes made)
Correct 1D time series forecasting + backtesting
Multivariate time series forecasting
Volatility forecasting and custom losses
Multitask and multimodal learning
Hyperparameter optimization
Enhancing classical strategies with neural nets
Probabilistic programming and Pyro forecasts

I highly recommend checking out the code and the IPython Notebook in this repository.

In this first part, I want to show how MLPs, CNNs and RNNs can be used for financial time series prediction. We are not going to use any feature engineering here. Let’s just consider the historical dataset of S&P 500 index price movements. We have the open, close, high and low prices and the volume of trades for every trading day from 1950 to 2016. First, we will try to predict the close price at the end of the next day; second, we will try to predict the return (close price - open price). Download the dataset from Yahoo Finance or from this repository.

We will treat our problem as 1) a regression problem (trying to forecast the exact close price or return for the next day) and 2) a binary classification problem (the price will go up [1; 0] or down [0; 1]).

For training the NNs we are going to use the Keras framework.

First, let’s prepare our data for training. We want to predict the value at t+1 based on information from the N previous days. For example, given the close prices for the past 30 days, we want to predict what the price will be tomorrow, on the 31st day.

We use the first 90% of the time series as the training set (consider it historical data) and the last 10% as the test set for model evaluation.

Here is an example of loading the raw input data, splitting it into training samples and preprocessing it:
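The original listing is not reproduced here, so what follows is only a minimal sketch of the windowing and the chronological 90/10 split described above. The names make_windows and chronological_split are hypothetical (they are not the repository's functions), and a synthetic random walk stands in for the S&P 500 close prices.

```python
import numpy as np

def make_windows(series, lag=20, horizon=1):
    """Build (X, y) pairs: `lag` consecutive values -> the value `horizon` steps later."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for start in range(len(series) - lag - horizon + 1):
        X.append(series[start:start + lag])           # the N previous days
        y.append(series[start + lag + horizon - 1])   # the day we want to predict
    return np.array(X), np.array(y)

def chronological_split(X, y, train_fraction=0.9):
    """Keep time order: first part for training, last part for testing (no shuffling)."""
    cut = int(len(X) * train_fraction)
    return X[:cut], y[:cut], X[cut:], y[cut:]

# Synthetic stand-in for the close prices (assumption: the real data comes from a CSV file).
rng = np.random.default_rng(0)
close = 100 + np.cumsum(rng.normal(size=1000))

X, y = make_windows(close, lag=20)
X_train, y_train, X_test, y_test = chronological_split(X, y)
print(X_train.shape, X_test.shape)
```

Splitting by position rather than at random keeps the test set strictly in the "future" of the training set, which is what the 90%/10% split above asks for.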

It will be just a perceptron with two hidden layers. The number of hidden neurons is chosen empirically; we will work on hyperparameter optimization in later sections. Between the two hidden layers we add one Dropout layer to prevent overfitting.

The important parts are Dense(1), Activation('linear') and 'mse' in the compile section. We want one output that can be in any range (we predict a real value), and our loss function is mean squared error.
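A sketch of such a network in current Keras (via tensorflow.keras) could look like the following. The layer sizes, the dropout rate, the ReLU activations and the Adam optimizer are my assumptions for illustration; the text above only fixes the two hidden layers, the Dropout between them, Dense(1), the linear activation and the MSE loss.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

LAG = 20  # length of the input window (20 days of close prices)

def build_mlp(lag=LAG, hidden_1=64, hidden_2=32, dropout_rate=0.5):
    """Two hidden layers with Dropout in between and one linear output.

    hidden_1, hidden_2 and dropout_rate are illustrative values, not the article's.
    """
    model = keras.Sequential([
        keras.Input(shape=(lag,)),
        layers.Dense(hidden_1, activation="relu"),
        layers.Dropout(dropout_rate),
        layers.Dense(hidden_2, activation="relu"),
        layers.Dense(1),              # one real-valued output
        layers.Activation("linear"),  # no squashing: the output can be in any range
    ])
    model.compile(optimizer="adam", loss="mse")  # mean squared error for regression
    return model

mlp = build_mlp()
mlp.summary()
```

Training would then be a call such as mlp.fit(X_train, y_train, validation_split=0.1), with X_train shaped (samples, 20).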

Let’s see what happens if we just pass chunks of 20 days of close prices and predict the price on the 21st day. The final MSE = 46.3635263557, but on its own this number is not very informative. Below is a plot of predictions for the first 150 points of the test dataset. The black line is the actual data, the blue one is the prediction. We can clearly see that our algorithm is not even close in value, but it can learn the trend.

Let’s scale our data using sklearn’s preprocessing.scale() method so that our time series has zero mean and unit variance, and train the same MLP. Now we have MSE = 0.0040424330518 (but it is on scaled data). On the plot below you can see the actual scaled time series (black) and our forecast (blue) for it:

To use this model in the real world we have to return to the unscaled time series. We can do this by multiplying our prediction by the standard deviation of the time series we used to make the prediction (the 20 unscaled time steps) and adding its mean value:
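As a minimal sketch of this step (the helper names scale_window and restore are hypothetical, and plain NumPy is used instead of sklearn so the arithmetic is visible):

```python
import numpy as np

def scale_window(window):
    """Scale one window to zero mean and unit variance; also return its mean and std.

    Uses the population standard deviation (ddof=0), the same convention as
    sklearn.preprocessing.scale. A constant window (std == 0) is not handled here.
    """
    window = np.asarray(window, dtype=float)
    mean, std = window.mean(), window.std()
    return (window - mean) / std, mean, std

def restore(scaled_value, mean, std):
    """Undo the scaling: multiply by the window's std and add its mean."""
    return scaled_value * std + mean

# Toy window standing in for 20 unscaled close prices.
window = np.array([100.0, 102.0, 101.0, 105.0, 107.0])
scaled, mean, std = scale_window(window)

scaled_prediction = 0.8  # pretend this came out of the network
price_prediction = restore(scaled_prediction, mean, std)
print(price_prediction)
```

The mean and standard deviation have to come from the same window that was fed to the network, otherwise the restored value ends up on the wrong price level.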

The MSE in this case equals 937.963649937. Here is the plot of the restored predictions (red) and the real data (green):

Not bad, is it? But let’s try more sophisticated algorithms for this problem!

I am not going to dive into the theory of convolutional neural networks; you can check out these amazing resources:

cs231n.github.io: the Stanford course on CNNs for computer vision
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/: CNNs for text classification, which can be useful for understanding how they work on 1D data

Let’s define a 2-layer convolutional neural network (a combination of convolution and max-pooling layers) with one fully-connected layer and the same output as before:
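One way such a network could be written with tensorflow.keras is sketched below. The number of filters, the kernel size, the pooling size and the width of the fully-connected layer are assumptions for illustration, not the values used for the results in this post.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

LAG = 20  # length of the input window

def build_cnn(lag=LAG, filters=32, kernel_size=3, dense_units=64):
    """Two Conv1D + MaxPooling1D stages, one dense layer, one linear output.

    filters, kernel_size and dense_units are illustrative values.
    """
    model = keras.Sequential([
        keras.Input(shape=(lag, 1)),  # (time steps, channels): one price per day
        layers.Conv1D(filters, kernel_size, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(filters, kernel_size, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Flatten(),
        layers.Dense(dense_units, activation="relu"),
        layers.Dense(1, activation="linear"),  # same regression output as the MLP
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

cnn = build_cnn()
cnn.summary()
```

Note that Conv1D expects a trailing channel axis, so windows shaped (samples, 20) have to be reshaped to (samples, 20, 1), for example with X[..., np.newaxis].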

Let’s check out the results. The MSEs for scaled and restored data are 0.227074542433 and 935.520550172. Plots are below:

Even judging by the MSE on scaled data, this network learned much worse. Most probably the deeper architecture needs more data for training, or it simply overfitted because of too many filters or layers. We will consider this issue later.

For the recurrent architecture I want to use two stacked LSTM layers (read more about LSTMs here).
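A minimal sketch of two stacked LSTM layers in tensorflow.keras follows. The number of units per layer is an assumption; the key detail is return_sequences=True on the first layer, so that the second LSTM receives the whole sequence rather than only the last state.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

LAG = 20  # length of the input window

def build_lstm(lag=LAG, units=32):
    """Two stacked LSTM layers followed by one linear output (units is illustrative)."""
    model = keras.Sequential([
        keras.Input(shape=(lag, 1)),              # (time steps, features)
        layers.LSTM(units, return_sequences=True),  # pass every time step to the next LSTM
        layers.LSTM(units),                         # keep only the final state
        layers.Dense(1, activation="linear"),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

rnn = build_lstm()
rnn.summary()
```

As with the convolutional model, the input windows need the shape (samples, 20, 1).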

Plots of the forecasts are below; the MSEs for scaled and restored data are 0.0246238639582 and 939.948636707.

The RNN forecast looks more like a moving average model; it can’t learn and predict all the fluctuations.

So, it’s a somewhat unexpected result, but we can see that MLPs work better for this time series forecasting task. Let’s check what happens if we switch from a regression to a classification problem. Now we will use not close prices but the daily return (close price - open price), and we want to predict whether the close price will be higher or lower than the open price based on the last 20 days of returns.
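The targets for this classification setup can be sketched as follows. The function name direction_labels is hypothetical, the encoding follows the convention stated earlier (up is [1; 0], down is [0; 1]), and counting a zero return as "down" is my assumption.

```python
import numpy as np

def direction_labels(open_prices, close_prices):
    """One-hot labels from daily returns: [1, 0] if close > open, else [0, 1].

    Days with a zero return are counted as "down" here (an assumption).
    """
    returns = np.asarray(close_prices, dtype=float) - np.asarray(open_prices, dtype=float)
    up = returns > 0
    return np.column_stack([up, ~up]).astype(float)

# Toy prices: an up day, a down day and a flat day.
open_prices = [10.0, 10.0, 10.0]
close_prices = [11.0, 9.0, 10.0]
labels = direction_labels(open_prices, close_prices)
print(labels)
```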

The code changes just a bit: we change our last Dense layer to output [0; 1] or [1; 0] and add a softmax activation to get a probabilistic output.

To load binary outputs, change the following line in the code:

split_into_chunks(timeseries, TRAIN_SIZE, TARGET_TIME, LAG_SIZE, binary=False, scale=True)

to:

split_into_chunks(timeseries, TRAIN_SIZE, TARGET_TIME, LAG_SIZE, binary=True, scale=True)

We also change the loss function to binary cross-entropy and add the accuracy metric.
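Put together, the classification variant of the MLP could be sketched like this in tensorflow.keras. The layer sizes, dropout rate and optimizer are again assumptions. The loss follows the text (binary cross-entropy); with two one-hot softmax outputs, categorical cross-entropy is the more common pairing and would be a reasonable alternative.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

LAG = 20  # number of past daily returns fed to the network

def build_mlp_classifier(lag=LAG, hidden_1=64, hidden_2=32, dropout_rate=0.5):
    """Same MLP body as for regression, but with a 2-way softmax output.

    hidden_1, hidden_2 and dropout_rate are illustrative values.
    """
    model = keras.Sequential([
        keras.Input(shape=(lag,)),
        layers.Dense(hidden_1, activation="relu"),
        layers.Dropout(dropout_rate),
        layers.Dense(hidden_2, activation="relu"),
        layers.Dense(2, activation="softmax"),  # probabilities for [up, down]
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

clf = build_mlp_classifier()
probabilities = np.asarray(clf(np.zeros((3, LAG), dtype="float32")))
print(probabilities.sum(axis=1))  # each row of softmax outputs sums to 1
```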

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 2s - loss: 0.1960 - acc: 0.6461 - val_loss: 0.2042 - val_acc: 0.5992
Epoch 2/5
13513/13513 [==============================] - 2s - loss: 0.1944 - acc: 0.6547 - val_loss: 0.2049 - val_acc: 0.5965
Epoch 3/5
13513/13513 [==============================] - 1s - loss: 0.1924 - acc: 0.6656 - val_loss: 0.2064 - val_acc: 0.6019
Epoch 4/5
13513/13513 [==============================] - 1s - loss: 0.1897 - acc: 0.6738 - val_loss: 0.2051 - val_acc: 0.6039
Epoch 5/5
13513/13513 [==============================] - 1s - loss: 0.1881 - acc: 0.6808 - val_loss: 0.2072 - val_acc: 0.6052
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.25924376667510113, 0.50209706411917387]

Oh, that’s no better than random guessing (50% accuracy); let’s try something better. Check out the results below.

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 3s - loss: 0.2102 - acc: 0.6042 - val_loss: 0.2002 - val_acc: 0.5979
Epoch 2/5
13513/13513 [==============================] - 3s - loss: 0.2006 - acc: 0.6089 - val_loss: 0.2022 - val_acc: 0.5965
Epoch 3/5
13513/13513 [==============================] - 4s - loss: 0.1999 - acc: 0.6186 - val_loss: 0.2006 - val_acc: 0.5979
Epoch 4/5
13513/13513 [==============================] - 3s - loss: 0.1999 - acc: 0.6176 - val_loss: 0.1999 - val_acc: 0.5932
Epoch 5/5
13513/13513 [==============================] - 4s - loss: 0.1998 - acc: 0.6173 - val_loss: 0.2015 - val_acc: 0.5999
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.24841217570779137, 0.54463750750737105]

Train on 13513 samples, validate on 1502 samples
Epoch 1/5
13513/13513 [==============================] - 18s - loss: 0.2130 - acc: 0.5988 - val_loss: 0.2021 - val_acc: 0.5992
Epoch 2/5
13513/13513 [==============================] - 18s - loss: 0.2004 - acc: 0.6142 - val_loss: 0.2010 - val_acc: 0.5959
Epoch 3/5
13513/13513 [==============================] - 21s - loss: 0.1998 - acc: 0.6183 - val_loss: 0.2013 - val_acc: 0.5959
Epoch 4/5
13513/13513 [==============================] - 17s - loss: 0.1995 - acc: 0.6221 - val_loss: 0.2012 - val_acc: 0.5965
Epoch 5/5
13513/13513 [==============================] - 18s - loss: 0.1996 - acc: 0.6160 - val_loss: 0.2017 - val_acc: 0.5965
1669/1669 [==============================] - 0s 
Test loss and accuracy: [0.24823409688551315, 0.54523666868172693]

We can see that treating financial time series prediction as a regression problem is the better approach: the model can learn the trend and predict prices close to the actual ones.

What surprised me is that MLPs handle sequence data better than CNNs or RNNs, which are supposed to work better with time series. I attribute this to the fairly small dataset (~16k time steps) and the naive choice of hyperparameters.

You can reproduce these results, and improve on them, using the code from the repository.

I think we can get better results in both regression and classification using different features (not only the scaled time series), such as technical indicators or trading volume. We can also try more frequent data, say minute-by-minute ticks, to have more training data. I’m going to do all of this later, so stay tuned :)

P.S.
Follow me also on Facebook for AI articles that are too short for Medium, on Instagram for personal stuff, and on LinkedIn!
