Pybrain time series prediction using LSTM recurrent nets

I think a better (simpler/clearer) example to learn from would be here, towards the bottom of the page:

http://pybrain.org/docs/tutorial/netmodcon.html

Essentially, once set up as shown, it will automatically keep track of the inputs' past history (until and unless you hit reset). From the docs:

http://pybrain.org/docs/api/structure/networks.html?highlight=recurrentnetwork#pybrain.structure.networks.RecurrentNetwork

"Until .reset() is called, the network keeps track of all previous inputs and thus allows the use of recurrent connections and layers that look back in time."

So yes, no need to re-present all the past inputs to the network each time.
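
To make the "keeps track of previous inputs until .reset()" idea concrete, here is a minimal sketch in plain Python. Note that `ToyRecurrentUnit` is a hypothetical stand-in I made up for illustration, not a PyBrain class; it only mimics the `activate()` / `reset()` behaviour described in the docs quote above.

```python
from __future__ import print_function


class ToyRecurrentUnit(object):
    """Hypothetical toy unit (not PyBrain): output depends on past inputs."""

    def __init__(self, decay=0.5):
        self.decay = decay
        self.state = 0.0  # internal memory carried between calls

    def activate(self, x):
        # Mix the remembered state with the current input.
        self.state = self.decay * self.state + x
        return self.state

    def reset(self):
        # Forget all history, like calling .reset() on a recurrent net.
        self.state = 0.0


unit = ToyRecurrentUnit()
first = unit.activate(1.0)        # no history yet
second = unit.activate(1.0)       # same input, but history changes the output
unit.reset()
after_reset = unit.activate(1.0)  # history cleared, same as the first call
print(first, second, after_reset)
```

The same input gives a different output on the second call because the unit remembers the first one; after `reset()` it behaves like a fresh unit again. That is why you only feed the current sample at each step.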


I have tested LSTM prediction of some time sequences with Theano. I found that smooth curves can be predicted properly, but zigzag curves are hard to predict. The details are in this article: Predict Time Sequence with LSTM

The predicted result is shown below:
[Figure: predicted result (source: fuzihao.org)]


You can train an LSTM network with a single input node and a single output node for doing time series prediction like this:

First, just as a good practice, let's use Python3's print function:

from __future__ import print_function

Then, make a simple time series:

data = [1] * 3 + [2] * 3
data *= 3
print(data)

[1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2]

Now put this time series into a supervised dataset, where the target for each sample is the next sample:

from pybrain.datasets import SequentialDataSet
from itertools import cycle

ds = SequentialDataSet(1, 1)
for sample, next_sample in zip(data, cycle(data[1:])):
    ds.addSample(sample, next_sample)
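
If it helps to see what the `zip(data, cycle(data[1:]))` pairing actually produces, here is a small sketch that builds the same (sample, target) pairs as a plain list, without PyBrain. The name `pairs` is just an illustrative variable, not part of the code above.

```python
from __future__ import print_function
from itertools import cycle

data = [1] * 3 + [2] * 3
data *= 3

# Pair every sample with the one that follows it. cycle() lets the
# shorter sequence data[1:] wrap around so the last sample also gets a target.
pairs = list(zip(data, cycle(data[1:])))

print(pairs[:4])   # first few (sample, target) pairs
print(pairs[-1])   # the wrapped-around pair for the last sample
print(len(pairs))
```

One thing to be aware of: the wrapped-around target for the last sample is `data[1]`, not `data[0]`. In this toy series both are 1, so it makes no difference, but for other data you may want to handle the last sample differently.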

Build a simple LSTM network with 1 input node, 5 LSTM cells and 1 output node:

from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import LSTMLayer

net = buildNetwork(1, 5, 1, 
                   hiddenclass=LSTMLayer, outputbias=False, recurrent=True)

Train the network:

from pybrain.supervised import RPropMinusTrainer
from sys import stdout

trainer = RPropMinusTrainer(net, dataset=ds)
train_errors = [] # save errors for plotting later
EPOCHS_PER_CYCLE = 5
CYCLES = 100
EPOCHS = EPOCHS_PER_CYCLE * CYCLES
for i in range(CYCLES):  # range works on both Python 2 and 3; xrange is Python 2 only
    trainer.trainEpochs(EPOCHS_PER_CYCLE)
    train_errors.append(trainer.testOnData())
    epoch = (i+1) * EPOCHS_PER_CYCLE
    print("\r epoch {}/{}".format(epoch, EPOCHS), end="")
    stdout.flush()

print()
print("final error =", train_errors[-1])

Plot the errors (note that in this simple toy example, we are testing and training on the same dataset, which is of course not what you'd do for a real project!):

import matplotlib.pyplot as plt

plt.plot(range(0, EPOCHS, EPOCHS_PER_CYCLE), train_errors)
plt.xlabel('epoch')
plt.ylabel('error')
plt.show()
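
For a real project you would hold out part of the series for testing, as noted above. A minimal sketch of a chronological split (keeping time order, so the test part comes strictly after the training part) could look like this. `chronological_split` and `n_test` are hypothetical names for illustration, not PyBrain functions.

```python
from __future__ import print_function

data = [1] * 3 + [2] * 3
data *= 3


def chronological_split(series, n_test):
    """Split a series into (train, test) without shuffling.

    The last n_test samples become the test part, so the model is
    always evaluated on data that comes after what it was trained on.
    """
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


train, test = chronological_split(data, n_test=6)
print("train:", train)
print("test: ", test)
```

Each part would then go into its own SequentialDataSet, built the same way as `ds` above, with training done on the first and the error measured on the second.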

Now ask the network to predict the next sample:

for sample, target in ds.getSequenceIterator(0):
    print("               sample = %4.1f" % sample)
    print("predicted next sample = %4.1f" % net.activate(sample))
    print("   actual next sample = %4.1f" % target)
    print()

(The code above is based on the example_rnn.py and the examples from the PyBrain documentation)