


I've built a multilayer LSTM model that uses regression to predict the next frame's values of the data. Training finishes after 20 epochs. I then generate some predictions and compare them to my ground-truth values. As you can see in the picture above, the predictions converge to a constant value. I don't know why this happens. Here is my model so far:

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers import LSTM, BatchNormalization
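# Import the initializer from the same keras package as the layers,
# rather than mixing keras with tensorflow.python.keras
from keras.initializers import RandomUniform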

init = RandomUniform(minval=-0.05, maxval= 0.05)

model = Sequential()

model.add(LSTM(units=800, activation='relu', dropout=0.5, recurrent_dropout=0.2,
               input_shape=(x_train.shape[1], x_train.shape[2]),
               return_sequences=True, kernel_initializer=init))

model.add(LSTM(units=500, activation='relu', dropout=0.5, recurrent_dropout=0.2,
               return_sequences=False, kernel_initializer=init))

model.add(Dense(1024, activation='linear', kernel_initializer=init))
model.add(Dense(1, activation='linear', kernel_initializer= 'normal'))

model.compile(loss='mean_squared_error', optimizer='rmsprop' )


I decreased the number of epochs from 20 to 3. The results are as follows:


Comparing the two pictures, I conclude that as the number of epochs increases, the predictions become more likely to converge to a specific value, which is around -0.1.
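One way to put a number on this kind of collapse, rather than judging it from the plots, is to compare the model's error with the error of always predicting the mean of the targets, and to compare the spread of the predictions with the spread of the ground truth. The following is only a sketch: `collapse_report` is a hypothetical helper, and the arrays are synthetic stand-ins, not the data from this post.

```python
import numpy as np

def collapse_report(y_true, y_pred):
    """Compare model error with a constant (mean) baseline and measure prediction spread."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()

    # Mean squared error of the model's predictions
    model_mse = float(np.mean((y_true - y_pred) ** 2))
    # Mean squared error of always predicting the mean of the targets
    baseline_mse = float(np.mean((y_true - y_true.mean()) ** 2))

    # Ratio of standard deviations: a value near 0 means the predictions are almost flat
    true_std = float(y_true.std())
    spread_ratio = float(y_pred.std() / true_std) if true_std > 0 else float("nan")

    return {"model_mse": model_mse,
            "baseline_mse": baseline_mse,
            "spread_ratio": spread_ratio}

# Synthetic example (assumption): an oscillating target and near-constant predictions around -0.1
rng = np.random.default_rng(0)
y_true = np.sin(np.linspace(0, 20, 500))
flat_pred = np.full_like(y_true, -0.1) + rng.normal(0, 1e-3, size=y_true.shape)

report = collapse_report(y_true, flat_pred)
print(report)
```

If `spread_ratio` is close to zero and `model_mse` is no better than `baseline_mse`, the network is effectively outputting a constant, which would match what the plots show.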


So, after trying different numbers of LSTM units and different types of architectures, I realized that with the current number of LSTM units the model learns very slowly, and 20 epochs were not sufficient for such a huge model. I changed the number of LSTM units in each layer to 64, removed the Dense(1024) layer, and increased the number of epochs from 20 to 400. The results were incredibly close to the ground-truth values. I should mention that the dataset used for the new model was different from the former one, because I encountered some problems with that dataset. Here is the new model:

from keras.optimizers import RMSprop
from keras.initializers import glorot_uniform, glorot_normal, RandomUniform

init = glorot_normal(seed=None)
init1 = RandomUniform(minval=-0.05, maxval=0.05)
optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)

model = Sequential()

model.add(LSTM(units=64, dropout=0.2, recurrent_dropout=0.2,
               input_shape=(x_train.shape[1], x_train.shape[2]),
               return_sequences=True, kernel_initializer=init))

model.add(LSTM(units=64, dropout=0.2, recurrent_dropout=0.2,
               return_sequences=False, kernel_initializer=init))

model.add(Dense(1, activation='linear', kernel_initializer= init1))
model.compile(loss='mean_squared_error', optimizer=optimizer )
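To give a rough sense of how much smaller this model is than the first one, the trainable parameters can be counted by hand. A standard LSTM layer with biases has 4 * (units * (input_dim + units) + units) weights, and a Dense layer has input_dim * units + units. The sketch below assumes 10 input features per time step; `n_features` is a made-up value, since `x_train.shape[2]` is not given in the post, and the helper names are hypothetical.

```python
def lstm_params(units, input_dim):
    """Trainable weights of a standard LSTM layer with biases (4 gates)."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """Trainable weights of a Dense layer with biases."""
    return input_dim * units + units

# Assumption: number of features per time step (x_train.shape[2] is unknown here)
n_features = 10

# First model: LSTM(800) -> LSTM(500) -> Dense(1024) -> Dense(1)
big_total = (lstm_params(800, n_features)
             + lstm_params(500, 800)
             + dense_params(1024, 500)
             + dense_params(1, 1024))

# Second model: LSTM(64) -> LSTM(64) -> Dense(1)
small_total = (lstm_params(64, n_features)
               + lstm_params(64, 64)
               + dense_params(1, 64))

print("first model :", big_total)
print("second model:", small_total)
print("ratio       :", round(big_total / small_total, 1))
```

With a real model, `model.summary()` reports the same counts directly, so the hand calculation is only useful for comparing architectures before building them.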


You can see the predictions here:


It's still not the best model, but at least it outperforms the former one. If you have any further recommendations on how to improve it, they would be greatly appreciated.


09-15 03:08