Problem description
I'm trying to use the following ConvLSTM2D architecture to estimate high-resolution image sequences from low-resolution ones:
import numpy as np, scipy.ndimage, matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, ConvLSTM2D, MaxPooling2D, UpSampling2D
from sklearn.metrics import accuracy_score, confusion_matrix, cohen_kappa_score
from sklearn.preprocessing import MinMaxScaler, StandardScaler
np.random.seed(123)
raw = np.arange(96).reshape(8,3,4)
data1 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=1, mode='nearest') #low res
print (data1.shape)
#(8, 300, 400)
data2 = scipy.ndimage.zoom(raw, zoom=(1,100,100), order=3, mode='nearest') #high res
print (data2.shape)
#(8, 300, 400)
X_train = data1.reshape(data1.shape[0], 1, data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(data2.shape[0], 1, data2.shape[1], data2.shape[2], 1)
#(samples,time, rows, cols, channels)
model = Sequential()
input_shape = (data1.shape[0], data1.shape[1], data1.shape[2], 1)
#samples, time, rows, cols, channels
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
print (model.summary())
model.compile(loss='mean_squared_error',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, Y_train,
          batch_size=1, epochs=10, verbose=1)
x, y = model.evaluate(X_train, Y_train, verbose=0)
print(x, y)
This declaration raises a ValueError.
How can I correct this ValueError? I think the problem is with the input shapes, but I could not figure out what exactly is wrong.
Note that the output should also be a sequence of images, not a classification result.
This happens because a ConvLSTM2D layer expects a sequence as input, but your first layer was declared as a many-to-one model (return_sequences defaults to False), so it outputs a tensor of shape (batch_size, 300, 400, 16). That is, a batch of single images rather than sequences:
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
You want the output to be a tensor of shape (batch_size, 8, 300, 400, 16) (i.e. sequences of images), so that it can be consumed by the second LSTM. The way to fix this is to add return_sequences=True to the first LSTM definition:
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
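The shape bookkeeping behind this fix can be sketched without Keras. The helper below, convlstm2d_output_shape, is a hypothetical name introduced only for illustration. It assumes channels-last data, padding='same' and stride 1, and its error message is made up rather than the one Keras prints.

```python
def convlstm2d_output_shape(input_shape, filters, return_sequences):
    """Hypothetical helper (not a Keras API): output shape of a ConvLSTM2D layer,
    assuming channels-last data, padding='same' and stride 1.

    input_shape is (batch, time, rows, cols, channels).
    """
    if len(input_shape) != 5:
        # Illustrative message only; Keras words its own error differently.
        raise ValueError(f"expected 5 dimensions, got {len(input_shape)}: {input_shape}")
    batch, time, rows, cols, _ = input_shape
    if return_sequences:
        return (batch, time, rows, cols, filters)  # many-to-many: keeps the time axis
    return (batch, rows, cols, filters)            # many-to-one: drops the time axis


inp = (None, 8, 300, 400, 1)

# First layer as originally declared: the time axis disappears.
many_to_one = convlstm2d_output_shape(inp, 16, return_sequences=False)
print(many_to_one)  # (None, 300, 400, 16)

# A second ConvLSTM2D cannot consume a 4-D tensor.
try:
    convlstm2d_output_shape(many_to_one, 8, return_sequences=False)
except ValueError as err:
    print("second layer rejects it:", err)

# With return_sequences=True the first layer hands over a 5-D tensor.
seq = convlstm2d_output_shape(inp, 16, return_sequences=True)
print(seq)  # (None, 8, 300, 400, 16)
print(convlstm2d_output_shape(seq, 8, return_sequences=False))  # (None, 300, 400, 8)
```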
You mentioned classification. If what you intend is to classify entire sequences, then you need a classifier at the end:
from keras.layers import GlobalAveragePooling2D  # not imported in the question's code
model.add(ConvLSTM2D(16, kernel_size=(3,3), activation='sigmoid',padding='same',input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(8, kernel_size=(3,3), activation='sigmoid',padding='same'))
model.add(GlobalAveragePooling2D())
model.add(Dense(10, activation='softmax')) # output shape: (None, 10)
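As a rough illustration of why the pooling layer sits between the last ConvLSTM2D and the Dense layer, here is a NumPy stand-in for channels-last global average pooling. The array sizes are shrunk and arbitrary, chosen only to keep the example small.

```python
import numpy as np

# Stand-in for the last ConvLSTM2D output: (batch, rows, cols, channels).
feature_maps = np.arange(2 * 3 * 4 * 8, dtype=float).reshape(2, 3, 4, 8)

# Global average pooling collapses the two spatial axes, leaving one value
# per channel, which is the 2-D (batch, features) layout a dense layer expects.
pooled = feature_maps.mean(axis=(1, 2))
print(pooled.shape)  # (2, 8)
```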
But if you are trying to classify each image within the sequences, then you can simply reapply the classifier using TimeDistributed:
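from keras.layers import Input, GlobalAveragePooling2D, TimeDistributed
from keras.models import Model

# Per-image classifier; it is applied to every frame of the sequence below.
x = Input(shape=(300, 400, 8))
y = GlobalAveragePooling2D()(x)
y = Dense(10, activation='softmax')(y)
classifier = Model(inputs=x, outputs=y)

# Sequence model: both ConvLSTM2D layers keep the time axis.
x = Input(shape=(data1.shape[0], data1.shape[1], data1.shape[2], 1))
y = ConvLSTM2D(16, kernel_size=(3, 3),
               activation='sigmoid',
               padding='same',
               return_sequences=True)(x)
y = ConvLSTM2D(8, kernel_size=(3, 3),
               activation='sigmoid',
               padding='same',
               return_sequences=True)(y)
y = TimeDistributed(classifier)(y) # output shape: (None, 8, 10)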
model = Model(inputs=x, outputs=y)
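Conceptually, TimeDistributed applies the same model to every time step. The NumPy sketch below mimics that by folding the time axis into the batch axis. The function names, random weights and shrunken image size are illustrative assumptions, not Keras internals.

```python
import numpy as np

rng = np.random.default_rng(0)


def classifier(images, weights, bias):
    """Stand-in per-image classifier: global average pooling, then softmax.

    images: (n, rows, cols, channels) -> probabilities of shape (n, n_classes).
    """
    pooled = images.mean(axis=(1, 2))
    logits = pooled @ weights + bias
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def time_distributed(fn, sequences, *args):
    """Apply fn to every time step by merging the batch and time axes."""
    batch, time = sequences.shape[:2]
    flat = sequences.reshape((batch * time,) + sequences.shape[2:])
    out = fn(flat, *args)
    return out.reshape((batch, time) + out.shape[1:])


# (batch, time, rows, cols, channels); spatial size shrunk from 300x400.
sequences = rng.normal(size=(2, 8, 6, 5, 8))
weights = rng.normal(size=(8, 10))
bias = np.zeros(10)

probs = time_distributed(classifier, sequences, weights, bias)
print(probs.shape)  # (2, 8, 10)
```

Each of the 8 frames gets its own 10-way probability vector, which matches the (None, 8, 10) output shape noted in the code above.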
Finally, take a look at the examples in the Keras repository. There's one for a generative model using ConvLSTM2D.
Edit: to estimate data2 from data1...
If I got it right this time, X_train should be 1 sample of a stack of 8 images of shape (300, 400, 1), not 8 samples of a stack of 1 image of shape (300, 400, 1).
If that's true, then:
X_train = data1.reshape(data1.shape[0], 1, data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(data2.shape[0], 1, data2.shape[1], data2.shape[2], 1)
should be updated to:
X_train = data1.reshape(1, data1.shape[0], data1.shape[1], data1.shape[2], 1)
Y_train = data2.reshape(1, data2.shape[0], data2.shape[1], data2.shape[2], 1)
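To make the difference between the two layouts concrete, here is a small NumPy check. It uses the 8x3x4 raw array as a stand-in for the zoomed data, which is an assumption made only to keep the arrays small.

```python
import numpy as np

data = np.arange(96).reshape(8, 3, 4)  # 8 frames of 3x4, stand-in for data1

# Original reshape: 8 samples, each a sequence of length 1.
eight_samples = data.reshape(data.shape[0], 1, data.shape[1], data.shape[2], 1)

# Updated reshape: 1 sample that is a sequence of 8 frames.
one_sample = data.reshape(1, data.shape[0], data.shape[1], data.shape[2], 1)

# Equivalent spelling that adds the batch and channel axes explicitly.
one_sample_alt = data[np.newaxis, ..., np.newaxis]

print(eight_samples.shape)  # (8, 1, 3, 4, 1)
print(one_sample.shape)     # (1, 8, 3, 4, 1)

# Frame t sits at [0, t] in the new layout and at [t, 0] in the old one.
assert np.array_equal(one_sample[0, 5], eight_samples[5, 0])
assert np.array_equal(one_sample, one_sample_alt)
```

Only the second layout lets the recurrent layer see the 8 frames as consecutive time steps of one sequence.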
Also, accuracy doesn't usually make sense when your loss is mse, because it is a classification metric and this is a regression problem. You can use other metrics such as mae.
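A small NumPy comparison shows the point. The numbers are made up, and the exact-match score is only a stand-in for a classification-style metric, not necessarily how Keras computes accuracy.

```python
import numpy as np

y_true = np.array([0.0, 10.0, 20.0, 30.0])
y_pred = np.array([0.5, 9.0, 21.0, 30.0001])

mse = np.mean((y_true - y_pred) ** 2)    # what the model is trained on
mae = np.mean(np.abs(y_true - y_pred))   # average error in the data's own units
exact_match = np.mean(y_true == y_pred)  # classification-style score

print(mse, mae, exact_match)
```

The predictions are close to the targets, which mae reflects, yet the exact-match score is zero because continuous values almost never coincide exactly.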
Now you just need to update your model to return sequences and to have a single unit in the last layer (because the images you are trying to estimate have a single channel):
model = Sequential()
input_shape = (data1.shape[0], data1.shape[1], data1.shape[2], 1)
model.add(ConvLSTM2D(16, kernel_size=(3, 3), activation='sigmoid', padding='same',
                     input_shape=input_shape,
                     return_sequences=True))
model.add(ConvLSTM2D(1, kernel_size=(3, 3), activation='sigmoid', padding='same',
                     return_sequences=True))
model.compile(loss='mse', optimizer='adam')
After that, model.fit(X_train, Y_train, ...) will start training normally:
Using TensorFlow backend.
(8, 300, 400)
(8, 300, 400)
Epoch 1/10
1/1 [==============================] - 5s 5s/step - loss: 2993.8701
Epoch 2/10
1/1 [==============================] - 5s 5s/step - loss: 2992.4492
Epoch 3/10
1/1 [==============================] - 5s 5s/step - loss: 2991.4536
Epoch 4/10
1/1 [==============================] - 5s 5s/step - loss: 2989.8523
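One observation on the log above: the loss stays near 2990 and barely moves. A likely reason is that activation='sigmoid' keeps the layer output between 0 and 1, while the targets span roughly 0 to 95. The sketch below estimates the best MSE a bounded output could reach and shows min-max scaling as one possible remedy. It uses the raw 8x3x4 array as a stand-in for data2, and the unscale helper is a hypothetical name.

```python
import numpy as np

targets = np.arange(96, dtype=float).reshape(8, 3, 4)  # stand-in for data2, values 0..95

# Best case for an output confined to [0, 1]: predict clip(target, 0, 1).
best_bounded = np.clip(targets, 0.0, 1.0)
mse_floor = np.mean((targets - best_bounded) ** 2)
print(mse_floor)  # about 2930, close to the losses in the log

# Min-max scale into [0, 1] so a sigmoid output can reach every target value.
lo, hi = targets.min(), targets.max()
scaled = (targets - lo) / (hi - lo)


def unscale(x):
    """Map values from [0, 1] back to the original range."""
    return x * (hi - lo) + lo


assert np.allclose(unscale(scaled), targets)
```

Under this assumption, training on scaled inputs and targets and applying unscale to the predictions would be one way to let the loss drop well below that floor. Using a linear activation in the last layer is another option.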