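Importing the data and splitting it into four arrays for training and testing:

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train = x_train / 255.0
x_test = x_test / 255.0
c_trainX = x_train.reshape(x_train.shape[0], 28, 28, 1)   # x_train.shape[0] = 60000
c_testX = x_test.reshape(x_test.shape[0], 28, 28, 1)
model3 = Sequential()   # type of DNN
model3.add(Conv2D(28, kernel_size=(3,3), input_shape=(28,28,1)))
model3.add(MaxPooling2D(pool_size=(2,2)))
model3.add(Flatten())
model3.add(Dense(200, activation="relu"))
model3.add(Dense(10, activation=tf.nn.softmax))
model3.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

model3.fit(c_trainX, y_train, epochs=15)

model3.evaluate(c_testX, y_test)
[0.5343353748321533, 0.9064000248908997] ---- This is my validation loss and accuracy
p = model3.predict(c_testX[:10])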

Using another input:

import urllib.request
from PIL import Image
%matplotlib inline
urllib.request.urlretrieve('https://github.com/antony-joy/Data_sets/blob/main/tes.jpg?raw=true', "testing.jpg")
img = Image.open("testing.jpg")
numpyimgdata = np.asarray(img)
import cv2
numpyimgdata = numpyimgdata / 255
load_img_rz = np.array(Image.open("testing.jpg").resize((28,28)))
Image.fromarray(load_img_rz).save('r_kolala.jpeg')
print("After resizing:", load_img_rz.shape)
numpyimgdata_reshaped_grey = cv2.cvtColor(load_img_rz, cv2.COLOR_BGR2GRAY)
your_new_array = np.expand_dims(numpyimgdata_reshaped_grey, axis=-1)
numpyimgdata_reshaped = your_new_array.reshape(-1, 28, 28, 1)   # this is done to give the image
                                                                 # the same dimensions as the test and train data
image_predicted_array = model3.predict(numpyimgdata_reshaped)
test_pred = np.argmax(image_predicted_array, axis=1)
print("Prediction:", test_pred)


This is actually wrong. It should be predicted as a trouser, which is denoted by 1, because in the Fashion MNIST dataset 5 is sandal.

Label Description

  • 0 T-shirt/top
  • 1 Trouser
  • 2 Pullover
  • 3 Dress
  • 4 Coat
  • 5 Sandal
  • 6 Shirt
  • 7 Sneaker
  • 8 Bag
  • 9 Ankle boot

I tried different images, and I get number 5 (sandal) even when I try a boot or canvas shoes. What is the actual mistake here?


The main problem is very simple. Here I will give you a complete implementation of your program. Please note that I may change the model definition and the image preprocessing step. OK, let's get started.

Fashion MNIST

Get the data - Do some preprocessing - Visualize a sample.

import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

x_train = tf.expand_dims(x_train, -1)     # from 28 x 28 to 28 x 28 x 1 
x_train = tf.divide(x_train, 255)         # Normalize 
y_train = tf.one_hot(y_train , depth=10)  # Make target One-Hot

x_test = tf.expand_dims(x_test, -1)       # from 28 x 28 to 28 x 28 x 1 
x_test = tf.divide(x_test, 255)           # Normalize
y_test = tf.one_hot(y_test , depth=10)    # Make target One-Hot
x_train.shape, y_train.shape, x_test.shape, y_test.shape
(TensorShape([60000, 28, 28, 1]),
 TensorShape([60000, 10]),
 TensorShape([10000, 28, 28, 1]),
 TensorShape([10000, 10]))

Now, let's visualize one sample from our preprocessed data.

import matplotlib.pyplot as plt

plt.imshow(x_train[0][:,:,0], cmap="gray")

Observe that the main object is white and the background is black.

Model and Training

I think it's better to use pretrained weights. However, here is a toy model to train.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, GlobalAveragePooling2D, Dropout, Dense

model = Sequential()
model.add(Conv2D(16, kernel_size=(3,3), input_shape=(28,28,1)))
model.add(Conv2D(32, kernel_size=(3,3), activation="relu"))
model.add(Conv2D(64, kernel_size=(3,3), activation="relu"))
model.add(Conv2D(128, kernel_size=(3,3), activation="relu"))
model.add(GlobalAveragePooling2D())
model.add(Dropout(0.5))
model.add(Dense(10, activation=tf.nn.softmax))
model.summary()

# Unlike you, I use categorical_crossentropy
# because I one-hot encoded my y_train and y_test
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=256,
          epochs=15, validation_data=(x_test, y_test))
epoch:15: loss: 0.4552 - accuracy: 0.8370 - val_loss: 0.4008 - val_accuracy: 0.8606
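
The remark about the loss function can be checked numerically. Below is a minimal numpy sketch, not the Keras implementation, and both function names are made up for illustration. It suggests that categorical cross-entropy on one-hot targets gives the same value as sparse categorical cross-entropy on integer labels, so the choice only has to match how the targets are encoded:

```python
import numpy as np

def sparse_cce(labels, probs):
    """Mean cross-entropy for integer class labels (hypothetical helper)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))

def one_hot_cce(one_hot, probs):
    """Mean cross-entropy for one-hot targets (hypothetical helper)."""
    return float(-np.mean(np.sum(one_hot * np.log(probs), axis=1)))

labels = np.array([1, 5, 9])                 # integer labels, as in the question
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 10))            # made-up scores for 3 samples, 10 classes
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax

one_hot = np.eye(10)[labels]                 # the same targets, one-hot encoded
print(sparse_cce(labels, probs), one_hot_cce(one_hot, probs))
```

The two printed values should agree up to floating-point error.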


Let's make some predictions on samples found on the web. Before that, let's first define a function that will do the necessary preprocessing.

# a preprocessing function
def infer_prec(img, img_size):
    img = tf.expand_dims(img, -1)       # add a channel axis: (H, W) -> (H, W, 1)
    img = tf.divide(img, 255)           # normalize to [0, 1]
    img = tf.image.resize(img,          # resize to the model's input size
             [img_size, img_size])
    img = tf.reshape(img,               # reshape to add the batch dimension
            [1, img_size, img_size, 1])
    return img
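
A cheap check before calling predict can catch inputs that skipped one of these steps. The following is only a sketch: check_model_input is a hypothetical helper, not part of TensorFlow, and it assumes the model expects one 28 x 28 x 1 image with values scaled to [0, 1].

```python
import numpy as np

def check_model_input(x, img_size=28):
    """Return a list of problems found in a batch meant for the model (hypothetical helper)."""
    x = np.asarray(x)
    problems = []
    if x.shape != (1, img_size, img_size, 1):
        problems.append(f"unexpected shape {x.shape}")
    if x.size and (x.min() < 0.0 or x.max() > 1.0):
        problems.append("values outside [0, 1]; was the image divided by 255?")
    return problems

good = np.random.default_rng(0).random((1, 28, 28, 1)).astype("float32")
raw = (good * 255).astype("uint8")      # unscaled pixel values in 0..255

print(check_model_input(good))          # []
print(check_model_input(raw))           # reports the value-range problem
```

An empty list means the array at least has the shape and value range the model was trained on; it says nothing about the background color, which is a separate issue.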

OK, I scraped some images that look similar to Fashion MNIST samples. Let's open one of them.

import cv2
import matplotlib.pyplot as plt

img = cv2.imread('/content/a.jpg', 0)   # read image as gray scale    
print(img.shape)   # (300, 231)

plt.imshow(img, cmap="gray")

img = infer_prec(img, 28)  # call preprocess function 
print(img.shape)   # (1, 28, 28, 1)

All is good so far, except that now we have a white background, which is unlike the training samples our model was trained on. If I'm not wrong, all the samples of Fashion MNIST have a black background. At this point, if we pass this sample to the model for prediction, it won't make accurate or even nearly accurate predictions.
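
One way to spot this mismatch automatically is to look at the border pixels. The snippet below is a rough heuristic rather than a robust detector: has_light_background and its threshold are assumptions made for illustration, and it only works when the object is roughly centered so that the border is background.

```python
import numpy as np

def has_light_background(gray, threshold=127):
    """Guess whether a 2-D uint8 grayscale image has a light background.

    Assumption: the object is roughly centered, so the border pixels are background.
    """
    gray = np.asarray(gray)
    border = np.concatenate([gray[0, :], gray[-1, :], gray[:, 0], gray[:, -1]])
    return float(border.mean()) > threshold

white_bg = np.full((300, 231), 255, dtype=np.uint8)
white_bg[100:200, 80:150] = 40          # dark object on a white background
black_bg = 255 - white_bg               # Fashion-MNIST-like: light object on black

print(has_light_background(white_bg))   # True
print(has_light_background(black_bg))   # False
```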

When we convert an RGB sample to grayscale, the white pixels remain white and the other, colored pixels become darker. To handle this in our case, we can apply the bitwise_not operation to the grayscale image before passing it to the model for prediction. bitwise_not simply flips every bit, so for an 8-bit image a pixel value v becomes 255 - v: 0 becomes 255 and vice versa.

import cv2
import matplotlib.pyplot as plt

img = cv2.imread('/content/a.jpg', 0)  # read image as gray scale    
img = cv2.bitwise_not(img)             # < ----- bitwise_not
print(img.shape)   # (300, 231)

plt.imshow(img, cmap="gray")

img = infer_prec(img, 28)  # call preprocess function 
print(img.shape)   # (1, 28, 28, 1)
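
What the inversion does to individual pixel values can be seen on a tiny array. This sketch uses np.bitwise_not as a stand-in for cv2.bitwise_not so that it runs without OpenCV. For uint8 data both flip all 8 bits, which is the same as computing 255 - v. The helper name invert_uint8 is made up for illustration.

```python
import numpy as np

def invert_uint8(gray):
    """Invert an 8-bit grayscale image: each value v becomes 255 - v."""
    gray = np.asarray(gray, dtype=np.uint8)
    return np.bitwise_not(gray)

pixels = np.array([[0, 1, 128, 254, 255]], dtype=np.uint8)
print(invert_uint8(pixels).tolist())    # [[255, 254, 127, 1, 0]]
```

Applying the inversion twice returns the original image, so it is safe to undo if an image turns out to have a dark background already.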

Now, pass it to the model for predicted probabilities.

y_pred = model.predict(img)
y_pred

array([[3.1869055e-03, 5.6372599e-05, 1.1225128e-01, 2.2242602e-02,
        7.7411497e-01, 5.8861728e-11, 8.7906137e-02, 6.2964287e-12,
        2.4166984e-04, 2.0408438e-08]], dtype=float32)

Now we can get the predicted label and compare it with the ground truth.

tf.argmax(y_pred, axis=-1).numpy() 
# array([4]) # Coat
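
To turn the index into a readable name, a small lookup table helps. The sketch below copies the class names from the label list in the question and the probabilities from the prediction output above. top_predictions is a hypothetical helper, not a Keras function.

```python
import numpy as np

# Class names in label order, copied from the label list in the question.
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

def top_predictions(probs, k=3):
    """Return the k most likely (class name, probability) pairs for one sample."""
    probs = np.asarray(probs).reshape(-1)
    order = np.argsort(probs)[::-1][:k]     # indices sorted by descending probability
    return [(CLASS_NAMES[i], float(probs[i])) for i in order]

# Probabilities copied from the prediction output above.
y_pred = np.array([[3.1869055e-03, 5.6372599e-05, 1.1225128e-01, 2.2242602e-02,
                    7.7411497e-01, 5.8861728e-11, 8.7906137e-02, 6.2964287e-12,
                    2.4166984e-04, 2.0408438e-08]], dtype=np.float32)

for name, p in top_predictions(y_pred):
    print(f"{name}: {p:.3f}")
```

Looking at the runners-up as well as the top class shows how close the model was to confusing similar garments.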

