How to import the MNIST dataset from a local directory using PyTorch

Problem description

I am writing PyTorch code for a well-known problem, the MNIST database of handwritten digits. I downloaded the training and test datasets, including the labels, from the main website. Each file comes as an archive such as t10k-images-idx3-ubyte.gz, which extracts to t10k-images-idx3-ubyte. My dataset folder looks like this:

MNIST
 Data
  train-images-idx3-ubyte.gz
  train-labels-idx1-ubyte.gz
  t10k-images-idx3-ubyte.gz
  t10k-labels-idx1-ubyte.gz

I then wrote the following code to load the data:

import torch
import torchvision


def load_dataset():
    data_path = "/home/MNIST/Data/"
    xy_trainPT = torchvision.datasets.ImageFolder(
        root=data_path, transform=torchvision.transforms.ToTensor()
    )
    train_loader = torch.utils.data.DataLoader(
        xy_trainPT, batch_size=64, num_workers=0, shuffle=True
    )
    return train_loader

My code fails with an error saying that the supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp

How can I solve this problem? I would also like to check that my images are actually loaded from the dataset (for example, a single figure containing the first 5 images).

Recommended answer

Update

You can import the data like this:

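import torchvision

xy_trainPT = torchvision.datasets.MNIST(
    root="~/Handwritten_Deep_L/",  # directory that holds (or will hold) the dataset
    train=True,  # load the training split
    download=True,  # download only if the dataset is not already under root
    transform=torchvision.transforms.Compose([torchvision.transforms.ToTensor()]),
)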

What does download=True do? The code first checks whether the root directory (the path you gave) already contains the dataset.

If it does not, the dataset is downloaded from the web.

If the path already contains the dataset, the code uses the existing copy and does not download it again.
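For files that were already downloaded by hand, as in the question, the check above only succeeds if they sit where torchvision looks for them. ImageFolder cannot help here, because it scans per-class subfolders for image files with the listed extensions, not IDX archives. The sketch below decompresses the four archives into <root>/MNIST/raw. That folder layout is an assumption about recent torchvision releases (older ones used a different layout), and the helper name stage_mnist_files is hypothetical.

```python
import gzip
import shutil
from pathlib import Path

# The four archive names from the question's dataset folder.
MNIST_ARCHIVES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]


def stage_mnist_files(src_dir, root):
    """Decompress the local .gz archives into <root>/MNIST/raw.

    ASSUMPTION: <root>/MNIST/raw is where the installed torchvision
    version looks for the raw files; check your version if loading fails.
    """
    raw_dir = Path(root).expanduser() / "MNIST" / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in MNIST_ARCHIVES:
        src = Path(src_dir) / name
        dst = raw_dir / name[:-3]  # strip the ".gz" suffix
        with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)
        staged.append(dst)
    return staged
```

After a call such as stage_mnist_files("/home/MNIST/Data/", "~/Handwritten_Deep_L/"), passing the same root to torchvision.datasets.MNIST should pick up the local copy without touching the network, provided the layout assumption holds for your version.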

To verify the download behavior, first give a path that contains no dataset (the data will be downloaded from the internet), and then give a path that already contains the dataset (nothing will be downloaded).
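The question also asked how to confirm that the images really loaded. As one possible check that needs neither torchvision nor a plotting library, the sketch below reads an IDX file directly (the format of the downloaded files: a 4-byte header, big-endian dimension sizes, then unsigned bytes) and prints an image as text. The function names read_idx and ascii_preview are hypothetical, and the text preview is a stand-in for the figure the question mentions, not the answer's own method.

```python
import gzip
import struct


def read_idx(path):
    """Read an unsigned-byte IDX file (optionally gzip-compressed).

    Returns (shape, data), where data is the raw byte payload.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        # Header: two zero bytes, a type code, and the number of dimensions.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError(f"{path}: not an unsigned-byte IDX file")
        # Each dimension size is a big-endian 32-bit unsigned integer.
        shape = struct.unpack(f">{ndim}I", f.read(4 * ndim))
        data = f.read()
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError(f"{path}: expected {expected} bytes, got {len(data)}")
    return shape, data


def ascii_preview(data, shape, index, threshold=128):
    """Render image `index` as text, '#' for pixels at or above threshold."""
    _, rows, cols = shape
    start = index * rows * cols
    lines = []
    for r in range(rows):
        row = data[start + r * cols : start + (r + 1) * cols]
        lines.append("".join("#" if px >= threshold else "." for px in row))
    return "\n".join(lines)
```

Calling read_idx on train-images-idx3-ubyte.gz and then printing ascii_preview(data, shape, i) for i in range(5) gives a rough look at the first five digits. If NumPy is available, numpy.frombuffer(data, dtype=numpy.uint8).reshape(shape) turns the payload into an array that can be copied into a tensor or passed to a plotting library.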

