This post covers how to convert MNIST data from numpy arrays back to the original ubyte format.

Problem description

I used this code almost exactly, changing only these lines:

f = gzip.open("../data/mnist.pkl.gz", 'rb')
training_data, validation_data, test_data = cPickle.load(f)

to these lines:

import pickle as cPickle
f = gzip.open("mnist.pkl.gz", 'rb')
u = cPickle._Unpickler(f)
u.encoding='latin1'
training_data, validation_data, test_data = u.load()

to account for pickling issues. The original mnist.pkl.gz was downloaded from his repo (available here); alternatively, the code that generates the .pkl.gz is here. The output is great: it is a pickled numpy array of the training and test data, and when I print the length of the training data on inspection, it is 250,000 numpy arrays.
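For reference, the same Python 3 workaround can also be written with the public pickle.load call rather than the private _Unpickler class. This is a minimal sketch, not part of the original code; the function name load_pickled_mnist is hypothetical, and it assumes the file holds a (training, validation, test) tuple as described above.

```python
import gzip
import pickle


def load_pickled_mnist(path):
    """Load the pickled data sets from a gzipped pickle file.

    encoding="latin1" lets Python 3 read numpy arrays that were
    pickled under Python 2.
    """
    with gzip.open(path, "rb") as f:
        return pickle.load(f, encoding="latin1")
```

Typical use would be training_data, validation_data, test_data = load_pickled_mnist("mnist.pkl.gz").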

I need to get the data back into the exact format of the original MNIST data (i.e. ubyte, with training and testing data and labels in separate files) so that it can be fed into an external pipeline that I have no control over, so it must be identical to the original.

I'm really stuck on how to do this. I can see, for example, things like this that might help, but I can't see how they fit this problem. If someone could help me convert the output of these pickled numpy arrays back to the original MNIST format (i.e. ubyte, with training and testing data and labels in separate files), I'd really appreciate it.

Edit 1: I've just realised something that might make this easier: I only need to convert the training data into ubyte format, not the testing data, since I already have the testing data in the original ubyte format.

Recommended answer

Once you have the data in numpy arrays, you can convert the numpy arrays into MNIST format. Refer to this: https://github.com/davidflanagan/notMNIST-to-MNIST/blob/17823f4d4a3acd8317c07866702d2eb2ac79c7a0/convert_to_mnist_format.py#L92
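As a rough illustration of that conversion (a sketch, not the code from the linked script), the IDX layout can be written directly with struct and numpy. The function names are hypothetical. The sketch assumes that the images are stored as floats in the range [0, 1] with 784 values each, and that the labels are integers from 0 to 9; check both against your own arrays before relying on it.

```python
import struct

import numpy as np


def write_idx_images(path, images, rows=28, cols=28):
    """Write images as an IDX3 ubyte file (magic number 2051)."""
    arr = np.asarray(images, dtype=np.float64).reshape(-1, rows, cols)
    # Assumption: pixel values are floats in [0, 1]; rescale to 0..255.
    pixels = np.clip(np.rint(arr * 255.0), 0, 255).astype(np.uint8)
    with open(path, "wb") as f:
        # Header: magic, image count, rows, cols as big-endian 32-bit ints.
        f.write(struct.pack(">IIII", 2051, pixels.shape[0], rows, cols))
        f.write(pixels.tobytes())


def write_idx_labels(path, labels):
    """Write labels as an IDX1 ubyte file (magic number 2049)."""
    arr = np.asarray(labels, dtype=np.uint8).ravel()
    with open(path, "wb") as f:
        # Header: magic and label count as big-endian 32-bit ints.
        f.write(struct.pack(">II", 2049, arr.shape[0]))
        f.write(arr.tobytes())
```

With the training pair from the pickle, this might be called as write_idx_images("train-images-idx3-ubyte", training_data[0]) and write_idx_labels("train-labels-idx1-ubyte", training_data[1]). The original MNIST files are distributed gzipped, so if the pipeline expects .gz files, swapping open for gzip.open should be enough.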

You can read more about the MNIST data format here: http://yann.lecun.com/exdb/mnist/

You can also verify your converted images using this answer: https://stackoverflow.com/a/53181925
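One simple check, sketched here under the assumption that the files follow the IDX layout described on the MNIST page (this is not necessarily the approach taken in the linked answer), is to read a converted file back into a numpy array and compare its shape and values with the source data. The function name read_idx is hypothetical.

```python
import struct

import numpy as np


def read_idx(path):
    """Read an uncompressed IDX ubyte file into a numpy uint8 array."""
    with open(path, "rb") as f:
        data = f.read()
    # Magic number: two zero bytes, a data type code, the dimension count.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # One big-endian 32-bit size per dimension follows the magic number.
    header_len = 4 + 4 * ndim
    shape = struct.unpack(">" + "I" * ndim, data[4:header_len])
    return np.frombuffer(data, dtype=np.uint8, offset=header_len).reshape(shape)
```

For a converted training set, read_idx on the image file should return an array of shape (N, 28, 28) and on the label file an array of shape (N,), where N is the number of training examples.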


09-25 07:58