Fashion-MNIST

Research Project by Kashif Rasul, Han Xiao (ex-member) & Roland Vollgraf

Fashion-MNIST is a dataset of Zalando’s article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from one of 10 classes. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms.
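To make the "drop-in replacement" idea concrete, here is a minimal sketch of a reader for the IDX binary layout that the original MNIST files use. It assumes the Fashion-MNIST files follow the same layout (a 4-byte header, big-endian dimension sizes, then unsigned-byte pixels or labels). The helper names parse_idx, split_images and make_idx are hypothetical, and the data below is synthetic rather than taken from the real dataset.

```python
import struct


def parse_idx(raw: bytes):
    """Parse an IDX buffer into (shape, flat unsigned-byte payload)."""
    # Header: two zero bytes, a type code (0x08 = unsigned byte), number of dimensions.
    zero, dtype, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # Each dimension size is a big-endian 32-bit unsigned integer.
    shape = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    payload = raw[4 + 4 * ndim:]
    expected = 1
    for dim in shape:
        expected *= dim
    if len(payload) != expected:
        raise ValueError("payload length does not match header shape")
    return shape, payload


def split_images(shape, payload):
    """Split a (count, rows, cols) payload into one bytes object per image."""
    count, rows, cols = shape
    size = rows * cols
    return [payload[i * size:(i + 1) * size] for i in range(count)]


def make_idx(shape, payload: bytes) -> bytes:
    """Build a synthetic IDX buffer (used here only to exercise the parser)."""
    header = struct.pack(">HBB", 0, 0x08, len(shape))
    dims = struct.pack(">" + "I" * len(shape), *shape)
    return header + dims + payload


# Two synthetic 28x28 grayscale images with labels 3 and 9 (made-up values).
image_buf = make_idx((2, 28, 28), bytes(2 * 28 * 28))
label_buf = make_idx((2,), bytes([3, 9]))

image_shape, pixels = parse_idx(image_buf)
label_shape, labels = parse_idx(label_buf)
images = split_images(image_shape, pixels)
```

If the files are distributed gzip-compressed, as the MNIST files are, the raw bytes could be obtained with gzip.open(path, "rb").read(), where path is a placeholder for a downloaded file. Under the assumption above, the same reader would work on either dataset without modification.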

Why is this of interest for the scientific community?

The original MNIST dataset contains a large number of handwritten digits. People in the AI/ML/data science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. “If it doesn’t work on MNIST, it won’t work at all,” they say. “Well, if it does work on MNIST, it may still fail on others.” Because Fashion-MNIST shares the same image size and the same structure of training and testing splits, it can replace the original MNIST dataset in existing benchmarks without further changes.

Seriously, we are talking about replacing MNIST. Here are some good reasons:

MNIST is too easy. Check out our side-by-side benchmark and “Most pairs of MNIST digits can be distinguished pretty well by just one pixel.”
MNIST is overused. Check out “Ian Goodfellow wants people to move away from MNIST.”
MNIST cannot represent modern CV tasks. Check out “François Chollet: Ideas on MNIST do not transfer to real CV.”

GitHub:

Find detailed information and the dataset on GitHub.