FMNIST Data Loading¶

class deepobs.fmnist.fmnist_input.data_loading(batch_size)[source]¶

Class providing the data loading functionality for the Fashion-MNIST data set.

Parameters:	batch_size (int) -- Batch size of the input-output pairs. No default value is given.

train_eval_size¶

Number of data points to evaluate during the train eval phase. Currently set to 10000, the size of the test set.

Type:	int

D_train¶

The training data set.

Type:	tf.data.Dataset

D_train_eval¶

The training evaluation data set. It is the same data as D_train but we go through it separately.

Type:	tf.data.Dataset

D_test¶

The test data set.

Type:	tf.data.Dataset

phase¶

Variable describing the current phase. Can be "train", "train_eval" or "test". The phase variable can determine the behaviour of the network, for example by deactivating dropout during evaluation.

Type:	tf.Variable

iterator¶

A single iterator for all three data sets. We use the initialization operations (see below) to switch this iterator between the data sets.

Type:	tf.data.Iterator

X¶

Tensor holding the Fashion-MNIST images. It has dimension batch_size x 28 (image height) x 28 (image width) x 1 (single channel, since the images are grayscale).

Type:	tf.Tensor

y¶

Labels of the Fashion-MNIST images. It has dimension batch_size x 10 (number of classes).

Type:	tf.Tensor

train_init_op¶

A TensorFlow operation to be performed before starting every training epoch. It sets the phase variable to "train" and initializes the iterator to the training data set.

Type:	tf.Operation

train_eval_init_op¶

A TensorFlow operation to be performed before starting every training eval phase. It sets the phase variable to "train_eval" and initializes the iterator to the training eval data set.

Type:	tf.Operation

test_init_op¶

A TensorFlow operation to be performed before starting every test evaluation phase. It sets the phase variable to "test" and initializes the iterator to the test data set.

Type:	tf.Operation

load()[source]¶

Returns the data (X (images) and y (labels)) and the phase variable.

Returns:	Tuple consisting of the images (X), the labels (y) and the phase variable (phase).
Return type:	tuple
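
The interplay of the shared iterator, the phase variable and the three init operations described above can be illustrated with a framework-free sketch. This is only an analogy in plain Python, not the DeepOBS implementation: the class `PhaseSwitchingLoader` and the function `run_phase` are hypothetical names, and `StopIteration` stands in for the `tf.errors.OutOfRangeError` that an exhausted TensorFlow iterator would raise.

```python
class PhaseSwitchingLoader:
    """Hypothetical toy stand-in: one iterator shared by three data sets."""

    def __init__(self, train, train_eval, test):
        self._datasets = {"train": train, "train_eval": train_eval, "test": test}
        self.phase = None
        self._iterator = iter(())

    def init(self, phase):
        # Mirrors what an init op does: set the phase, then point the
        # shared iterator at the start of the matching data set.
        self.phase = phase
        self._iterator = iter(self._datasets[phase])

    def next_batch(self):
        # Raises StopIteration once the current data set is exhausted.
        return next(self._iterator)


def run_phase(loader, phase):
    """Initialize the given phase and consume its data set exactly once."""
    loader.init(phase)
    batches = []
    while True:
        try:
            batches.append(loader.next_batch())
        except StopIteration:
            break
    return batches


# Toy "batches" (plain integers) for each of the three data sets.
loader = PhaseSwitchingLoader(train=[1, 2, 3], train_eval=[1, 2], test=[9])
seen = {p: run_phase(loader, p) for p in ("train", "train_eval", "test")}
```

Because each phase starts by re-initializing the iterator, the same consuming loop serves training, train eval and test without building three separate input pipelines.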

make_dataset(images_file, labels_file, batch_size, one_hot=True, shuffle=True, shuffle_buffer_size=10000, num_prefetched_batches=10, data_set_size=-1)[source]¶

Creates a data set from given images and label files.

Parameters:	images_file (str) -- Path to the images in compressed `.gz` files.
	labels_file (str) -- Path to the labels in compressed `.gz` files.
	batch_size (int) -- Batch size of the input-output pairs.
	one_hot (bool) -- Switch to turn on or off one-hot encoding of the labels. Defaults to `True`.
	shuffle (bool) -- Switch to turn on or off shuffling of the data set. Defaults to `True`.
	shuffle_buffer_size (int) -- Size of the shuffle buffer. Defaults to `10000`, the size of the test and train eval data sets, meaning that they will be completely shuffled.
	num_prefetched_batches (int) -- Number of prefetched batches. Defaults to `10`.
	data_set_size (int) -- Size of the data set to extract from the images and label files. Defaults to `-1`, meaning that the full data set is used.
Returns:	Data set object created from the images and label files.
Return type:	tf.data.Dataset
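
Why a shuffle buffer as large as the data set gives a complete shuffle can be seen from a small pure-Python sketch of buffer-based shuffling. It follows the general scheme of a fixed-size shuffle buffer (fill the buffer, then repeatedly emit a random buffered element and replace it with the next input). The function name `buffered_shuffle` and its `seed` argument are hypothetical, and the code is not taken from DeepOBS or TensorFlow.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in an order a fixed-size shuffle buffer could produce."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain the remaining elements in random order.
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))


data = list(range(10))
unshuffled = list(buffered_shuffle(data, buffer_size=1))   # buffer of 1 keeps the order
partial = list(buffered_shuffle(data, buffer_size=3))      # only local mixing
full = list(buffered_shuffle(data, buffer_size=10))        # buffer holds everything
```

With a buffer of size 3, the k-th emitted element (counting from 0) can only come from the first k + 3 inputs, so early outputs never contain late inputs. Once the buffer holds the whole data set, every element is a candidate for every position, which is the "completely shuffled" case mentioned for the default of `10000`.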

read32(bytestream)[source]¶

Helper function to read a bytestream.

Parameters:	bytestream (bytestream) -- Input bytestream.
Returns:	Bytestream as a np array.
Return type:	np.array
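
The description above does not say what is read from the bytestream. In the IDX file format used by MNIST-style data sets, the header fields are 32-bit big-endian unsigned integers, so a helper of this kind plausibly reads one such field. The sketch below works under that assumption. It uses the standard library `struct` module rather than NumPy, and `read_uint32_big_endian` is a hypothetical name, not the DeepOBS function.

```python
import gzip
import io
import struct


def read_uint32_big_endian(bytestream):
    """Read the next 4 bytes of the stream as one unsigned big-endian integer."""
    (value,) = struct.unpack(">I", bytestream.read(4))
    return value


# Build a tiny in-memory, gzip-compressed IDX label file:
# magic number 2049, an item count of 3, then three label bytes.
payload = struct.pack(">II", 2049, 3) + bytes([0, 1, 2])
with gzip.open(io.BytesIO(gzip.compress(payload)), "rb") as stream:
    magic = read_uint32_big_endian(stream)
    num_items = read_uint32_big_endian(stream)
    labels = list(stream.read(num_items))
```

Reading the header field by field like this leaves the stream positioned at the first data byte, so the remaining content can be read in one call.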

read_fmnist_data(images_file, labels_file, one_hot=True)[source]¶

Read the Fashion-MNIST images and labels from the downloaded files.

Parameters:	images_file (str) -- Path to the images in compressed `.gz` files.
	labels_file (str) -- Path to the labels in compressed `.gz` files.
	one_hot (bool) -- Switch to turn on or off one-hot encoding of the labels. Defaults to `True`.
Returns:	Tuple consisting of all the images (X) and the labels (y).
Return type:	tuple
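
As an illustration of what the `one_hot` switch implies for the labels, the following minimal sketch turns integer class labels into rows of length 10, matching the batch_size x 10 shape given for y above. It is written in plain Python for clarity. The function `one_hot_encode` is a hypothetical helper, and the assumption that labels stay integer class indices when `one_hot` is `False` is not confirmed by the text above.

```python
def one_hot_encode(labels, num_classes=10):
    """Return one row per label with a 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


dense_labels = [9, 0, 3]                # integer class indices for three images
encoded = one_hot_encode(dense_labels)  # three rows of ten entries each
```

Each encoded row sums to one and has its single non-zero entry at the position of the original class index, so the dense labels can be recovered by taking the index of the maximum in each row.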

test_dataset(batch_size)[source]¶

Creates the test data set.

Parameters:	batch_size (int) -- Batch size of the input-output pairs.
Returns:	The test data set.
Return type:	tf.data.Dataset

train_dataset(batch_size)[source]¶

Creates the training data set.

Parameters:	batch_size (int) -- Batch size of the input-output pairs.
Returns:	The training data set.
Return type:	tf.data.Dataset

train_eval_dataset(batch_size)[source]¶

Creates the train eval data set.

Parameters:	batch_size (int) -- Batch size of the input-output pairs.
Returns:	The train eval data set.
Return type:	tf.data.Dataset