SVHN Data Loading

class deepobs.svhn.svhn_input.data_loading(batch_size)[source]

Class providing the data loading functionality for the SVHN data set.

Parameters:batch_size (int) -- Batch size of the input-output pairs. No default value is given.
batch_size

Batch size of the input-output pairs.

Type:int
train_eval_size

Number of data points to evaluate during the train eval phase. Currently set to 26032, the size of the test set.

Type:int
D_train

The training data set.

Type:tf.data.Dataset
D_train_eval

The training evaluation data set. It contains the same data as D_train, but it is iterated over separately.

Type:tf.data.Dataset
D_test

The test data set.

Type:tf.data.Dataset
phase

Variable describing which phase we are currently in. Can be "train", "train_eval" or "test". The phase variable can determine the behaviour of the network, for example by deactivating dropout during evaluation.

Type:tf.Variable
iterator

A single iterator for all three data sets. We use the initialization operations (see below) to switch this iterator between the data sets.

Type:tf.data.Iterator
X

Tensor holding the SVHN images. It has shape batch_size x 32 (image height) x 32 (image width) x 3 (RGB channels).

Type:tf.Tensor
y

Labels of the SVHN images. It has shape batch_size x 10 (number of classes).

Type:tf.Tensor
train_init_op

A TensorFlow operation to be performed before starting every training epoch. It sets the phase variable to "train" and initializes the iterator to the training data set.

Type:tf.Operation
train_eval_init_op

A TensorFlow operation to be performed before starting every training eval phase. It sets the phase variable to "train_eval" and initializes the iterator to the training eval data set.

Type:tf.Operation
test_init_op

A TensorFlow operation to be performed before starting every test evaluation phase. It sets the phase variable to "test" and initializes the iterator to the test data set.

Type:tf.Operation
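
The switching pattern described by the attributes above (one iterator, re-initialized by one of three init operations that also set the phase) can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not the DeepOBS implementation: the class ToyLoader and its method next_batch are hypothetical names, the init operations are modelled as ordinary methods rather than tf.Operation objects, and the "data sets" are plain lists.

```python
class ToyLoader:
    """Toy stand-in for the documented pattern (hypothetical, not DeepOBS code)."""

    def __init__(self, train, train_eval, test):
        # One container per phase; D_train and D_train_eval hold the same data.
        self._datasets = {"train": train, "train_eval": train_eval, "test": test}
        self.phase = None
        self._iterator = iter(())

    def _init(self, phase):
        # Mirrors what the docs say an init op does: set the phase and
        # point the single shared iterator at the matching data set.
        self.phase = phase
        self._iterator = iter(self._datasets[phase])

    def train_init_op(self):
        self._init("train")

    def train_eval_init_op(self):
        self._init("train_eval")

    def test_init_op(self):
        self._init("test")

    def next_batch(self):
        # Raises StopIteration when the current data set is exhausted.
        return next(self._iterator)


loader = ToyLoader(train=[1, 2, 3], train_eval=[1, 2, 3], test=[10, 20])

# One "epoch": initialize, then consume until the data set runs out.
loader.train_init_op()
train_batches = []
while True:
    try:
        train_batches.append(loader.next_batch())
    except StopIteration:
        break

# Switching phase re-points the same iterator at another data set.
loader.test_init_op()
first_test_batch = loader.next_batch()
```

In the toy model, as in the documented class, consumers only ever read from one iterator; which data set they see depends solely on the most recently run init operation.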
load()[source]

Returns the data (X (images) and y (labels)) and the phase variable.

Returns:Tuple consisting of the images (X), the labels (y) and the phase variable (phase).
Return type:tuple
make_dataset(binaries_fname_pattern, batch_size, crop_size=32, per_image_standardization=True, random_crop=False, pad_before_random_crop=0, random_flip_left_right=False, lighting_augmentation=False, one_hot=True, shuffle=True, shuffle_buffer_size=10000, num_prefetched_batches=3, num_preprocessing_threads=8, data_set_size=-1)[source]

Creates a data set from a file name pattern matching the image and label files.

Parameters:
  • binaries_fname_pattern (str) -- Pattern of the .bin files containing the images and labels.
  • batch_size (int) -- Batch size of the input-output pairs.
  • crop_size (int) -- Crop size of each image. Defaults to 32.
  • per_image_standardization (bool) -- Switch to standardize each image to have zero mean and unit variance. Defaults to True.
  • random_crop (bool) -- Switch if random crops should be used. Defaults to False.
  • pad_before_random_crop (int) -- Defines the added padding before a random crop is applied. Defaults to 0.
  • random_flip_left_right (bool) -- Switch to randomly flip the images horizontally. Defaults to False.
  • lighting_augmentation (bool) -- Switch to use random brightness, saturation and contrast on each image. Defaults to False.
  • one_hot (bool) -- Switch to turn on or off one-hot encoding of the labels. Defaults to True.
  • shuffle (bool) -- Switch to turn on or off shuffling of the data set. Defaults to True.
  • shuffle_buffer_size (int) -- Size of the shuffle buffer. Defaults to 10000.
  • num_prefetched_batches (int) -- Number of prefetched batches. Defaults to 3.
  • num_preprocessing_threads (int) -- The number of elements to process in parallel while applying the image transformations. Defaults to 8.
  • data_set_size (int) -- Size of the data set to extract from the image and label files. Defaults to -1, meaning that the full data set is used.
Returns:

Data set object created from the images and label files.

Return type:

tf.data.Dataset
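
Two of the switches above, per_image_standardization and one_hot, can be sketched in plain Python to show what they do to a single example. This is a minimal illustration under stated assumptions, not the library's code: the helper names standardize and one_hot are hypothetical, the image is a tiny flattened list instead of a 32 x 32 x 3 tensor, and the lower bound on the standard deviation is assumed to follow the behaviour of tf.image.per_image_standardization.

```python
import math


def standardize(pixels):
    """Shift and scale one flattened image to zero mean and unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # Assumed safeguard: bound the std from below so that a constant
    # image does not cause a division by zero.
    std = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / std for p in pixels]


def one_hot(label, num_classes=10):
    """Encode an integer class label as a vector with a single 1.0 entry."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


image = [0.0, 64.0, 128.0, 255.0]  # toy "image" with four pixel values
standardized = standardize(image)
encoded = one_hot(3)  # label 3 out of the 10 SVHN classes
```

With one_hot=True, each label becomes a vector of length 10, which matches the documented shape of y (batch_size x 10).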

test_dataset(batch_size)[source]

Creates the test data set.

Parameters:batch_size (int) -- Batch size of the input-output pairs.
Returns:The test data set.
Return type:tf.data.Dataset
train_dataset(batch_size, data_augmentation=True)[source]

Creates the training data set.

Parameters:
  • batch_size (int) -- Batch size of the input-output pairs.
  • data_augmentation (bool) -- Switch to turn basic data augmentation on or off while training. Defaults to True.
Returns:

The training data set.

Return type:

tf.data.Dataset

train_eval_dataset(batch_size, data_augmentation=True)[source]

Creates the train eval data set.

Parameters:
  • batch_size (int) -- Batch size of the input-output pairs.
  • data_augmentation (bool) -- Switch to turn basic data augmentation on or off while evaluating the training data set. Defaults to True.
Returns:

The train eval data set.

Return type:

tf.data.Dataset