Tolstoi Char RNN

class deepobs.tolstoi.tolstoi_char_rnn.set_up(batch_size, weight_decay=None)[source]

Class providing a recurrent neural network for character-level language modelling on the Tolstoi data set.

The network has two LSTM layers, each with 128 hidden units. We use a sequence length of 50 for the data set. The cell state is automatically stored in variables between subsequent steps and reset after each epoch (or when switching the phase).

Large parts of this code are adapted from here.

Suggested training settings: a batch size of 50 and training for a total of 200 epochs.

Parameters:
  • batch_size (int) -- Batch size of the data points. No default value is specified.
  • weight_decay (float) -- Weight decay factor. This model does not implement weight decay. Defaults to None.
seq_length

Sequence length, defined as the number of characters. Defaults to 50.

Type:int
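As a rough illustration of what a sequence length of 50 means for character-level modelling, the following sketch cuts a text into non-overlapping input/target pairs. The helper name make_sequences and the convention that the target is the input shifted by one character are assumptions made for illustration; they are not taken from the DeepOBS data loading code.

```python
def make_sequences(text, seq_length=50):
    """Split text into (input, target) pairs of seq_length characters each.

    Assumption: the target is the input shifted by one character, so the
    model is asked to predict the next character at every position.
    """
    pairs = []
    # Each pair needs seq_length + 1 characters, because the last
    # character of a chunk is used only as a target.
    for start in range(0, len(text) - seq_length, seq_length):
        chunk = text[start:start + seq_length + 1]
        pairs.append((chunk[:-1], chunk[1:]))
    return pairs
```

A trailing piece of text that is too short to fill a complete pair is dropped in this sketch.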
data_loading

Data loading class for the Tolstoi data set, tolstoi_input.data_loading.

Type:deepobs.data_loading
losses

Tensor of size batch_size containing the individual losses per data point.

Type:tf.Tensor
accuracy

Tensor containing the accuracy of the model.

Type:tf.Tensor
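To make the two attributes more concrete, here is a framework-free sketch that reduces per-example losses to a single scalar and computes an accuracy as the fraction of correctly predicted characters. Plain Python lists and strings stand in for tensors; the function names and the choice of the mean as the reduction are assumptions for illustration only.

```python
def mean_loss(losses):
    """Reduce a list of per-example losses to one scalar (assumed: the mean)."""
    return sum(losses) / len(losses)


def char_accuracy(predicted, targets):
    """Fraction of positions where the predicted character equals the target."""
    correct = sum(p == t for p, t in zip(predicted, targets))
    return correct / len(targets)
```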
train_init_op

A TensorFlow operation to be performed before starting every training epoch. Among other things, it sets the state variables to the zero state.

Type:tf.Operation
train_eval_init_op

A TensorFlow operation to be performed before starting every evaluation phase on the training data. Among other things, it sets the state variables to the zero state.

Type:tf.Operation
test_init_op

A TensorFlow operation to be performed before starting every test evaluation phase. Among other things, it sets the state variables to the zero state.

Type:tf.Operation
get()[source]

Returns the losses and the accuracy of the model.

Returns:Tuple consisting of the losses and the accuracy.
Return type:tuple
get_state_update_op(state_variables, new_states)[source]

Adds an operation that updates the state variables with the last state tensors.

Parameters:
  • state_variables (tf.Variable) -- State variables to be updated.
  • new_states (tf.Variable) -- New states of the state variables.
Returns:

A tuple that combines all update_ops into a single operation. The tuple's actual value should not be used.

Return type:

tf.Operation
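The idea of writing the newest states back into persistent storage can be sketched without TensorFlow. In this analogue, Python lists play the role of the state variables and the in-place assignment plays the role of the update operation. The names LSTMState and update_states are hypothetical stand-ins, not part of the DeepOBS API.

```python
from collections import namedtuple

# Hypothetical stand-in for an LSTM state pair (cell state c, hidden state h).
LSTMState = namedtuple("LSTMState", ["c", "h"])


def update_states(state_variables, new_states):
    """Overwrite every stored state in place with the newest values."""
    for variable, new_state in zip(state_variables, new_states):
        variable.c[:] = new_state.c
        variable.h[:] = new_state.h
    return state_variables


# One entry per layer; the stored states start at zero.
stored = (LSTMState(c=[0.0, 0.0], h=[0.0, 0.0]),
          LSTMState(c=[0.0, 0.0], h=[0.0, 0.0]))
latest = (LSTMState(c=[1.0, 2.0], h=[3.0, 4.0]),
          LSTMState(c=[5.0, 6.0], h=[7.0, 8.0]))
update_states(stored, latest)
```

After the call, stored holds the values of latest, so the next step starts from the state in which the previous step ended.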

get_state_variables(batch_size, cell)[source]

For each layer, get the initial state and make a variable out of it to enable updating its value.

Parameters:
  • batch_size (int) -- Batch size.
  • cell (tf.BasicLSTMCell) -- LSTM cell to get the initial state for.
Returns:

Tuple of the state variables and their zero states.

Return type:

tuple
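A framework-free sketch of the same pattern: for each layer, build a zero state and a separate mutable copy that persists between steps and can later be reset to zero. Nested Python lists stand in for tensors and variables; the names LSTMState, make_state_variables and reset_states, as well as the default of two layers with 128 units, are assumptions chosen to mirror the description above.

```python
from collections import namedtuple

# Hypothetical stand-in for an LSTM state pair (cell state c, hidden state h).
LSTMState = namedtuple("LSTMState", ["c", "h"])


def _zeros(batch_size, units):
    """Return a batch_size x units nested list filled with zeros."""
    return [[0.0] * units for _ in range(batch_size)]


def make_state_variables(batch_size, hidden_units=(128, 128)):
    """Return (state_variables, zero_states) with one entry per layer."""
    zero_states = tuple(
        LSTMState(_zeros(batch_size, n), _zeros(batch_size, n))
        for n in hidden_units
    )
    # Independent copies act as the storage that persists between steps.
    state_variables = tuple(
        LSTMState(_zeros(batch_size, n), _zeros(batch_size, n))
        for n in hidden_units
    )
    return state_variables, zero_states


def reset_states(state_variables, zero_states):
    """Set every stored state back to its zero state, in place."""
    for variable, zero in zip(state_variables, zero_states):
        variable.c[:] = [row[:] for row in zero.c]
        variable.h[:] = [row[:] for row in zero.h]
```

Keeping the zero states alongside the variables is what makes a reset at the start of each epoch or phase a simple copy.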

set_up(weight_decay)[source]

Sets up the test problem.

Parameters:weight_decay (float) -- Weight decay factor. In this model there is no weight decay implemented.
Returns:Tuple consisting of the losses and the accuracy.
Return type:tuple