`osl_dynamics.data.tf`

Functions related to TensorFlow datasets.

Module Contents

Functions

`get_n_sequences`(arr, sequence_length[, step_size])	Calculate the number of sequences an array will be split into.
`concatenate_datasets`(datasets)	Concatenates a list of TensorFlow datasets.
`create_dataset`(data, sequence_length, step_size)	Creates a TensorFlow dataset of batched time series data.
`save_tfrecord`(data, sequence_length, step_size, filepath)	Save dataset to a TFRecord file.
`load_tfrecord_dataset`(tfrecord_dir, batch_size[, ...])	Load a TFRecord dataset.
`_validate_tf_dataset`(dataset)	Check if the input is a valid TensorFlow dataset.
`get_range`(dataset)	The range (max-min) of values contained in a batched TensorFlow dataset.
`get_n_channels`(dataset)	Get the number of channels in a batched TensorFlow dataset.
`get_n_batches`(dataset)	Get number of batches in a TensorFlow dataset.

osl_dynamics.data.tf.get_n_sequences(arr, sequence_length, step_size=None)

Calculate the number of sequences an array will be split into.

Parameters:

arr (np.ndarray) – Time series data.
sequence_length (int) – Length of the sequences into which the data will be segmented.
step_size (int, optional) – The number of samples by which to move the sliding window between sequences.

Returns:

n – Number of sequences.

Return type:

int
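As a rough illustration of the counting involved, the sketch below counts the sliding windows that fit in a series of a given length. It is a plain-Python approximation and not the library's implementation: the helper name `count_windows`, the choice to pass the number of samples rather than an array, and the assumption that `step_size=None` falls back to non-overlapping windows are all hypothetical.

```python
def count_windows(n_samples, sequence_length, step_size=None):
    """Count the full-length windows in a series of n_samples time points.

    Hypothetical helper. Assumes step_size=None means non-overlapping
    windows, i.e. a step equal to sequence_length.
    """
    if step_size is None:
        step_size = sequence_length
    if n_samples < sequence_length:
        # The series is too short to hold even one window.
        return 0
    # One window at offset 0, plus one for every further step that still fits.
    return (n_samples - sequence_length) // step_size + 1


print(count_windows(10, 4))     # windows start at 0 and 4
print(count_windows(10, 4, 2))  # windows start at 0, 2, 4 and 6
```

With overlapping windows (a step smaller than the sequence length) the same data yields more sequences, which is worth keeping in mind when estimating dataset size.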

osl_dynamics.data.tf.concatenate_datasets(datasets)

Concatenates a list of TensorFlow datasets.

Parameters:

datasets (list) – List of TensorFlow datasets.

Returns:

full_dataset – Concatenated dataset.

Return type:

tf.data.Dataset

osl_dynamics.data.tf.create_dataset(data, sequence_length, step_size)

Creates a TensorFlow dataset of batched time series data.

Parameters:

data (dict) – Dictionary containing data to batch. Keys correspond to the input name for the model and the value is the data.
sequence_length (int) – Sequence length to batch the data.
step_size (int) – Number of samples to slide the sequence across the data.

Returns:

dataset – TensorFlow dataset.

Return type:

tf.data.Dataset
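The role of `sequence_length` and `step_size` can be pictured with a small plain-Python sketch of sliding-window segmentation applied to a dictionary of named inputs. This is only an illustration of the idea, not how the library builds its `tf.data.Dataset`: the function name `make_sequences`, the input name `"data"`, and the nested-list layout (one list of channel values per time point) are assumptions made for the example.

```python
def make_sequences(data, sequence_length, step_size):
    """Slice every entry of `data` into windows of length `sequence_length`.

    Hypothetical helper. `data` maps an input name to a list of time
    points, each time point being a list of channel values.
    """
    sequences = {}
    for name, series in data.items():
        last_start = len(series) - sequence_length
        # Slide the window by step_size, keeping only full-length windows.
        sequences[name] = [
            series[start : start + sequence_length]
            for start in range(0, last_start + 1, step_size)
        ]
    return sequences


# Toy input: 6 time points, 2 channels.
toy = {"data": [[t, 10 * t] for t in range(6)]}
windows = make_sequences(toy, sequence_length=4, step_size=2)
print(len(windows["data"]))  # number of windows produced
print(windows["data"][1])    # second window, starting at time point 2
```

Keeping the dictionary keys intact mirrors the description above, where each key names a model input and each value holds the data for that input.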

osl_dynamics.data.tf.save_tfrecord(data, sequence_length, step_size, filepath)

Save dataset to a TFRecord file.

Parameters:

data (dict) – Dictionary containing data to batch. Keys correspond to the input name for the model and the value is the data.
sequence_length (int) – Sequence length to batch the data.
step_size (int) – Number of samples to slide the sequence across the data.
filepath (str) – Path to save the TFRecord file.

osl_dynamics.data.tf.load_tfrecord_dataset(tfrecord_dir, batch_size, shuffle=True, validation_split=None, concatenate=True, drop_last_batch=False, buffer_size=100000, keep=None)

Load a TFRecord dataset.

Parameters:

tfrecord_dir (str) – Directory containing the TFRecord datasets.
batch_size (int) – Number of sequences in each mini-batch used to train the model.
shuffle (bool, optional) – Whether to shuffle sequences (within a batch) and batches.
validation_split (float, optional) – Ratio to split the dataset into a training and validation set.
concatenate (bool, optional) – Whether to concatenate the datasets for each array.
drop_last_batch (bool, optional) – Whether to drop the last batch if it is smaller than the batch size.
buffer_size (int, optional) – Buffer size for shuffling a TensorFlow Dataset. Smaller values will lead to less random shuffling but will be quicker. Default is 100000.
keep (list of int, optional) – List of session indices to keep. If None, then all sessions are kept.

Returns:

dataset – Dataset for training or evaluating the model along with the validation set if validation_split was passed.

Return type:

tf.data.Dataset or tuple
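To make the interaction of `batch_size`, `drop_last_batch` and `validation_split` concrete, here is a minimal plain-Python sketch. It does not read TFRecord files and is not the library's code: the helper names `make_batches` and `split_batches` are hypothetical, and the sketch assumes `validation_split` is the fraction of batches held out for validation, taken from the end of the list.

```python
def make_batches(sequences, batch_size, drop_last_batch=False):
    """Group sequences into consecutive batches of at most batch_size items."""
    batches = [
        sequences[i : i + batch_size]
        for i in range(0, len(sequences), batch_size)
    ]
    # Optionally discard a trailing batch that is smaller than batch_size.
    if drop_last_batch and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


def split_batches(batches, validation_split):
    """Split batches into (training, validation) lists.

    Assumption: validation_split is the held-out fraction of batches.
    """
    n_validation = int(len(batches) * validation_split)
    n_training = len(batches) - n_validation
    return batches[:n_training], batches[n_training:]


sequences = list(range(10))  # stand-ins for 10 sequences
print(len(make_batches(sequences, 4)))                        # last batch is partial
print(len(make_batches(sequences, 4, drop_last_batch=True)))  # partial batch dropped

training, validation = split_batches(make_batches(sequences, 2), 0.2)
print(len(training), len(validation))
```

Returning a pair only when a split is requested matches the documented return type, which is either a single dataset or a tuple.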

osl_dynamics.data.tf._validate_tf_dataset(dataset)

Check if the input is a valid TensorFlow dataset.

Parameters:

dataset (tf.data.Dataset or list) – TensorFlow dataset or list of datasets.

Returns:

dataset – TensorFlow dataset.

Return type:

tf.data.Dataset

osl_dynamics.data.tf.get_range(dataset)

The range (max-min) of values contained in a batched TensorFlow dataset.

Parameters:

dataset (tf.data.Dataset) – TensorFlow dataset.

Returns:

range – Range of each channel.

Return type:

np.ndarray

osl_dynamics.data.tf.get_n_channels(dataset)

Get the number of channels in a batched TensorFlow dataset.

Parameters:

dataset (tf.data.Dataset) – TensorFlow dataset.

Returns:

n_channels – Number of channels.

Return type:

int

osl_dynamics.data.tf.get_n_batches(dataset)

Get number of batches in a TensorFlow dataset.

Parameters:

dataset (tf.data.Dataset) – TensorFlow dataset.

Returns:

n_batches – Number of batches.

Return type:

int
