osl_dynamics.data.tf#
Functions related to TensorFlow datasets.
Module Contents#
Functions#
- Calculate the number of sequences an array will be split into.
- Concatenates a list of TensorFlow datasets.
- Creates a TensorFlow dataset of batched time series data.
- Save dataset to a TFRecord file.
- Load a TFRecord dataset.
- Check if the input is a valid TensorFlow dataset.
- The range (max-min) of values contained in a batched TensorFlow dataset.
- Get the number of channels in a batched TensorFlow dataset.
- Get number of batches in a TensorFlow dataset.
- osl_dynamics.data.tf.get_n_sequences(arr, sequence_length, step_size=None)[source]#
Calculate the number of sequences an array will be split into.
- Parameters:
arr (np.ndarray) – Time series data.
sequence_length (int) – Length of sequences which the data will be segmented into.
step_size (int, optional) – The number of samples by which to move the sliding window between sequences.
- Returns:
n – Number of sequences.
- Return type:
int
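As a rough sketch, this count follows a sliding-window formula. The snippet below assumes `step_size` defaults to `sequence_length` (non-overlapping windows) and integer truncation of the last partial window; the library's exact convention may differ:

```python
def n_sequences(n_samples, sequence_length, step_size=None):
    """Number of complete windows of length `sequence_length` obtained by
    sliding a window `step_size` samples at a time (assumed convention)."""
    step_size = step_size or sequence_length
    return max(0, (n_samples - sequence_length) // step_size + 1)

# 1000 samples, non-overlapping windows of 100 -> 10 sequences
print(n_sequences(1000, 100))
# overlapping windows sliding by 50 samples -> 19 sequences
print(n_sequences(1000, 100, 50))
```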
- osl_dynamics.data.tf.concatenate_datasets(datasets)[source]#
Concatenates a list of TensorFlow datasets.
- Parameters:
datasets (list) – List of TensorFlow datasets.
- Returns:
full_dataset – Concatenated dataset.
- Return type:
tf.data.Dataset
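A minimal illustration of what concatenation means here, using plain `tf.data.Dataset.concatenate` on toy datasets (these are not from the library, just stand-ins for per-session datasets):

```python
import tensorflow as tf

# two toy per-session datasets (illustrative)
datasets = [
    tf.data.Dataset.from_tensor_slices([1, 2, 3]),
    tf.data.Dataset.from_tensor_slices([4, 5]),
]

# fold the list into a single dataset, end to end
full = datasets[0]
for d in datasets[1:]:
    full = full.concatenate(d)

print(list(full.as_numpy_iterator()))  # [1, 2, 3, 4, 5]
```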
- osl_dynamics.data.tf.create_dataset(data, sequence_length, step_size)[source]#
Creates a TensorFlow dataset of batched time series data.
- Parameters:
data (dict) – Dictionary containing data to batch. Keys correspond to the input name for the model and the value is the data.
sequence_length (int) – Sequence length to batch the data.
step_size (int) – Number of samples to slide the sequence across the data.
- Returns:
dataset – TensorFlow dataset.
- Return type:
tf.data.Dataset
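A sketch of building such a dataset with `tf.data` windowing; the batch size and the internal steps shown here are illustrative assumptions, not the function's actual implementation:

```python
import numpy as np
import tensorflow as tf

data = np.random.randn(1000, 5).astype(np.float32)  # (samples, channels)
sequence_length, step_size, batch_size = 100, 50, 8

dataset = tf.data.Dataset.from_tensor_slices(data)
# slide a window of `sequence_length` samples, moving `step_size` at a time
dataset = dataset.window(sequence_length, shift=step_size, drop_remainder=True)
dataset = dataset.flat_map(
    lambda w: w.batch(sequence_length, drop_remainder=True)
)
dataset = dataset.batch(batch_size)  # (batch, sequence_length, channels)

n_batches = sum(1 for _ in dataset)  # 19 sequences -> 3 batches
```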
- osl_dynamics.data.tf.save_tfrecord(data, sequence_length, step_size, filepath)[source]#
Save dataset to a TFRecord file.
- Parameters:
data (dict) – Dictionary containing data to batch. Keys correspond to the input name for the model and the value is the data.
sequence_length (int) – Sequence length to batch the data.
step_size (int) – Number of samples to slide the sequence across the data.
filepath (str) – Path to save the TFRecord file.
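A hedged sketch of the TFRecord serialization this might involve; the feature name `"data"` and the `tf.io.serialize_tensor` encoding are assumptions for illustration, not the library's actual format:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

def save_sequences(sequences, filepath):
    """Serialize each (sequence_length, channels) array into a TFRecord file.
    Illustrative only; feature names and encoding may differ in the library."""
    with tf.io.TFRecordWriter(filepath) as writer:
        for seq in sequences:
            feature = {
                "data": tf.train.Feature(
                    bytes_list=tf.train.BytesList(
                        value=[tf.io.serialize_tensor(seq).numpy()]
                    )
                )
            }
            example = tf.train.Example(
                features=tf.train.Features(feature=feature)
            )
            writer.write(example.SerializeToString())

sequences = np.random.randn(4, 100, 5).astype(np.float32)
path = os.path.join(tempfile.gettempdir(), "example.tfrecord")
save_sequences(sequences, path)  # writes 4 records
```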
- osl_dynamics.data.tf.load_tfrecord_dataset(tfrecord_dir, batch_size, shuffle=True, validation_split=None, concatenate=True, drop_last_batch=False, buffer_size=100000, keep=None)[source]#
Load a TFRecord dataset.
- Parameters:
tfrecord_dir (str) – Directory containing the TFRecord datasets.
batch_size (int) – Number of sequences in each mini-batch used to train the model.
shuffle (bool, optional) – Should we shuffle sequences (within a batch) and batches.
validation_split (float, optional) – Ratio to split the dataset into a training and validation set.
concatenate (bool, optional) – Should we concatenate the datasets for each array?
drop_last_batch (bool, optional) – Should we drop the last batch if it is smaller than the batch size?
buffer_size (int, optional) – Buffer size for shuffling a TensorFlow Dataset. Smaller values will lead to less random shuffling but will be quicker. Default is 100000.
keep (list of int, optional) – List of session indices to keep. If None, then all sessions are kept.
- Returns:
dataset – Dataset for training or evaluating the model, along with the validation set if validation_split was passed.
- Return type:
tf.data.Dataset or tuple
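The parsing side can be sketched as follows; the feature name, dtype, stored shape, and pipeline steps (`shuffle`, `batch`) are illustrative assumptions mirroring the parameters above, not the library's actual reader:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# write one example so the parsing sketch is self-contained
path = os.path.join(tempfile.gettempdir(), "session0.tfrecord")
seq = np.random.randn(100, 5).astype(np.float32)
with tf.io.TFRecordWriter(path) as writer:
    feature = {"data": tf.train.Feature(
        bytes_list=tf.train.BytesList(
            value=[tf.io.serialize_tensor(seq).numpy()]
        )
    )}
    writer.write(tf.train.Example(
        features=tf.train.Features(feature=feature)
    ).SerializeToString())

def parse(example_proto):
    # the feature name and dtype are assumptions for illustration
    parsed = tf.io.parse_single_example(
        example_proto, {"data": tf.io.FixedLenFeature([], tf.string)}
    )
    seq = tf.io.parse_tensor(parsed["data"], out_type=tf.float32)
    return tf.reshape(seq, (100, 5))  # restore the static shape

dataset = (
    tf.data.TFRecordDataset(path)
    .map(parse)
    .shuffle(100)                    # cf. `buffer_size`
    .batch(8, drop_remainder=False)  # cf. `drop_last_batch`
)
```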
- osl_dynamics.data.tf._validate_tf_dataset(dataset)[source]#
Check if the input is a valid TensorFlow dataset.
- Parameters:
dataset (tf.data.Dataset or list) – TensorFlow dataset or list of datasets.
- Returns:
dataset – TensorFlow dataset.
- Return type:
tf.data.Dataset
- osl_dynamics.data.tf.get_range(dataset)[source]#
The range (max-min) of values contained in a batched TensorFlow dataset.
- Parameters:
dataset (tf.data.Dataset) – TensorFlow dataset.
- Returns:
range – Range of each channel.
- Return type:
np.ndarray
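A sketch of computing a per-channel range by streaming over batches (illustrative, not the library's implementation):

```python
import numpy as np
import tensorflow as tf

# toy data: channel 0 spans 0..99, channel 1 spans 0..198
data = np.stack([np.arange(100.0), np.arange(100.0) * 2], axis=1)
dataset = tf.data.Dataset.from_tensor_slices(data.astype(np.float32)).batch(10)

# accumulate per-channel min/max across batches, then take the difference
mins = maxs = None
for batch in dataset:
    b_min = tf.reduce_min(batch, axis=0)
    b_max = tf.reduce_max(batch, axis=0)
    mins = b_min if mins is None else tf.minimum(mins, b_min)
    maxs = b_max if maxs is None else tf.maximum(maxs, b_max)

value_range = (maxs - mins).numpy()
print(value_range)  # per-channel ranges: 99.0 and 198.0
```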