Note
Go to the end to download the full example code.
HMM: Multitaper Spectra#
In this tutorial we calculate subject and state-specific multitaper spectra using a trained HMM and source space data. Here, we should not use the prepared (e.g. TDE-PCA) data.
Download source data#
In this tutorial, we’ll download example data from OSF.
import os
def get_data(name, rename):
os.system(f"osf -p by2tc fetch data/{name}.zip")
os.makedirs(rename, exist_ok=True)
os.system(f"unzip -o {name}.zip -d {rename}")
os.remove(f"{name}.zip")
return f"Data downloaded to: {rename}"
# Download the dataset (approximately 52 MB)
get_data("notts_meguk_giles_5_subjects", rename="source_data")
'Data downloaded to: source_data'
Download inferred parameters#
def get_inf_params(name, rename):
os.system(f"osf -p by2tc fetch inf_params/{name}.zip")
os.makedirs(rename, exist_ok=True)
os.system(f"unzip -o {name}.zip -d {rename}")
os.remove(f"{name}.zip")
return f"Data downloaded to: {rename}"
# Download the dataset (approximately 11 MB)
get_inf_params("tde_hmm_notts_meguk_giles_5_subjects", rename="results/inf_params")
'Data downloaded to: results/inf_params'
Load the source data#
We load the data into osl-dynamics using the Data class. See the Loading Data tutorial for further details.
from osl_dynamics.data import Data
data = Data("source_data")
print(data)
Loading files: 0%| | 0/5 [00:00<?, ?it/s]
Loading files: 100%|██████████| 5/5 [00:00<00:00, 324.20it/s]
Data
id: 139850340990224
n_sessions: 5
n_samples: 371752
n_channels: 38
Trim data#
If we trained on time-delay embedded data we lose a few time points from the start and end of the data. Additionally, when we separate the data into sequences we lose time points from the end of the time series (that do not form a complete sequence). Consequently, we need to trim the source-space data to match the inferred state probabilities (alpha).
trimmed_data = data.trim_time_series(n_embeddings=15, sequence_length=2000) # needs to match values used to prepare the data and build the model
If we didn’t do any time-delay embedding and just need to remove time points lost due to separating into sequences we can use:
trimmed_data = data.trim_time_series(sequence_length=2000)
Load the state probabilities#
To calculate the multitaper spectra, we need the inferred state probabilities.
import pickle
alpha = pickle.load(open("results/inf_params/alp.pkl", "rb"))
alpha is a list of numpy arrays that contains a (n_samples, n_states) time series, which is the state probability at each time point. Each item of the list corresponds to a subject.
Let’s double check the number of time points in inferred state probabilities match the source-space data.
for a, x in zip(alpha, trimmed_data):
print(a.shape, x.shape)
(72000, 8) (72000, 38)
(72000, 8) (72000, 38)
(74000, 8) (74000, 38)
(74000, 8) (74000, 38)
(74000, 8) (74000, 38)
If the first dimension of these arrays don’t match then the wrong value for n_embeddings or sequence_length was used when we trimemd the data.
Calculating Subjects/State Spectra#
Power spectra and coherences#
We want to calculate the power spectrum and coherence of each state. This is done by using standard calculation methods (in our case the multitaper for spectrum estimation) to the time points identified as belonging to a particular state. The analysis.spectral.multitaper_spectra function does this for us. Let’s first run this function, then we’ll discuss its output.
from osl_dynamics.analysis import spectral
# Calculate multitaper spectra for each state and subject (will take a few minutes)
f, psd, coh, w = spectral.multitaper_spectra(
data=trimmed_data,
alpha=alpha,
sampling_frequency=250,
frequency_range=[1, 45],
return_weights=True,
)
Calculating spectra: 0%| | 0/5 [00:00<?, ?it/s]
Calculating spectra: 20%|██ | 1/5 [00:04<00:19, 4.97s/it]
Calculating spectra: 40%|████ | 2/5 [00:09<00:14, 4.84s/it]
Calculating spectra: 60%|██████ | 3/5 [00:14<00:09, 4.93s/it]
Calculating spectra: 80%|████████ | 4/5 [00:19<00:04, 4.98s/it]
Calculating spectra: 100%|██████████| 5/5 [00:24<00:00, 5.03s/it]
Calculating spectra: 100%|██████████| 5/5 [00:24<00:00, 4.99s/it]
To understand the f, psd and coh numpy arrays it is useful to print their shape.
print(f.shape)
print(psd.shape)
print(coh.shape)
print(w.shape)
(89,)
(5, 8, 38, 89)
(5, 8, 38, 38, 89)
(5,)
We can see the f array is 1D, it corresponds to the frequency axis for each spectra. The psd array corresponds to the PSDs and is (subjects, states, channels, frequencies) and the coh array corresponds to the pairwise coherences and is (subjects, states, channels, channels, frequencies). w is an array containing weights for each subject when calculating a group average.
Calculating the spectrum can be time consuming so it is useful to save it as a numpy file, which can be loaded very quickly.
import numpy as np
os.makedirs("results/spectra", exist_ok=True)
np.save("results/spectra/f.npy", f)
np.save("results/spectra/psd.npy", psd)
np.save("results/spectra/coh.npy", coh)
np.save("results/spectra/w.npy", w)
Total running time of the script: (0 minutes 45.167 seconds)