osl_dynamics.utils.sklearn_wrappers

Wrappers for scikit-learn.

Functions

`linear_regression`(X, y, fit_intercept[, normalize, ...])	Wrapper for sklearn.linear_model.LinearRegression.
`fit_gaussian_mixture`(X[, logit_transform, ...])	Fits a two-component Gaussian Mixture Model (GMM).

Module Contents

osl_dynamics.utils.sklearn_wrappers.linear_regression(X, y, fit_intercept, normalize=False, log_message=False, n_jobs=-1)

Wrapper for sklearn.linear_model.LinearRegression.

Parameters:

X (np.ndarray) – Regressors. Should be a 2D array: (n_targets, n_regressors).
y (np.ndarray) – Targets. Should be a 2D array: (n_targets, n_features). If a higher-dimensional array is passed, the extra dimensions are concatenated into a single feature dimension.
fit_intercept (bool) – Should we fit an intercept?
normalize (bool, optional) – Should we z-score the regressors?
log_message (bool, optional) – Should we log a message?
n_jobs (int, optional) – Number of parallel jobs.

Returns:

coefs (np.ndarray) – Regression coefficients. 2D array or higher dimensionality: (n_regressors, n_features).
intercept (np.ndarray) – Regression intercept. 1D array or higher dimensionality: (n_features,). Returned if fit_intercept=True.

Return type:

Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]]
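The shape conventions above can be sketched by calling sklearn.linear_model.LinearRegression directly. This is a minimal illustration under stated assumptions, not the wrapper's actual implementation: the synthetic data, the variable names and the explicit reshape of the extra target dimensions are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Regressors: (n_targets, n_regressors)
n_targets, n_regressors = 200, 3
X = rng.normal(size=(n_targets, n_regressors))

# Noise-free targets with two extra dimensions: (n_targets, 4, 5)
true_coefs = rng.normal(size=(n_regressors, 4, 5))
true_intercept = rng.normal(size=(4, 5))
y = np.einsum("tr,rij->tij", X, true_coefs) + true_intercept

# Assumption: extra dimensions are flattened into one feature axis
extra_shape = y.shape[1:]
y_2d = y.reshape(n_targets, -1)  # (n_targets, n_features)

reg = LinearRegression(fit_intercept=True)
reg.fit(X, y_2d)

# sklearn stores coef_ as (n_features, n_regressors), so transpose it
# and restore the extra dimensions of the targets
coefs = reg.coef_.T.reshape(n_regressors, *extra_shape)
intercept = reg.intercept_.reshape(extra_shape)
```

With osl-dynamics installed, the documented equivalent would be `coefs, intercept = linear_regression(X, y, fit_intercept=True)`.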

osl_dynamics.utils.sklearn_wrappers.fit_gaussian_mixture(X, logit_transform=False, standardize=True, p_value=None, one_component_percentile=None, n_sigma=0, label_order='mean', sklearn_kwargs=None, return_statistics=False, show_plot=False, plot_filename=None, plot_kwargs=None, log_message=True)

Fits a two-component Gaussian Mixture Model (GMM).

Parameters:

X (np.ndarray) – Data to fit GMM to. Must be 1D.
logit_transform (bool, optional) – Should we logit transform X?
standardize (bool, optional) – Should we standardize X?
p_value (float, optional) – Used to determine a threshold. We ensure the data points assigned to the ‘on’ component have a probability of less than p_value of belonging to the ‘off’ component.
one_component_percentile (float, optional) – Percentile threshold if only one component is found. Should be between 0 and 100. E.g. for the 95th percentile, one_component_percentile=95.
n_sigma (float, optional) – Minimum distance, in standard deviations of the ‘off’ component, that the mean of the ‘on’ component must be from the mean of the ‘off’ component for the fit to be considered to have two components.
label_order (str, optional) – How to order the inferred classes. Defaults to 'mean'.
sklearn_kwargs (dict, optional) – Dictionary of keyword arguments to pass to sklearn.mixture.GaussianMixture.
return_statistics (bool, optional) – Should we return statistics of the Gaussian mixture components?
show_plot (bool, optional) – Should we show the GMM fit to the distribution of X?
plot_filename (str, optional) – Filename to save a plot of the Gaussian mixture model.
plot_kwargs (dict, optional) – Keyword arguments to pass to osl_dynamics.utils.plotting.plot_gmm(). Only used if plot_filename is not None.
log_message (bool, optional) – Should we log a message?

Returns:

threshold – Threshold for the on class.

Return type:

float
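The idea of a two-component fit followed by a threshold can be sketched with sklearn.mixture.GaussianMixture directly. This is a hedged illustration, not the code of fit_gaussian_mixture: the synthetic data are hypothetical, and taking the threshold as the largest value assigned to the ‘off’ class is an assumption made here for simplicity.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# 1D data drawn from an 'off' and an 'on' population (synthetic)
X = np.concatenate([rng.normal(0.0, 1.0, 800), rng.normal(6.0, 1.0, 200)])

# Standardize, then reshape to (n_samples, 1) as sklearn expects
X_std = ((X - X.mean()) / X.std()).reshape(-1, 1)

gm = GaussianMixture(n_components=2, random_state=0).fit(X_std)

# Order components by mean so that label 0 is 'off' and label 1 is 'on'
order = np.argsort(gm.means_[:, 0])
rank = np.argsort(order)  # maps sklearn's component index to ordered label
labels = rank[gm.predict(X_std)]

# Assumption: threshold = largest value assigned to the 'off' class,
# expressed in the original (unstandardized) units of X
threshold = float(X[labels == 0].max())
```

With osl-dynamics installed, the documented call would be `threshold = fit_gaussian_mixture(X, standardize=True, label_order='mean')`, whose result may differ from this sketch depending on p_value, n_sigma and the other options.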