brainiak.factoranalysis package

Factor analysis.

Submodules

brainiak.factoranalysis.htfa module

Hierarchical Topographical Factor Analysis (HTFA)

This implementation is based on the work in [Manning2014-1], [Manning2014-2], [AndersonMJ2016], and [Manning2018].

Manning2014-1: “Topographic factor analysis: a bayesian model for inferring brain networks from neural data”, J. R. Manning, R. Ranganath, K. A. Norman, and D. M. Blei. PLoS One, vol. 9, no. 5, 2014.
Manning2014-2: “Hierarchical topographic factor analysis”, Jeremy. R. Manning, R. Ranganath, W. Keung, N. B. Turk-Browne, J. D.Cohen, K. A. Norman, and D. M. Blei. Pattern Recognition in Neuroimaging, 2014 International Workshop on, June 2014.
Manning2018: “A Probabilistic Approach to Discovering Dynamic Full-brain Functional Connectivit Patterns”, J. R. Manning, X. Zhu, T.L. Willke, R. Ranganath, K. Stachenfeld, U. Hasson, D. M. Blei and K. A. Norman. Neuroimage, 2018. https://doi.org/10.1016/j.neuroimage.2018.01.071
AndersonMJ2016: “Enabling Factor Analysis on Thousand-Subject Neuroimaging Datasets”, Michael J. Anderson, Mihai Capotă, Javier S. Turek, Xia Zhu, Theodore L. Willke, Yida Wang, Po-Hsuan Chen, Jeremy R. Manning, Peter J. Ramadge, Kenneth A. Norman, IEEE International Conference on Big Data, 2016. https://doi.org/10.1109/BigData.2016.7840719

class brainiak.factoranalysis.htfa.HTFA(K, n_subj, max_global_iter=10, max_local_iter=10, threshold=0.01, nlss_method='trf', nlss_loss='soft_l1', jac='2-point', x_scale='jac', tr_solver=None, weight_method='rr', upper_ratio=1.8, lower_ratio=0.02, voxel_ratio=0.25, tr_ratio=0.1, max_voxel=5000, max_tr=500, comm=<mpi4py.MPI.Intracomm object>, verbose=False)

Bases: TFA

Hierarchical Topographical Factor Analysis (HTFA)

Given multi-subject data, HTFA factorizes data from each subject as a spatial factor F and a weight matrix W per subject. Also at top level, it estimates global template across subjects:

Parameters

K (int) – Number of factors to compute.
n_subj (int) – Total number of subjects in dataset.
max_global_iter (int, default: 10) – Number of global iterations to run the algorithm.
max_local_iter (int, default: 10) – Number of local iterations to run on each subject within each global interation.
threshold (float, default: 1.0) – Tolerance for terminate the parameter estimation
nlss_method ({'trf', 'dogbox', 'lm'}, default: 'trf') – Non-Linear Least Square (NLSS) algorithm used by scipy.least_suqares to perform minimization. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
nlss_loss (str or callable, default: 'linear') – Loss function used by scipy.least_squares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
jac ({'2-point', '3-point', 'cs', callable}, default: '2-point') – Method of computing the Jacobian matrix. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
x_scale (float or array_like or 'jac', default: 1.0) – Characteristic scale of each variable for scipy.least_suqares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
tr_solver ({None, 'exact', 'lsmr'}, default: None) – Method for solving trust-region subproblems, relevant only for ‘trf’ and ‘dogbox’ methods. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
weight_method ({'rr','ols'}, default: 'rr') – Method for estimating weight matrix W given X and F. ‘rr’ means ridge regression, ‘ols’ means ordinary least square.
upper_ratio (float, default: 1.8) – The upper bound of the ratio between factor’s width and brain diameter.
lower_ratio (float, default: 0.02) – The lower bound of the ratio between factor’s width and brain diameter.
voxel_ratio (float, default: 0.25) – The percentage of voxels to sample in each inner iteration.
tr_ratio (float, default: 0.1) – The percentage of trs to sample in each inner iteration.
max_voxel (int, default: 5000) – The maximum number of voxels to sample in each inner iteration.
max_tr (int, default: 500) – The maximum number of trs to sample in each inner iteration.
comm (Intracomm) – MPI communication group, default MPI.COMM_WORLD
verbose (boolean, default: False) – Verbose mode flag.

global_prior_

The global prior on mean and variance of centers and widths.

Type: 1D array

global_posterior_

The global posterior on mean and variance of centers and widths.

Type: 1D array

local_posterior_

Local posterior on centers and widths of subjects allocated to this process.

Type: 1D array

local_weights_

Local posterior on weights allocated to this process.

Type: 1D array

Notes

We recommend to use data in MNI space to better interpret global template

fit(X, R)

Compute Hierarchical Topographical Factor Analysis Model: [Manning2014-1][Manning2014-2]

Parameters

X (list of 2D arrays, element i has shape=[voxels_i, samples]) – Each element in the list contains the fMRI data of one subject.
R (list of 2D arrays, element i has shape=[n_voxel, n_dim]) – Each element in the list contains the scanner coordinate matrix of fMRI data of one subject.

Returns

Returns the instance itself.

Return type

HTFA

set_fit_request(*, R: Union[bool, None, str] = '$UNCHANGED$') → HTFA

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters: R (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for R parameter in fit.
Returns: self – The updated object.
Return type: object

brainiak.factoranalysis.tfa module

Topographical Factor Analysis (TFA)

This implementation is based on the work in [Manning2014] and [AndersonM2016].

Manning2014: “Topographic factor analysis: a bayesian model for inferring brain networks from neural data”, J. R. Manning, R. Ranganath, K. A. Norman, and D. M. Blei.PLoS One, vol. 9, no. 5, 2014.
AndersonM2016: “Scaling Up Multi-Subject Neuroimaging Factor Analysis” Michael J. Anderson, Mihai Capota, Javier S. Turek, Xia Zhu, Theodore L. Willke, Yida Wang, Po-Hsuan Chen, Jeremy R. Manning, Peter J. Ramadge, and Kenneth A. Norman 2016.

class brainiak.factoranalysis.tfa.TFA(max_iter=10, threshold=1.0, K=50, nlss_method='trf', nlss_loss='soft_l1', jac='2-point', x_scale='jac', tr_solver=None, weight_method='rr', upper_ratio=1.8, lower_ratio=0.02, max_num_tr=500, max_num_voxel=5000, seed=100, verbose=False)

Bases: BaseEstimator

Topographical Factor Analysis (TFA)

Given data from one subject, factorize it as a spatial factor F and a weight matrix W.

Parameters

max_iter (int, default: 10) – Number of iterations to run the algorithm.
threshold (float, default: 1.0) – Tolerance for terminating the parameter estimation
K (int, default: 50) – Number of factors to compute
nlss_method ({'trf', 'dogbox', 'lm'}, default: 'trf') – Non-Linear Least Square (NLSS) algorithm used by scipy.least_suqares to perform minimization. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
nlss_loss (str or callable, default: 'linear') – Loss function used by scipy.least_squares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
jac ({'2-point', '3-point', 'cs', callable}, default: '2-point') – Method of computing the Jacobian matrix. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
x_scale (float or array_like or 'jac', default: 1.0) – Characteristic scale of each variable for scipy.least_suqares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
tr_solver ({None, 'exact', 'lsmr'}, default: None) – Method for solving trust-region subproblems, relevant only for ‘trf’ and ‘dogbox’ methods. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
weight_method ({'rr','ols'}, default: 'rr') – Method for estimating weight matrix W given X and F. ‘rr’ means ridge regression, ‘ols’ means ordinary least square.
upper_ratio (float, default: 1.8) – The upper bound of the ratio between factor’s width and maximum sigma of scanner coordinates.
lower_ratio (float, default: 0.02) – The lower bound of the ratio between factor’s width and maximum sigma of scanner coordinates.
max_num_voxel (int, default: 5000) – The maximum number of voxels to subsample.
max_num_tr (int, default: 500) – The maximum number of TRs to subsample.
seed (int, default: 100) – Seed for subsampling voxels and trs.
verbose (boolean, default: False) – Verbose mode flag.

local_posterior_

Local posterior on subject’s centers and widths

Type: 1D array

F_

Latent factors of the subject

Type: 2D array, in shape [n_voxel, K]

W_

Weight matrix of the subject

Type: 2D array, in shape [K, n_tr]

fit(X, R, template_prior=None)

Topographical Factor Analysis (TFA)[Manning2014]

Parameters

X (2D array, in shape [n_voxel, n_sample]) – The fMRI data of one subject
R (2D array, in shape [n_voxel, n_dim]) – The voxel coordinate matrix of fMRI data
template_prior (None or 1D array) – The template prior as an extra constraint None when fitting TFA alone

get_bounds(R)

Calculate lower and upper bounds for centers and widths

Parameters: R (2D array, with shape [n_voxel, n_dim]) – The coordinate matrix of fMRI data from one subject
Returns: bounds – The lower and upper bounds on factor’s centers and widths.
Return type: 2-tuple of array_like, default: None

get_centers(estimation)

Get estimation on centers

Parameters: estimation (1D arrary) – Either prior of posterior estimation
Returns: centers – Estimation on centers
Return type: 2D array, in shape [K, n_dim]

get_centers_mean_cov(estimation)

Get estimation on the covariance of centers’ mean

Parameters: estimation (1D arrary) – Either prior of posterior estimation
Returns: centers_mean_cov – Estimation of the covariance of centers’ mean
Return type: 2D array, in shape [K, cov_vec_size]

get_factors(unique_R, inds, centers, widths)

Calculate factors based on centers and widths

Parameters

unique_R (a list of array,) – Each element contains unique value in one dimension of scanner coordinate matrix R.
inds (a list of array,) – Each element contains the indices to reconstruct one dimension of original cooridnate matrix from the unique array.
centers (2D array, with shape [K, n_dim]) – The centers of factors.
widths (1D array, with shape [K, 1]) – The widths of factors.

Returns

F – The latent factors from fMRI data.

Return type

2D array, with shape [n_voxel,self.K]

get_map_offset()

Compute offset of prior/posterior

Returns: map_offest – The offset to different fields in prior/posterior
Return type: 1D array

get_template(R)

Compute a template on latent factors

Parameters

R (2D array, in format [n_voxel, n_dim]) – The scanner coordinate matrix of one subject’s fMRI data

Returns

template_prior (1D array) – The template prior.
template_centers_cov (2D array, in shape [n_dim, n_dim]) – The template on centers’ covariance.
template_widths_var (float) – The template on widths’ variance

get_unique_R(R)

Get unique vlaues from coordinate matrix

Parameters

R (2D array) – The coordinate matrix of a subject’s fMRI data

Returns

unique_R (a list of array,) – Each element contains unique value in one dimension of coordinate matrix R.
inds (a list of array,) – Each element contains the indices to reconstruct one dimension of original cooridnate matrix from the unique array.

get_weights(data, F)

Calculate weight matrix based on fMRI data and factors

Parameters

data (2D array, with shape [n_voxel, n_tr]) – fMRI data from one subject
F (2D array, with shape [n_voxel,self.K]) – The latent factors from fMRI data.

Returns

W – The weight matrix from fMRI data.

Return type

2D array, with shape [K, n_tr]

get_widths(estimation)

Get estimation on widths

Parameters: estimation (1D arrary) – Either prior of posterior estimation
Returns: fields – Estimation of widths
Return type: 2D array, in shape [K, 1]

get_widths_mean_var(estimation)

Get estimation on the variance of widths’ mean

Parameters: estimation (1D arrary) – Either prior of posterior estimation
Returns: widths_mean_var – Estimation on variance of widths’ mean
Return type: 2D array, in shape [K, 1]

init_centers_widths(R)

Initialize prior of centers and widths

Returns

centers (2D array, with shape [K, n_dim]) – Prior of factors’ centers.
widths (1D array, with shape [K, 1]) – Prior of factors’ widths.

init_prior(R)

initialize prior for the subject

Returns: Returns the instance itself.
Return type: TFA

set_K(K)

set K for the subject

Parameters: K (integer) – Number of latent factor.
Returns: Returns the instance itself.
Return type: TFA

set_centers(estimation, centers)

Set estimation on centers

Parameters

estimation (1D arrary) – Either prior or posterior estimation
centers (2D array, in shape [K, n_dim]) – Estimation on centers

set_centers_mean_cov(estimation, centers_mean_cov)

Set estimation on centers

Parameters

estimation (1D arrary) – Either prior of posterior estimation
centers (2D array, in shape [K, n_dim]) – Estimation on centers

set_fit_request(*, R: Union[bool, None, str] = '$UNCHANGED$', template_prior: Union[bool, None, str] = '$UNCHANGED$') → TFA

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

R (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for R parameter in fit.
template_prior (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for template_prior parameter in fit.

Returns

self – The updated object.

Return type

object

set_prior(prior)

set prior for the subject

Parameters: prior (1D array, with K*(n_dim+1) elements) – Subject prior of centers and widths.
Returns: Returns the instance itself.
Return type: TFA

set_seed(seed)

set seed for the subject

Parameters: seed (int) – Seed for subsampling voxels and trs
Returns: Returns the instance itself.
Return type: TFA

set_widths(estimation, widths)

Set estimation on widths

Parameters

estimation (1D arrary) – Either prior of posterior estimation
widths (2D array, in shape [K, 1]) – Estimation on widths

set_widths_mean_var(estimation, widths_mean_var)

Set estimation on centers

Parameters

estimation (1D arrary) – Either prior of posterior estimation
centers (2D array, in shape [K, n_dim]) – Estimation on centers