brainiak.factoranalysis package
Factor analysis.
Submodules
brainiak.factoranalysis.htfa module
Hierarchical Topographical Factor Analysis (HTFA)
This implementation is based on the work in [Manning2014-1], [Manning2014-2], [AndersonMJ2016], and [Manning2018].
- Manning2014-1
“Topographic factor analysis: a bayesian model for inferring brain networks from neural data”, J. R. Manning, R. Ranganath, K. A. Norman, and D. M. Blei. PLoS One, vol. 9, no. 5, 2014.
- Manning2014-2
“Hierarchical topographic factor analysis”, Jeremy. R. Manning, R. Ranganath, W. Keung, N. B. Turk-Browne, J. D.Cohen, K. A. Norman, and D. M. Blei. Pattern Recognition in Neuroimaging, 2014 International Workshop on, June 2014.
- Manning2018
“A Probabilistic Approach to Discovering Dynamic Full-brain Functional Connectivit Patterns”, J. R. Manning, X. Zhu, T.L. Willke, R. Ranganath, K. Stachenfeld, U. Hasson, D. M. Blei and K. A. Norman. Neuroimage, 2018. https://doi.org/10.1016/j.neuroimage.2018.01.071
- AndersonMJ2016
“Enabling Factor Analysis on Thousand-Subject Neuroimaging Datasets”, Michael J. Anderson, Mihai Capotă, Javier S. Turek, Xia Zhu, Theodore L. Willke, Yida Wang, Po-Hsuan Chen, Jeremy R. Manning, Peter J. Ramadge, Kenneth A. Norman, IEEE International Conference on Big Data, 2016. https://doi.org/10.1109/BigData.2016.7840719
- class brainiak.factoranalysis.htfa.HTFA(K, n_subj, max_global_iter=10, max_local_iter=10, threshold=0.01, nlss_method='trf', nlss_loss='soft_l1', jac='2-point', x_scale='jac', tr_solver=None, weight_method='rr', upper_ratio=1.8, lower_ratio=0.02, voxel_ratio=0.25, tr_ratio=0.1, max_voxel=5000, max_tr=500, comm=<mpi4py.MPI.Intracomm object>, verbose=False)
Bases:
TFA
Hierarchical Topographical Factor Analysis (HTFA)
Given multi-subject data, HTFA factorizes data from each subject as a spatial factor F and a weight matrix W per subject. Also at top level, it estimates global template across subjects:
- Parameters
K (int) – Number of factors to compute.
n_subj (int) – Total number of subjects in dataset.
max_global_iter (int, default: 10) – Number of global iterations to run the algorithm.
max_local_iter (int, default: 10) – Number of local iterations to run on each subject within each global interation.
threshold (float, default: 1.0) – Tolerance for terminate the parameter estimation
nlss_method ({'trf', 'dogbox', 'lm'}, default: 'trf') – Non-Linear Least Square (NLSS) algorithm used by scipy.least_suqares to perform minimization. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
nlss_loss (str or callable, default: 'linear') – Loss function used by scipy.least_squares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
jac ({'2-point', '3-point', 'cs', callable}, default: '2-point') – Method of computing the Jacobian matrix. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
x_scale (float or array_like or 'jac', default: 1.0) – Characteristic scale of each variable for scipy.least_suqares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
tr_solver ({None, 'exact', 'lsmr'}, default: None) – Method for solving trust-region subproblems, relevant only for ‘trf’ and ‘dogbox’ methods. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
weight_method ({'rr','ols'}, default: 'rr') – Method for estimating weight matrix W given X and F. ‘rr’ means ridge regression, ‘ols’ means ordinary least square.
upper_ratio (float, default: 1.8) – The upper bound of the ratio between factor’s width and brain diameter.
lower_ratio (float, default: 0.02) – The lower bound of the ratio between factor’s width and brain diameter.
voxel_ratio (float, default: 0.25) – The percentage of voxels to sample in each inner iteration.
tr_ratio (float, default: 0.1) – The percentage of trs to sample in each inner iteration.
max_voxel (int, default: 5000) – The maximum number of voxels to sample in each inner iteration.
max_tr (int, default: 500) – The maximum number of trs to sample in each inner iteration.
comm (Intracomm) – MPI communication group, default MPI.COMM_WORLD
verbose (boolean, default: False) – Verbose mode flag.
- global_prior_
The global prior on mean and variance of centers and widths.
- Type
1D array
- global_posterior_
The global posterior on mean and variance of centers and widths.
- Type
1D array
- local_posterior_
Local posterior on centers and widths of subjects allocated to this process.
- Type
1D array
- local_weights_
Local posterior on weights allocated to this process.
- Type
1D array
Notes
We recommend to use data in MNI space to better interpret global template
- fit(X, R)
- Compute Hierarchical Topographical Factor Analysis Model
[Manning2014-1][Manning2014-2]
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, samples]) – Each element in the list contains the fMRI data of one subject.
R (list of 2D arrays, element i has shape=[n_voxel, n_dim]) – Each element in the list contains the scanner coordinate matrix of fMRI data of one subject.
- Returns
Returns the instance itself.
- Return type
- set_fit_request(*, R: Union[bool, None, str] = '$UNCHANGED$') HTFA
Request metadata passed to the
fit
method.Note that this method is only relevant if
enable_metadata_routing=True
(seesklearn.set_config()
). Please see User Guide on how the routing mechanism works.The options for each parameter are:
True
: metadata is requested, and passed tofit
if provided. The request is ignored if metadata is not provided.False
: metadata is not requested and the meta-estimator will not pass it tofit
.None
: metadata is not requested, and the meta-estimator will raise an error if the user provides it.str
: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (
sklearn.utils.metadata_routing.UNCHANGED
) retains the existing request. This allows you to change the request for some parameters and not others.New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a
Pipeline
. Otherwise it has no effect.- Parameters
R (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for
R
parameter infit
.- Returns
self – The updated object.
- Return type
object
brainiak.factoranalysis.tfa module
Topographical Factor Analysis (TFA)
This implementation is based on the work in [Manning2014] and [AndersonM2016].
- Manning2014
“Topographic factor analysis: a bayesian model for inferring brain networks from neural data”, J. R. Manning, R. Ranganath, K. A. Norman, and D. M. Blei.PLoS One, vol. 9, no. 5, 2014.
- AndersonM2016
“Scaling Up Multi-Subject Neuroimaging Factor Analysis” Michael J. Anderson, Mihai Capota, Javier S. Turek, Xia Zhu, Theodore L. Willke, Yida Wang, Po-Hsuan Chen, Jeremy R. Manning, Peter J. Ramadge, and Kenneth A. Norman 2016.
- class brainiak.factoranalysis.tfa.TFA(max_iter=10, threshold=1.0, K=50, nlss_method='trf', nlss_loss='soft_l1', jac='2-point', x_scale='jac', tr_solver=None, weight_method='rr', upper_ratio=1.8, lower_ratio=0.02, max_num_tr=500, max_num_voxel=5000, seed=100, verbose=False)
Bases:
BaseEstimator
Topographical Factor Analysis (TFA)
Given data from one subject, factorize it as a spatial factor F and a weight matrix W.
- Parameters
max_iter (int, default: 10) – Number of iterations to run the algorithm.
threshold (float, default: 1.0) – Tolerance for terminating the parameter estimation
K (int, default: 50) – Number of factors to compute
nlss_method ({'trf', 'dogbox', 'lm'}, default: 'trf') – Non-Linear Least Square (NLSS) algorithm used by scipy.least_suqares to perform minimization. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
nlss_loss (str or callable, default: 'linear') – Loss function used by scipy.least_squares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
jac ({'2-point', '3-point', 'cs', callable}, default: '2-point') – Method of computing the Jacobian matrix. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
x_scale (float or array_like or 'jac', default: 1.0) – Characteristic scale of each variable for scipy.least_suqares. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
tr_solver ({None, 'exact', 'lsmr'}, default: None) – Method for solving trust-region subproblems, relevant only for ‘trf’ and ‘dogbox’ methods. More information at http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.optimize.least_squares.html
weight_method ({'rr','ols'}, default: 'rr') – Method for estimating weight matrix W given X and F. ‘rr’ means ridge regression, ‘ols’ means ordinary least square.
upper_ratio (float, default: 1.8) – The upper bound of the ratio between factor’s width and maximum sigma of scanner coordinates.
lower_ratio (float, default: 0.02) – The lower bound of the ratio between factor’s width and maximum sigma of scanner coordinates.
max_num_voxel (int, default: 5000) – The maximum number of voxels to subsample.
max_num_tr (int, default: 500) – The maximum number of TRs to subsample.
seed (int, default: 100) – Seed for subsampling voxels and trs.
verbose (boolean, default: False) – Verbose mode flag.
- local_posterior_
Local posterior on subject’s centers and widths
- Type
1D array
- F_
Latent factors of the subject
- Type
2D array, in shape [n_voxel, K]
- W_
Weight matrix of the subject
- Type
2D array, in shape [K, n_tr]
- fit(X, R, template_prior=None)
Topographical Factor Analysis (TFA)[Manning2014]
- Parameters
X (2D array, in shape [n_voxel, n_sample]) – The fMRI data of one subject
R (2D array, in shape [n_voxel, n_dim]) – The voxel coordinate matrix of fMRI data
template_prior (None or 1D array) – The template prior as an extra constraint None when fitting TFA alone
- get_bounds(R)
Calculate lower and upper bounds for centers and widths
- Parameters
R (2D array, with shape [n_voxel, n_dim]) – The coordinate matrix of fMRI data from one subject
- Returns
bounds – The lower and upper bounds on factor’s centers and widths.
- Return type
2-tuple of array_like, default: None
- get_centers(estimation)
Get estimation on centers
- Parameters
estimation (1D arrary) – Either prior of posterior estimation
- Returns
centers – Estimation on centers
- Return type
2D array, in shape [K, n_dim]
- get_centers_mean_cov(estimation)
Get estimation on the covariance of centers’ mean
- Parameters
estimation (1D arrary) – Either prior of posterior estimation
- Returns
centers_mean_cov – Estimation of the covariance of centers’ mean
- Return type
2D array, in shape [K, cov_vec_size]
- get_factors(unique_R, inds, centers, widths)
Calculate factors based on centers and widths
- Parameters
unique_R (a list of array,) – Each element contains unique value in one dimension of scanner coordinate matrix R.
inds (a list of array,) – Each element contains the indices to reconstruct one dimension of original cooridnate matrix from the unique array.
centers (2D array, with shape [K, n_dim]) – The centers of factors.
widths (1D array, with shape [K, 1]) – The widths of factors.
- Returns
F – The latent factors from fMRI data.
- Return type
2D array, with shape [n_voxel,self.K]
- get_map_offset()
Compute offset of prior/posterior
- Returns
map_offest – The offset to different fields in prior/posterior
- Return type
1D array
- get_template(R)
Compute a template on latent factors
- Parameters
R (2D array, in format [n_voxel, n_dim]) – The scanner coordinate matrix of one subject’s fMRI data
- Returns
template_prior (1D array) – The template prior.
template_centers_cov (2D array, in shape [n_dim, n_dim]) – The template on centers’ covariance.
template_widths_var (float) – The template on widths’ variance
- get_unique_R(R)
Get unique vlaues from coordinate matrix
- Parameters
R (2D array) – The coordinate matrix of a subject’s fMRI data
- Returns
unique_R (a list of array,) – Each element contains unique value in one dimension of coordinate matrix R.
inds (a list of array,) – Each element contains the indices to reconstruct one dimension of original cooridnate matrix from the unique array.
- get_weights(data, F)
Calculate weight matrix based on fMRI data and factors
- Parameters
data (2D array, with shape [n_voxel, n_tr]) – fMRI data from one subject
F (2D array, with shape [n_voxel,self.K]) – The latent factors from fMRI data.
- Returns
W – The weight matrix from fMRI data.
- Return type
2D array, with shape [K, n_tr]
- get_widths(estimation)
Get estimation on widths
- Parameters
estimation (1D arrary) – Either prior of posterior estimation
- Returns
fields – Estimation of widths
- Return type
2D array, in shape [K, 1]
- get_widths_mean_var(estimation)
Get estimation on the variance of widths’ mean
- Parameters
estimation (1D arrary) – Either prior of posterior estimation
- Returns
widths_mean_var – Estimation on variance of widths’ mean
- Return type
2D array, in shape [K, 1]
- init_centers_widths(R)
Initialize prior of centers and widths
- Returns
centers (2D array, with shape [K, n_dim]) – Prior of factors’ centers.
widths (1D array, with shape [K, 1]) – Prior of factors’ widths.
- init_prior(R)
initialize prior for the subject
- Returns
Returns the instance itself.
- Return type
- set_K(K)
set K for the subject
- Parameters
K (integer) – Number of latent factor.
- Returns
Returns the instance itself.
- Return type
- set_centers(estimation, centers)
Set estimation on centers
- Parameters
estimation (1D arrary) – Either prior or posterior estimation
centers (2D array, in shape [K, n_dim]) – Estimation on centers
- set_centers_mean_cov(estimation, centers_mean_cov)
Set estimation on centers
- Parameters
estimation (1D arrary) – Either prior of posterior estimation
centers (2D array, in shape [K, n_dim]) – Estimation on centers
- set_fit_request(*, R: Union[bool, None, str] = '$UNCHANGED$', template_prior: Union[bool, None, str] = '$UNCHANGED$') TFA
Request metadata passed to the
fit
method.Note that this method is only relevant if
enable_metadata_routing=True
(seesklearn.set_config()
). Please see User Guide on how the routing mechanism works.The options for each parameter are:
True
: metadata is requested, and passed tofit
if provided. The request is ignored if metadata is not provided.False
: metadata is not requested and the meta-estimator will not pass it tofit
.None
: metadata is not requested, and the meta-estimator will raise an error if the user provides it.str
: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (
sklearn.utils.metadata_routing.UNCHANGED
) retains the existing request. This allows you to change the request for some parameters and not others.New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a
Pipeline
. Otherwise it has no effect.- Parameters
R (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for
R
parameter infit
.template_prior (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for
template_prior
parameter infit
.
- Returns
self – The updated object.
- Return type
object
- set_prior(prior)
set prior for the subject
- Parameters
prior (1D array, with K*(n_dim+1) elements) – Subject prior of centers and widths.
- Returns
Returns the instance itself.
- Return type
- set_seed(seed)
set seed for the subject
- Parameters
seed (int) – Seed for subsampling voxels and trs
- Returns
Returns the instance itself.
- Return type
- set_widths(estimation, widths)
Set estimation on widths
- Parameters
estimation (1D arrary) – Either prior of posterior estimation
widths (2D array, in shape [K, 1]) – Estimation on widths
- set_widths_mean_var(estimation, widths_mean_var)
Set estimation on centers
- Parameters
estimation (1D arrary) – Either prior of posterior estimation
centers (2D array, in shape [K, n_dim]) – Estimation on centers