brainiak.funcalign package¶
Functional alignment of volumes from different subjects.
Submodules¶
brainiak.funcalign.fastsrm module¶
Fast Shared Response Model (FastSRM)
The implementation is based on the following publications:
- Richard2019
“Fast Shared Response Model for fMRI data” H. Richard, L. Martin, A. Pinho, J. Pillow, B. Thirion, 2019 https://arxiv.org/pdf/1909.12537.pdf
-
class
brainiak.funcalign.fastsrm.
FastSRM
(atlas=None, n_components=20, n_iter=100, temp_dir=None, low_ram=False, seed=None, n_jobs=1, verbose='warn', aggregate='mean')¶ Bases:
sklearn.base.BaseEstimator
,sklearn.base.TransformerMixin
SRM decomposition using a very low amount of memory and computational power thanks to the use of an atlas as described in [Richard2019].
Given multi-subject data, factorize it as a shared response S among all subjects and an orthogonal transform (basis) W per subject:
\[X_i \approx W_i S, \forall i=1 \dots N\]
- Parameters
atlas (array, shape=[n_supervoxels, n_voxels] or array,shape=[n_voxels] or str or None, default=None) – Probabilistic or deterministic atlas on which to project the data. Deterministic atlas is an array of shape [n_voxels,] where values range from 1 to n_supervoxels. Voxels labelled 0 will be ignored. If atlas is a str the corresponding array is loaded with numpy.load and expected shape is (n_voxels,) for a deterministic atlas and (n_supervoxels, n_voxels) for a probabilistic atlas.
n_components (int) – Number of timecourses of the shared coordinates
n_iter (int) – Number of iterations to perform
temp_dir (str or None) – Path to dir where temporary results are stored. If None temporary results will be stored in memory. This can result in memory errors when the number of subjects and/or sessions is large
low_ram (bool) – If True and temp_dir is not None, reduced_data will be saved on disk. This increases the number of IO operations but reduces memory complexity when the number of subjects and/or sessions is large
seed (int) – Seed used for random sampling.
n_jobs (int, optional, default=1) – The number of CPUs to use to do the computation. -1 means all CPUs, -2 all CPUs but one, and so on.
verbose (bool or "warn") – If True, logs are enabled. If False, logs are disabled. If “warn” only warnings are printed.
aggregate (str or None, default="mean") – If “mean”, shared_response is the mean shared response from all subjects. If None, shared_response contains all subject-specific responses in shared space
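The deterministic atlas format described above can be illustrated with a minimal NumPy sketch. It assumes that projecting onto a deterministic atlas means averaging the voxels that share a label; `reduce_with_atlas` is a hypothetical helper written for this illustration, not a brainiak function.

```python
import numpy as np

def reduce_with_atlas(data, atlas):
    """Average voxel timecourses within each supervoxel of a deterministic atlas.

    data:  array of shape [n_voxels, n_timeframes]
    atlas: int array of shape [n_voxels]; labels 1..n_supervoxels, 0 is ignored
    Returns an array of shape [n_supervoxels, n_timeframes].
    """
    labels = np.unique(atlas[atlas > 0])  # drop the "ignored" label 0
    return np.stack([data[atlas == label].mean(axis=0) for label in labels])

# Six voxels, two timeframes; voxel 3 is labelled 0 and therefore ignored.
atlas = np.array([1, 1, 2, 0, 2, 2])
data = np.arange(12, dtype=float).reshape(6, 2)
reduced = reduce_with_atlas(data, atlas)
print(reduced.shape)  # (2, 2)
```

Working on supervoxel timecourses instead of voxel timecourses is what keeps the memory footprint small when n_supervoxels is much lower than n_voxels.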
-
`basis_list`
if basis is a list of array, element i is the basis of subject i
if basis is a list of str, element i is the path to the basis of subject i that is loaded with np.load yielding an array of shape [n_components, n_voxels].
Note that any call to the clean method erases this attribute
- Type
list of array, element i has shape=[n_components, n_voxels] or list of str
Note
References: H. Richard, L. Martin, A. Pinho, J. Pillow, B. Thirion, 2019: Fast shared response model for fMRI data (https://arxiv.org/pdf/1909.12537.pdf)
-
add_subjects
(imgs, shared_response)¶ Add subjects to the current fit. Each new basis is appended to the end of the list of bases (accessible through self.basis_list)
- Parameters
imgs (array of str, shape=[n_subjects, n_sessions] or list of list of arrays or list of arrays) –
Element i, j of the array is a path to the data of subject i collected during session j. Data are loaded with numpy.load and the expected shape is [n_voxels, n_timeframes]. n_timeframes and n_voxels are assumed to be the same across subjects; n_timeframes can vary across sessions. Each voxel’s timecourse is assumed to have mean 0 and variance 1
imgs can also be a list of list of arrays where element i, j of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i collected during session j.
imgs can also be a list of arrays where element i of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i (number of sessions is implicitly 1)
shared_response (list of arrays, list of list of arrays or array) –
if imgs is a list of array and self.aggregate="mean": shared response is an array of shape (n_components, n_timeframes)
if imgs is a list of array and self.aggregate=None: shared response is a list of array, element i is the projection of data of subject i in shared space.
if imgs is an array or a list of list of array and self.aggregate="mean": shared response is a list of array, element j is the shared response during session j
if imgs is an array or a list of list of array and self.aggregate=None: shared response is a list of list of array, element i, j is the projection of data of subject i collected during session j in shared space.
-
clean
()¶ Erases temporary files and the basis_list attribute to free memory. Call this method when the fitted model is no longer needed.
-
fit
(imgs)¶ Computes basis across subjects from input imgs
- Parameters
imgs (array of str, shape=[n_subjects, n_sessions] or list of list of arrays or list of arrays) –
Element i, j of the array is a path to the data of subject i collected during session j. Data are loaded with numpy.load and the expected shape is [n_voxels, n_timeframes]. n_timeframes and n_voxels are assumed to be the same across subjects; n_timeframes can vary across sessions. Each voxel’s timecourse is assumed to have mean 0 and variance 1
imgs can also be a list of list of arrays where element i, j of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i collected during session j.
imgs can also be a list of arrays where element i of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i (number of sessions is implicitly 1)
- Returns
self – Returns the instance itself. Contains attributes listed at the object level.
- Return type
object
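Because each voxel's timecourse is assumed to have mean 0 and variance 1, inputs usually need standardizing before they are passed to fit. The following is a minimal sketch of one way to do that with NumPy; `zscore_voxels` is a hypothetical helper, and the handling of constant voxels is an assumption of this example.

```python
import numpy as np

def zscore_voxels(data):
    """Standardize each voxel's timecourse (each row) to mean 0 and variance 1."""
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant voxels become all zeros instead of NaN
    return (data - mean) / std

rng = np.random.default_rng(0)
# "List of arrays" input: two subjects, one session each,
# every array of shape [n_voxels, n_timeframes].
imgs = [zscore_voxels(rng.normal(loc=5.0, scale=3.0, size=(30, 40)))
        for _ in range(2)]
```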
-
fit_transform
(imgs, subjects_indexes=None)¶ Computes the basis across subjects and the shared response from input imgs, then returns the shared response.
- Parameters
imgs (array of str, shape=[n_subjects, n_sessions] or list of list of arrays or list of arrays) –
Element i, j of the array is a path to the data of subject i collected during session j. Data are loaded with numpy.load and the expected shape is [n_voxels, n_timeframes]. n_timeframes and n_voxels are assumed to be the same across subjects; n_timeframes can vary across sessions. Each voxel’s timecourse is assumed to have mean 0 and variance 1
imgs can also be a list of list of arrays where element i, j of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i collected during session j.
imgs can also be a list of arrays where element i of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i (number of sessions is implicitly 1)
subjects_indexes (list or None) – if None imgs[i] will be transformed using basis_list[i]. Otherwise imgs[i] will be transformed using basis_list[subjects_indexes[i]]
- Returns
shared_response –
if imgs is a list of array and self.aggregate="mean": shared response is an array of shape (n_components, n_timeframes)
if imgs is a list of array and self.aggregate=None: shared response is a list of array, element i is the projection of data of subject i in shared space.
if imgs is an array or a list of list of array and self.aggregate="mean": shared response is a list of array, element j is the shared response during session j
if imgs is an array or a list of list of array and self.aggregate=None: shared response is a list of list of array, element i, j is the projection of data of subject i collected during session j in shared space.
- Return type
list of arrays, list of list of arrays or array
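The difference between the two aggregate settings can be sketched with plain NumPy. This is an illustration of the shapes involved, not the FastSRM code path: it assumes noiseless data and bases with orthonormal rows of shape [n_components, n_voxels], and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_components, n_voxels, n_timeframes = 3, 4, 20, 15

# Hypothetical bases with orthonormal rows, shape [n_components, n_voxels].
basis_list = [
    np.linalg.qr(rng.normal(size=(n_voxels, n_components)))[0].T
    for _ in range(n_subjects)
]

# Noiseless single-session data: each subject sees the same shared response.
shared = rng.normal(size=(n_components, n_timeframes))
imgs = [basis.T @ shared for basis in basis_list]

# Analogue of aggregate=None: one projection per subject.
projections = [basis @ img for basis, img in zip(basis_list, imgs)]

# Analogue of aggregate="mean": a single array averaged over subjects.
mean_response = np.mean(projections, axis=0)
print(mean_response.shape)  # (4, 15)
```

With real, noisy data the per-subject projections differ from one another, which is why averaging them gives a more stable estimate of the shared response.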
-
inverse_transform
(shared_response, subjects_indexes=None, sessions_indexes=None)¶ Reconstructs subjects’ data from the shared response and the bases learned on the training data
- Parameters
shared_response (list of arrays, list of list of arrays or array) –
if imgs is a list of array and self.aggregate="mean": shared response is an array of shape (n_components, n_timeframes)
if imgs is a list of array and self.aggregate=None: shared response is a list of array, element i is the projection of data of subject i in shared space.
if imgs is an array or a list of list of array and self.aggregate="mean": shared response is a list of array, element j is the shared response during session j
if imgs is an array or a list of list of array and self.aggregate=None: shared response is a list of list of array, element i, j is the projection of data of subject i collected during session j in shared space.
subjects_indexes (list or None) – if None reconstructs data of all subjects used during train. Otherwise reconstructs data of subjects specified by subjects_indexes.
sessions_indexes (list or None) – if None reconstructs data of all sessions. Otherwise reconstructs data of sessions specified by sessions_indexes.
- Returns
reconstructed_data –
if reconstructed_data is a list of list: element i, j is the reconstructed data for subject subjects_indexes[i] and session sessions_indexes[j] as an np array of shape [n_voxels, n_timeframes]
if reconstructed_data is a list: element i is the reconstructed data for subject subjects_indexes[i] as an np array of shape [n_voxels, n_timeframes]
- Return type
list of list of arrays or list of arrays
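The indexing of the nested return value can be sketched as follows. This is a hedged illustration of the reconstruction X_i ≈ W_i S for selected subjects and sessions; `reconstruct` is a hypothetical function, and the bases are assumed to have shape [n_components, n_voxels] as in basis_list.

```python
import numpy as np

def reconstruct(basis_list, shared_response,
                subjects_indexes=None, sessions_indexes=None):
    """Return a list of lists: element [i][j] is the reconstruction for
    subject subjects_indexes[i] and session sessions_indexes[j]."""
    if subjects_indexes is None:
        subjects_indexes = range(len(basis_list))
    if sessions_indexes is None:
        sessions_indexes = range(len(shared_response))
    return [
        [basis_list[i].T @ shared_response[j] for j in sessions_indexes]
        for i in subjects_indexes
    ]

rng = np.random.default_rng(0)
# Four hypothetical bases, shape [n_components=3, n_voxels=12].
basis_list = [np.linalg.qr(rng.normal(size=(12, 3)))[0].T for _ in range(4)]
# Two sessions with different numbers of timeframes.
shared_response = [rng.normal(size=(3, 10)), rng.normal(size=(3, 7))]

reconstructed = reconstruct(basis_list, shared_response,
                            subjects_indexes=[0, 2], sessions_indexes=[1])
print(reconstructed[1][0].shape)  # (12, 7)
```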
-
transform
(imgs, subjects_indexes=None)¶ From data in imgs and basis from training data, computes shared response.
- Parameters
imgs (array of str, shape=[n_subjects, n_sessions] or list of list of arrays or list of arrays) –
Element i, j of the array is a path to the data of subject i collected during session j. Data are loaded with numpy.load and the expected shape is [n_voxels, n_timeframes]. n_timeframes and n_voxels are assumed to be the same across subjects; n_timeframes can vary across sessions. Each voxel’s timecourse is assumed to have mean 0 and variance 1
imgs can also be a list of list of arrays where element i, j of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i collected during session j.
imgs can also be a list of arrays where element i of the array is a numpy array of shape [n_voxels, n_timeframes] that contains the data of subject i (number of sessions is implicitly 1)
subjects_indexes (list or None) – if None imgs[i] will be transformed using basis_list[i]. Otherwise imgs[i] will be transformed using basis_list[subjects_indexes[i]]
- Returns
shared_response –
if imgs is a list of array and self.aggregate="mean": shared response is an array of shape (n_components, n_timeframes)
if imgs is a list of array and self.aggregate=None: shared response is a list of array, element i is the projection of data of subject i in shared space.
if imgs is an array or a list of list of array and self.aggregate="mean": shared response is a list of array, element j is the shared response during session j
if imgs is an array or a list of list of array and self.aggregate=None: shared response is a list of list of array, element i, j is the projection of data of subject i collected during session j in shared space.
- Return type
list of arrays, list of list of arrays or array
brainiak.funcalign.rsrm module¶
Robust Shared Response Model (RSRM)
The implementation is based on the following publications:
- Turek2017(1,2)
“Capturing Shared and Individual Information in fMRI Data”, J. Turek, C. Ellis, L. Skalaban, N. Turk-Browne, T. Willke, under review, 2017.
-
class
brainiak.funcalign.rsrm.
RSRM
(n_iter=10, features=50, gamma=1.0, rand_seed=0)¶ Bases:
sklearn.base.BaseEstimator
,sklearn.base.TransformerMixin
Robust Shared Response Model (RSRM)
Given multi-subject data, factorize it as a shared response R among all subjects, an orthogonal transform W per subject, and an individual (outlying) sparse component S per subject:
\[X_i \approx W_i R + S_i, \forall i=1 \dots N\]
This unsupervised model learns idiosyncratic information for each subject while simultaneously improving the estimate of the shared response. The model has similar properties to the Shared Response Model (SRM) with the addition of the individual components.
The model is estimated solving the following optimization problem:
\[\min_{W_i, S_i, R} \sum_i \left( \frac{1}{2}\|X_i - W_i R - S_i\|_F^2 + \gamma\|S_i\|_1 \right) \qquad \text{s.t.} \quad W_i^T W_i = I \quad \forall i=1 \dots N\]
The solution to this problem is obtained by applying a Block-Coordinate Descent procedure. More details can be found in [Turek2017].
- Parameters
n_iter (int, default: 10) – Number of iterations to run the algorithm.
features (int, default: 50) – Number of features to compute.
gamma (float, default: 1.0) – Regularization parameter for the sparseness of the individual components. Higher values yield sparser individual components.
rand_seed (int, default: 0) – Seed for initializing the random number generator.
-
w_
¶ The orthogonal transforms (mappings) for each subject.
- Type
list of array, element i has shape=[voxels_i, features]
-
r_
¶ The shared response.
- Type
array, shape=[features, timepoints]
-
s_
¶ The individual components for each subject.
- Type
list of array, element i has shape=[voxels_i, timepoints]
-
random_state_
¶ Random number generator initialized using rand_seed
- Type
RandomState
Note
The number of voxels may be different between subjects. However, the number of timepoints for the alignment data must be the same across subjects.
The Robust Shared Response Model is approximated using the Block-Coordinate Descent (BCD) algorithm proposed in [Turek2017].
This is a single node version.
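To make the role of gamma concrete, the sketch below evaluates the RSRM objective and applies soft-thresholding, which is the closed-form minimizer of the objective over S_i when W_i and R are held fixed. It is an illustration under that assumption, not a copy of the library's update code; `soft_threshold` and `rsrm_objective` are hypothetical names.

```python
import numpy as np

def soft_threshold(values, gamma):
    """Shrink every entry toward zero by gamma; entries below gamma become 0."""
    return np.sign(values) * np.maximum(np.abs(values) - gamma, 0.0)

def rsrm_objective(X, W, R, S, gamma):
    """Sum over subjects of 0.5 * ||X_i - W_i R - S_i||_F^2 + gamma * ||S_i||_1."""
    total = 0.0
    for x_i, w_i, s_i in zip(X, W, S):
        residual = x_i - w_i @ R - s_i
        total += 0.5 * np.sum(residual ** 2) + gamma * np.sum(np.abs(s_i))
    return total

rng = np.random.default_rng(0)
gamma = 0.5
X = [rng.normal(size=(8, 6)) for _ in range(2)]                   # [voxels, timepoints]
W = [np.linalg.qr(rng.normal(size=(8, 3)))[0] for _ in range(2)]  # orthonormal columns
R = rng.normal(size=(3, 6))                                       # [features, timepoints]

S_zero = [np.zeros_like(x_i) for x_i in X]
S_new = [soft_threshold(x_i - w_i @ R, gamma) for x_i, w_i in zip(X, W)]

print(rsrm_objective(X, W, R, S_new, gamma)
      <= rsrm_objective(X, W, R, S_zero, gamma))  # True
```

A larger gamma zeroes more entries of each S_i, which is the sense in which higher values yield sparser individual components.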
-
fit
(X)¶ Compute the Robust Shared Response Model
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, timepoints]) – Each element in the list contains the fMRI data of one subject.
-
transform
(X)¶ Use the model to transform new data to Shared Response space
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, timepoints_i]) – Each element in the list contains the fMRI data of one subject.
- Returns
r (list of 2D arrays, element i has shape=[features_i, timepoints_i]) – Shared responses from input data (X)
s (list of 2D arrays, element i has shape=[voxels_i, timepoints_i]) – Individual data obtained from fitting model to input data (X)
-
transform_subject
(X)¶ Transform a new subject using the existing model
- Parameters
X (2D array, shape=[voxels, timepoints]) – The fMRI data of the new subject.
- Returns
w (2D array, shape=[voxels, features]) – Orthogonal mapping W_{new} for new subject
s (2D array, shape=[voxels, timepoints]) – Individual term S_{new} for new subject
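One plausible way to obtain both outputs for a new subject, given a fixed shared response R, is to alternate an orthogonal Procrustes step for the mapping with a soft-thresholding step for the individual term. The sketch below illustrates that idea only; `fit_new_subject`, its initialization, and its iteration count are assumptions, not the library's routine.

```python
import numpy as np

def fit_new_subject(X, R, gamma, n_iter=10):
    """Alternate updates of an orthogonal mapping W and a sparse term S
    so that X is approximated by W @ R + S, with R kept fixed.

    X: [voxels, timepoints], R: [features, timepoints], voxels >= features.
    """
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # Procrustes step: the orthonormal W that best maps R onto X - S.
        u, _sv, vt = np.linalg.svd((X - S) @ R.T, full_matrices=False)
        W = u @ vt
        # Shrinkage step: sparse residual left unexplained by W @ R.
        residual = X - W @ R
        S = np.sign(residual) * np.maximum(np.abs(residual) - gamma, 0.0)
    return W, S

rng = np.random.default_rng(0)
R = rng.normal(size=(4, 12))       # shared response from a fitted model
X_new = rng.normal(size=(20, 12))  # data of the new subject
W_new, S_new = fit_new_subject(X_new, R, gamma=0.5)
print(W_new.shape, S_new.shape)  # (20, 4) (20, 12)
```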
brainiak.funcalign.srm module¶
Shared Response Model (SRM)
The implementations are based on the following publications:
- Chen2015(1,2)
“A Reduced-Dimension fMRI Shared Response Model”, P.-H. Chen, J. Chen, Y. Yeshurun-Dishon, U. Hasson, J. Haxby, P. Ramadge, Advances in Neural Information Processing Systems (NIPS), 2015. http://papers.nips.cc/paper/5855-a-reduced-dimension-fmri-shared-response-model
- Anderson2016
“Enabling Factor Analysis on Thousand-Subject Neuroimaging Datasets”, Michael J. Anderson, Mihai Capotă, Javier S. Turek, Xia Zhu, Theodore L. Willke, Yida Wang, Po-Hsuan Chen, Jeremy R. Manning, Peter J. Ramadge, Kenneth A. Norman, IEEE International Conference on Big Data, 2016. https://doi.org/10.1109/BigData.2016.7840719
-
class
brainiak.funcalign.srm.
DetSRM
(n_iter=10, features=50, rand_seed=0)¶ Bases:
sklearn.base.BaseEstimator
,sklearn.base.TransformerMixin
Deterministic Shared Response Model (DetSRM)
Given multi-subject data, factorize it as a shared response S among all subjects and an orthogonal transform W per subject:
\[X_i \approx W_i S, \forall i=1 \dots N\]
- Parameters
n_iter (int, default: 10) – Number of iterations to run the algorithm.
features (int, default: 50) – Number of features to compute.
rand_seed (int, default: 0) – Seed for initializing the random number generator.
-
w_
¶ The orthogonal transforms (mappings) for each subject.
- Type
list of array, element i has shape=[voxels_i, features]
-
s_
¶ The shared response.
- Type
array, shape=[features, samples]
-
random_state_
¶ Random number generator initialized using rand_seed
- Type
RandomState
Note
The number of voxels may be different between subjects. However, the number of samples must be the same across subjects.
The Deterministic Shared Response Model is approximated using the Block Coordinate Descent (BCD) algorithm proposed in [Chen2015].
This is a single node version.
The run-time complexity is \(O(I (V T K + V K^2))\) and the memory complexity is \(O(V T)\) with I - the number of iterations, V - the sum of voxels from all subjects, T - the number of samples, K - the number of features (typically, \(V \gg T \gg K\)), and N - the number of subjects.
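The block coordinate descent idea behind the deterministic model can be sketched in a few lines of NumPy: with S fixed, each W_i is an orthogonal Procrustes solution, and with every W_i fixed, S is the average of the projections W_i^T X_i. This is a simplified illustration with an assumed random initialization; `det_srm_bcd` and `srm_error` are hypothetical names, not brainiak functions.

```python
import numpy as np

def det_srm_bcd(X, features, n_iter=10, seed=0):
    """Alternate between updating the mappings W_i and the shared response S.

    X: list of arrays, element i has shape [voxels_i, samples].
    """
    rng = np.random.default_rng(seed)
    # Random orthonormal starting point for every subject.
    W = [np.linalg.qr(rng.normal(size=(x.shape[0], features)))[0] for x in X]
    S = np.mean([w.T @ x for w, x in zip(W, X)], axis=0)
    for _ in range(n_iter):
        W = []
        for x in X:
            # Procrustes: orthonormal W_i minimizing ||X_i - W_i S||_F.
            u, _sv, vt = np.linalg.svd(x @ S.T, full_matrices=False)
            W.append(u @ vt)
        # With orthonormal W_i, the best S is the mean projection.
        S = np.mean([w.T @ x for w, x in zip(W, X)], axis=0)
    return W, S

def srm_error(X, W, S):
    """Total squared reconstruction error over subjects."""
    return sum(np.sum((x - w @ S) ** 2) for x, w in zip(X, W))

rng = np.random.default_rng(1)
S_true = rng.normal(size=(3, 50))
# Three subjects with different voxel counts but the same number of samples.
X = [np.linalg.qr(rng.normal(size=(v, 3)))[0] @ S_true
     + 0.1 * rng.normal(size=(v, 50)) for v in (30, 25, 40)]
W, S = det_srm_bcd(X, features=3, n_iter=20)
print(S.shape)  # (3, 50)
```

Each iteration costs one small SVD per subject on a [voxels_i, features] matrix plus the products with X_i, which is consistent with the V T K term dominating the stated run-time complexity.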
-
fit
(X, y=None)¶ Compute the Deterministic Shared Response Model
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, samples]) – Each element in the list contains the fMRI data of one subject.
y (not used) –
-
transform
(X, y=None)¶ Use the model to transform data to the Shared Response subspace
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, samples_i]) – Each element in the list contains the fMRI data of one subject.
y (not used) –
- Returns
s – Shared responses from input data (X)
- Return type
list of 2D arrays, element i has shape=[features_i, samples_i]
-
transform_subject
(X)¶ Transform a new subject using the existing model. The subject is assumed to have received equivalent stimulation
- Parameters
X (2D array, shape=[voxels, timepoints]) – The fMRI data of the new subject.
- Returns
w – Orthogonal mapping W_{new} for new subject
- Return type
2D array, shape=[voxels, features]
-
class
brainiak.funcalign.srm.
SRM
(n_iter=10, features=50, rand_seed=0, comm=<mpi4py.MPI.Intracomm object>)¶ Bases:
sklearn.base.BaseEstimator
,sklearn.base.TransformerMixin
Probabilistic Shared Response Model (SRM)
Given multi-subject data, factorize it as a shared response S among all subjects and an orthogonal transform W per subject:
\[X_i \approx W_i S, \forall i=1 \dots N\]
- Parameters
n_iter (int, default: 10) – Number of iterations to run the algorithm.
features (int, default: 50) – Number of features to compute.
rand_seed (int, default: 0) – Seed for initializing the random number generator.
comm (mpi4py.MPI.Intracomm) – The MPI communicator containing the data
-
w_
¶ The orthogonal transforms (mappings) for each subject.
- Type
list of array, element i has shape=[voxels_i, features]
-
s_
¶ The shared response.
- Type
array, shape=[features, samples]
-
sigma_s_
¶ The covariance of the shared response Normal distribution.
- Type
array, shape=[features, features]
-
mu_
¶ The voxel means over the samples for each subject.
- Type
list of array, element i has shape=[voxels_i]
-
rho2_
¶ The estimated noise variance \(\rho_i^2\) for each subject
- Type
array, shape=[subjects]
-
comm
¶ The MPI communicator containing the data
- Type
mpi4py.MPI.Intracomm
-
random_state_
¶ Random number generator initialized using rand_seed
- Type
RandomState
Note
The number of voxels may be different between subjects. However, the number of samples must be the same across subjects.
The probabilistic Shared Response Model is approximated using the Expectation Maximization (EM) algorithm proposed in [Chen2015]. The implementation follows the optimizations published in [Anderson2016].
This is a single node version.
The run-time complexity is \(O(I (V T K + V K^2 + K^3))\) and the memory complexity is \(O(V T)\) with I - the number of iterations, V - the sum of voxels from all subjects, T - the number of samples, and K - the number of features (typically, \(V \gg T \gg K\)).
-
fit
(X, y=None)¶ Compute the probabilistic Shared Response Model
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, samples]) – Each element in the list contains the fMRI data of one subject.
y (not used) –
-
save
(file)¶ Save fitted SRM to .npz file.
- Parameters
file (str, file-like object, or pathlib.Path) – Filename (string), open file (file-like object) or pathlib.Path where the fitted SRM will be saved. If file is a string or a Path, the .npz extension will be appended to the filename if it is not already there.
- Returns
- Return type
None
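The extension rule described for the file argument matches the behaviour of numpy.savez, which the short sketch below demonstrates. The arrays and key names used here are illustrative only; which arrays a fitted SRM actually stores is not stated in this section.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical fitted quantities; the key names are made up for this example.
w_0 = rng.normal(size=(10, 3))
s = rng.normal(size=(3, 8))

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "fitted_srm")  # no extension given
    np.savez(target, w_0=w_0, s=s)
    written = sorted(os.listdir(tmp_dir))
    print(written)  # ['fitted_srm.npz']
    # Read one array back to confirm the round trip.
    with np.load(target + ".npz") as archive:
        restored_s = archive["s"]
```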
-
transform
(X, y=None)¶ Use the model to transform matrix to Shared Response space
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, samples_i]) – Each element in the list contains the fMRI data of one subject. Note that the number of voxels and samples can vary across subjects
y (not used (as it is unsupervised learning)) –
- Returns
s – Shared responses from input data (X)
- Return type
list of 2D arrays, element i has shape=[features_i, samples_i]
-
transform_subject
(X)¶ Transform a new subject using the existing model. The subject is assumed to have received equivalent stimulation
- Parameters
X (2D array, shape=[voxels, timepoints]) – The fMRI data of the new subject.
- Returns
w – Orthogonal mapping W_{new} for new subject
- Return type
2D array, shape=[voxels, features]
brainiak.funcalign.sssrm module¶
Semi-Supervised Shared Response Model (SS-SRM)
The implementations are based on the following publications:
- Turek2016(1,2)
“A Semi-Supervised Method for Multi-Subject fMRI Functional Alignment”, J. S. Turek, T. L. Willke, P.-H. Chen, P. J. Ramadge, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 1098-1102. https://doi.org/10.1109/ICASSP.2017.7952326
-
class
brainiak.funcalign.sssrm.
SSSRM
(n_iter=10, features=50, gamma=1.0, alpha=0.5, rand_seed=0)¶ Bases:
sklearn.base.BaseEstimator
,sklearn.base.ClassifierMixin
,sklearn.base.TransformerMixin
Semi-Supervised Shared Response Model (SS-SRM)
Given multi-subject data, factorize it as a shared response S among all subjects and an orthogonal transform W per subject, using also labeled data to train a Multinomial Logistic Regression (MLR) classifier (with l2 regularization) in a semi-supervised manner:
(1)¶ \[(1-\alpha) Loss_{SRM}(W_i,S;X_i) + \alpha/\gamma Loss_{MLR}(\theta, bias; \{(W_i^T \times Z_i, y_i)\}) + R(\theta)\]
(see Equations (1) and (4) in [Turek2016]).
- Parameters
n_iter (int, default: 10) – Number of iterations to run the algorithm.
features (int, default: 50) – Number of features to compute.
gamma (float, default: 1.0) – Regularization parameter for the classifier.
alpha (float, default: 0.5) – Balance parameter between the SRM term and the MLR term.
rand_seed (int, default: 0) – Seed for initializing the random number generator.
-
w_
¶ The orthogonal transforms (mappings) for each subject.
- Type
list of array, element i has shape=[voxels_i, features]
-
s_
¶ The shared response.
- Type
array, shape=[features, samples]
-
theta_
¶ The MLR class plane parameters.
- Type
array, shape=[classes, features]
-
bias_
¶ The MLR class biases.
- Type
array, shape=[classes]
-
classes_
¶ Mapping table from each class index to its original class label.
- Type
array of int, shape=[classes]
-
random_state_
¶ Random number generator initialized using rand_seed
- Type
RandomState
Note
The number of voxels may be different between subjects. However, the number of samples for the alignment data must be the same across subjects. The number of labeled samples per subject can be different.
The Semi-Supervised Shared Response Model is approximated using the Block-Coordinate Descent (BCD) algorithm proposed in [Turek2016].
This is a single node version.
-
fit
(X, y, Z)¶ Compute the Semi-Supervised Shared Response Model
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, n_align]) – Each element in the list contains the fMRI data for alignment of one subject. There are n_align samples for each subject.
y (list of arrays of int, element i has shape=[samples_i]) – Each element in the list contains the labels for the data samples in Z.
Z (list of 2D arrays, element i has shape=[voxels_i, samples_i]) – Each element in the list contains the fMRI data of one subject for training the MLR classifier.
-
predict
(X)¶ Classify the output for given data
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, samples_i]) – Each element in the list contains the fMRI data of one subject. The number of voxels for each subject must match the number used for that subject when the model was trained.
- Returns
p – Predictions for each data sample.
- Return type
list of arrays, element i has shape=[samples_i]
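Given the documented shapes of w_, theta_, bias_ and classes_, prediction with a linear multinomial classifier in shared space can be sketched as below. Since softmax is monotonic, the most probable class is the one with the highest linear score. This is an illustration under those assumptions; `predict_labels` is a hypothetical function and the toy parameters are invented.

```python
import numpy as np

def predict_labels(X, W, theta, bias, classes):
    """Project each subject into shared space and pick the best-scoring class.

    X[i]: [voxels_i, samples_i], W[i]: [voxels_i, features],
    theta: [classes, features], bias: [classes], classes: [classes].
    """
    predictions = []
    for x_i, w_i in zip(X, W):
        shared = w_i.T @ x_i                     # [features, samples_i]
        scores = theta @ shared + bias[:, None]  # [classes, samples_i]
        predictions.append(classes[np.argmax(scores, axis=0)])
    return predictions

# Toy model: one subject, 3 voxels, 2 features, 2 classes labelled 10 and 20.
W = [np.eye(3)[:, :2]]
theta = np.array([[1.0, 0.0], [0.0, 1.0]])
bias = np.zeros(2)
classes = np.array([10, 20])
X = [np.array([[2.0, 0.0], [0.0, 3.0], [5.0, 5.0]])]

p = predict_labels(X, W, theta, bias, classes)
print(p[0])  # [10 20]
```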
-
transform
(X, y=None)¶ Use the model to transform matrix to Shared Response space
- Parameters
X (list of 2D arrays, element i has shape=[voxels_i, samples_i]) – Each element in the list contains the fMRI data of one subject. Note that the number of voxels and samples can vary across subjects.
y (not used as it only applies the mappings) –
- Returns
s – Shared responses from input data (X)
- Return type
list of 2D arrays, element i has shape=[features_i, samples_i]