brainiak.matnormal package¶
Some properties of the matrix-variate normal distribution¶
The matrix-variate normal distribution is a generalization of the normal distribution to matrices. Another name for it is the multivariate normal distribution with Kronecker-separable covariance. The distributional intuition is as follows: if \(X \sim \mathcal{MN}(M,R,C)\), then \(\mathrm{vec}(X)\sim\mathcal{N}(\mathrm{vec}(M), C \otimes R)\), where \(\mathrm{vec}(\cdot)\) is the vectorization operator and \(\otimes\) is the Kronecker product. If we think of \(X\) as a TRs-by-voxels matrix in the fMRI setting, then this model assumes that every voxel has the same TR-by-TR covariance structure (represented by the matrix \(R\)) and every volume has the same spatial covariance (represented by the matrix \(C\)). This assumption allows us to model the two covariances separately. We can further assume that the spatial covariance is itself Kronecker-structured, which implies that the spatial covariance of voxels is the same in the X, Y and Z dimensions.
The log-likelihood for the matrix-normal density is:
\[\log p(X \mid M, R, C) = -\frac{1}{2}\left[mn \log(2\pi) + n \log|R| + m \log|C| + \operatorname{tr}\left(C^{-1}(X-M)^T R^{-1} (X-M)\right)\right]\]
Here \(X\) and \(M\) are both \(m\times n\) matrices, \(R\) is \(m\times m\) and \(C\) is \(n\times n\).
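As a quick numerical sanity check of the vectorization identity and the log-likelihood above (illustrative only; it uses SciPy rather than anything in brainiak.matnormal):

```python
import numpy as np
from scipy.stats import matrix_normal, multivariate_normal

rng = np.random.default_rng(0)
m, n = 4, 3
M = rng.standard_normal((m, n))
# random positive-definite row (R) and column (C) covariances
A = rng.standard_normal((m, m)); R = A @ A.T + m * np.eye(m)
B = rng.standard_normal((n, n)); C = B @ B.T + n * np.eye(n)
X = rng.standard_normal((m, n))

vec = lambda Z: Z.flatten(order="F")  # stack columns, matching C ⊗ R
lp_matnorm = matrix_normal(mean=M, rowcov=R, colcov=C).logpdf(X)
lp_vec = multivariate_normal(mean=vec(M), cov=np.kron(C, R)).logpdf(vec(X))
assert np.isclose(lp_matnorm, lp_vec)
```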
The brainiak.matnormal package provides infrastructure for fitting models, useful for fMRI analysis, that can be stated in matrix-normal notation. It exposes a few interfaces. MatnormModelBase is intended as a base class for matrix-variate models: it wraps the tensorflow optimizer with convergence checks based on thresholds on the function value and gradient, along with simple verbose output. The package also defines an interface for noise covariances (CovBase). Any class that follows this interface can be used as a noise covariance in any of the matrix-normal models, and the package includes a variety of noise covariances to work with (a sketch of a custom covariance follows this paragraph).
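To illustrate the CovBase interface, here is a minimal sketch of a custom isotropic covariance. The class name is hypothetical, and it assumes CovBase stores its size argument as self.size; consult the CovBase source if that attribute differs.

```python
import tensorflow as tf
from brainiak.matnormal.covs import CovBase

class CovIsotropicSketch(CovBase):
    """Hypothetical sigma^2 * I covariance, written against the CovBase
    interface (get_optimize_vars, logdet, solve) described above."""

    def __init__(self, size):
        super().__init__(size)
        # optimize log(sigma^2) so the variance stays positive
        self.log_var = tf.Variable(0.0, dtype=tf.float64, name="log_var")

    def get_optimize_vars(self):
        # tf variables the optimizer should fit
        return [self.log_var]

    @property
    def logdet(self):
        # log|sigma^2 I| = size * log(sigma^2); assumes self.size is set
        return self.size * self.log_var

    def solve(self, X):
        # Sigma^{-1} X = X / sigma^2
        return X / tf.exp(self.log_var)
```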
Matrix normal marginals¶
Here we extend the multivariate Gaussian marginalization identity to matrix normals. This identity is used in a number of the models in the package. Below, we use lowercase subscripts for sizes to make dimensionalities easier to track; uppercase subscripts on covariances keep track of where they come from.
We vectorize, and convert to a form we recognize as \(y \sim \mathcal{N}(Mx+b, \Sigma)\).
Now we can use our standard Gaussian marginalization identity:
Collect terms using the mixed-product property of Kronecker products:
Now we can see that the marginal density is itself a matrix-variate normal only if \(\Sigma_{\mathbf{Z}_k}= \Sigma_{\mathbf{Y}_k}\) – that is, only if the variable we are marginalizing over has the same covariance as the marginal density in the dimension we are not marginalizing over. Otherwise the density is still well-defined, but its covariance retains the full Kronecker structure. So we let \(\Sigma_k:=\Sigma_{\mathbf{Z}_k}= \Sigma_{\mathbf{Y}_k}\), factor, and transform back into a matrix normal:
We can do this in the other direction as well, because if \(X \sim \mathcal{MN}(M, U, V)\) then \(X^T \sim \mathcal{MN}(M^T, V, U)\):
These marginal likelihoods are implemented relatively efficiently in MatnormModelBase.matnorm_logp_marginal_row and MatnormModelBase.matnorm_logp_marginal_col.
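The key algebraic step above, the mixed-product property, is easy to verify numerically. The following NumPy sketch (not package code) checks that \((I \otimes A)(C \otimes Q)(I \otimes A)^T = C \otimes (AQA^T)\), which is what lets the marginal covariance stay Kronecker-structured:

```python
import numpy as np

rng = np.random.default_rng(1)
t, k, v = 4, 3, 2
A = rng.standard_normal((t, k))
Q = rng.standard_normal((k, k)); Q = Q @ Q.T + k * np.eye(k)
C = rng.standard_normal((v, v)); C = C @ C.T + v * np.eye(v)

# mixed-product property of Kronecker products
lhs = np.kron(np.eye(v), A) @ np.kron(C, Q) @ np.kron(np.eye(v), A).T
rhs = np.kron(C, A @ Q @ A.T)
assert np.allclose(lhs, rhs)
```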
Partitioned matrix normal conditionals¶
Here we extend the multivariate Gaussian conditional identity to matrix normals. This identity is used for prediction in some models. As before, we use lowercase subscripts for sizes and uppercase subscripts on covariances to keep track of where they come from.
First, we write the two vectorized matrix normals that form our partition:
We apply the standard partitioned Gaussian identity and simplify using the properties of the \(\mathrm{vec}(\cdot)\) operator and the mixed-product property of Kronecker products:
Next, we recognize that this multivariate Gaussian is equivalent to the following matrix-variate Gaussian:
The conditional in the other direction can be written by working through the same algebra:
Finally, vertical rather than horizontal concatenation (yielding a partitioned row rather than column covariance) can be handled by recognizing the behavior of the matrix normal under transposition:
These conditional likelihoods are implemented relatively efficiently in MatnormModelBase.matnorm_logp_conditional_row and MatnormModelBase.matnorm_logp_conditional_col.
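The Schur-complement form of the conditional covariance can be checked the same way. This NumPy sketch (not package code) confirms that conditioning the Kronecker-structured joint covariance factors \(\Sigma_i\) back out:

```python
import numpy as np

rng = np.random.default_rng(2)
ni, nj, nk = 3, 2, 2

def randpd(d):
    W = rng.standard_normal((d, d))
    return W @ W.T + d * np.eye(d)

Si = randpd(ni)
S = randpd(nj + nk)  # joint covariance over the [j, k] blocks
Sj, Sjk, Sk = S[:nj, :nj], S[:nj, nj:], S[nj:, nj:]

# partitioned-Gaussian conditional covariance of vec(X) given vec(Y) ...
lhs = (np.kron(Sj, Si)
       - np.kron(Sjk, Si) @ np.linalg.inv(np.kron(Sk, Si)) @ np.kron(Sjk.T, Si))
# ... equals the matrix-normal form (Sj - Sjk Sk^{-1} Skj) ⊗ Si
rhs = np.kron(Sj - Sjk @ np.linalg.inv(Sk) @ Sjk.T, Si)
assert np.allclose(lhs, rhs)
```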
Submodules¶
brainiak.matnormal.covs module¶
- class brainiak.matnormal.covs.CovAR1(size, rho=None, sigma=None, scan_onsets=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  AR(1) covariance, parameterized by the autoregressive parameter rho and the innovation ("new") noise sigma.
  - Parameters
    size (int) – size of the covariance matrix
    rho (float or None) – initial value of the autoregressive parameter (if None, initialize randomly)
    sigma (float or None) – initial value of the innovation noise parameter (if None, initialize randomly)
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
- class brainiak.matnormal.covs.CovBase(size)¶
  Bases: abc.ABC
  Base class for residual covariances. For more on abstract base classes, see https://docs.python.org/3/library/abc.html
  - Parameters
    size (int) – The size of the covariance matrix.
  - abstract get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - abstract solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
- class brainiak.matnormal.covs.CovDiagonal(size, diag_var=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Uncorrelated (diagonal) noise covariance.
  - Parameters
    size (int) – size of the covariance matrix
    diag_var (float or None) – initial value of the (diagonal) variance vector (if None, initialize randomly)
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
- class brainiak.matnormal.covs.CovDiagonalGammaPrior(size, sigma=None, alpha=1.5, beta=1e-10)¶
  Bases: brainiak.matnormal.covs.CovDiagonal
  Uncorrelated (diagonal) noise covariance, regularized by a Gamma(alpha, beta) prior.
- class brainiak.matnormal.covs.CovIdentity(size)¶
  Bases: brainiak.matnormal.covs.CovBase
  Identity noise covariance.
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
- class brainiak.matnormal.covs.CovIsotropic(size, var=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Scaled identity (isotropic) noise covariance.
  - Parameters
    size (int) – size of the covariance matrix
    var (float or None) – initial value of the variance parameter (if None, initialize randomly)
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
- class brainiak.matnormal.covs.CovKroneckerFactored(sizes, Sigmas=None, mask=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Kronecker-product noise covariance, parameterized in terms of the Cholesky factors of its components.
  - property L¶
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    \(\log|\Sigma|\), computed using the diagonals of the Cholesky factors.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\) using triangular solves with the Cholesky factors. Specifically, we solve \(L L^T x = y\) by solving \(L z = y\) and then \(L^T x = z\) (see the sketch after this entry).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
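The two-triangular-solve pattern used by solve is shown below with NumPy/SciPy in place of the package's tensorflow ops (illustrative only):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(3)
W = rng.standard_normal((4, 4))
Sigma = W @ W.T + 4 * np.eye(4)   # positive-definite covariance
L = np.linalg.cholesky(Sigma)     # Sigma = L L^T
y = rng.standard_normal(4)

z = solve_triangular(L, y, lower=True)     # solve L z = y
x = solve_triangular(L.T, z, lower=False)  # solve L^T x = z
assert np.allclose(Sigma @ x, y)           # so x = Sigma^{-1} y
```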
- class brainiak.matnormal.covs.CovUnconstrainedCholesky(size=None, Sigma=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Unconstrained noise covariance, parameterized in terms of its Cholesky factor.
  - property L¶
    Cholesky factor of this covariance.
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\) (using a Cholesky solve).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
- class brainiak.matnormal.covs.CovUnconstrainedCholeskyWishartReg(size, Sigma=None)¶
  Bases: brainiak.matnormal.covs.CovUnconstrainedCholesky
  Unconstrained noise covariance, parameterized in terms of its Cholesky factor and regularized using the trick from Chung et al. (2015), so that the likelihood goes to 0 as the covariance approaches singularity.
  References
  Chung, Y., Gelman, A., Rabe-Hesketh, S., Liu, J., & Dorie, V. (2015). Weakly Informative Prior for Point Estimation of Covariance Matrices in Hierarchical Models. Journal of Educational and Behavioral Statistics, 40(2), 136–157. https://doi.org/10.3102/1076998615570945
- class brainiak.matnormal.covs.CovUnconstrainedInvCholesky(size=None, invSigma=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Unconstrained noise covariance, parameterized in terms of the Cholesky factor of its precision. Prefer this over the regular Cholesky parameterization unless you have a good reason not to, since it saves a Cholesky solve on every step of optimization.
  - property Linv¶
    Inverse of the Cholesky factor of this covariance.
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\) (using a Cholesky solve).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
brainiak.matnormal.matnormal_likelihoods module¶
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp(x, row_cov, col_cov)¶
  Log likelihood for a centered matrix-variate normal density. Assumes that row_cov and col_cov follow the API defined in CovBase.
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_conditional_col(x, row_cov, col_cov, cond, cond_cov)¶
  Log likelihood for a centered conditional matrix-variate normal density.
  Consider the following partitioned matrix-normal density:
  \[\begin{split}\begin{bmatrix} \operatorname{vec}\left[\mathbf{X}_{i j}\right] \\ \operatorname{vec}\left[\mathbf{Y}_{i k}\right]\end{bmatrix} \sim \mathcal{N}\left(0,\begin{bmatrix} \Sigma_{j} \otimes \Sigma_{i} & \Sigma_{j k} \otimes \Sigma_{i}\\ \Sigma_{k j} \otimes \Sigma_{i} & \Sigma_{k} \otimes \Sigma_{i} \end{bmatrix}\right)\end{split}\]
  Then we can write the conditional:
  \[\mathbf{X}_{i j} \mid \mathbf{Y}_{i k} \sim \mathcal{MN}\left(0, \Sigma_{i}, \Sigma_{j}-\Sigma_{j k} \Sigma_{k}^{-1} \Sigma_{k j}\right)\]
  This function efficiently computes the conditional by unpacking some information from the covariance classes and then dispatching to solve_det_conditional.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance (\(\Sigma_{i}\) in the notation above).
    col_cov (CovBase) – Column covariance (\(\Sigma_{j}\) in the notation above).
    cond (tf.Tensor) – Off-diagonal block of the partitioned covariance (\(\Sigma_{jk}\) in the notation above).
    cond_cov (CovBase) – Covariance of the conditioning variable (\(\Sigma_{k}\) in the notation above).
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_conditional_row(x, row_cov, col_cov, cond, cond_cov)¶
  Log likelihood for a centered conditional matrix-variate normal density.
  Consider the following partitioned matrix-normal density:
  \[\begin{split}\begin{bmatrix} \operatorname{vec}\left[\mathbf{X}_{i j}\right] \\ \operatorname{vec}\left[\mathbf{Y}_{i k}\right]\end{bmatrix} \sim \mathcal{N}\left(0,\begin{bmatrix} \Sigma_{j} \otimes \Sigma_{i} & \Sigma_{j k} \otimes \Sigma_{i}\\ \Sigma_{k j} \otimes \Sigma_{i} & \Sigma_{k} \otimes \Sigma_{i} \end{bmatrix}\right)\end{split}\]
  Then we can write the conditional:
  \[\mathbf{X}^T_{j i} \mid \mathbf{Y}^T_{k i} \sim \mathcal{MN}\left(0, \Sigma_{j}-\Sigma_{j k} \Sigma_{k}^{-1} \Sigma_{k j}, \Sigma_{i}\right)\]
  This function efficiently computes the conditional by unpacking some information from the covariance classes and then dispatching to solve_det_conditional.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance (\(\Sigma_{i}\) in the notation above).
    col_cov (CovBase) – Column covariance (\(\Sigma_{j}\) in the notation above).
    cond (tf.Tensor) – Off-diagonal block of the partitioned covariance (\(\Sigma_{jk}\) in the notation above).
    cond_cov (CovBase) – Covariance of the conditioning variable (\(\Sigma_{k}\) in the notation above).
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_marginal_col(x, row_cov, col_cov, marg, marg_cov)¶
  Log likelihood for a centered marginal matrix-variate normal density:
  \[\begin{aligned}X &\sim \mathcal{MN}(0, R, Q)\\ Y \mid X &\sim \mathcal{MN}(XA, R, C)\\ Y &\sim \mathcal{MN}(0, R, C + A^T Q A)\end{aligned}\]
  This function efficiently computes the marginal by unpacking some information from the covariance classes and then dispatching to solve_det_marginal.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance implementing the CovBase API (\(R\) above).
    col_cov (CovBase) – Column covariance implementing the CovBase API (\(C\) above).
    marg (tf.Tensor) – Marginal factor (\(A\) above).
    marg_cov (CovBase) – Prior covariance implementing the CovBase API (\(Q\) above).
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_marginal_row(x, row_cov, col_cov, marg, marg_cov)¶
  Log likelihood for a centered marginal matrix-variate normal density:
  \[\begin{aligned}X &\sim \mathcal{MN}(0, Q, C)\\ Y \mid X &\sim \mathcal{MN}(AX, R, C)\\ Y &\sim \mathcal{MN}(0, R + A Q A^T, C)\end{aligned}\]
  This function efficiently computes the marginal by unpacking some information from the covariance classes and then dispatching to solve_det_marginal.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance implementing the CovBase API (\(R\) above).
    col_cov (CovBase) – Column covariance implementing the CovBase API (\(C\) above).
    marg (tf.Tensor) – Marginal factor (\(A\) above).
    marg_cov (CovBase) – Prior covariance implementing the CovBase API (\(Q\) above).
- brainiak.matnormal.matnormal_likelihoods.solve_det_conditional(x, sigma, A, Q)¶
  Use the matrix inversion lemma for the solve:
  \[(\Sigma - AQ^{-1}A^T)^{-1} X = (\Sigma^{-1} + \Sigma^{-1} A (Q - A^T \Sigma^{-1} A)^{-1} A^T \Sigma^{-1}) X\]
  Use the matrix determinant lemma for the determinant (both lemmas are checked numerically in the sketch after this entry):
  \[\log|\Sigma - AQ^{-1}A^T| = \log|Q - A^T \Sigma^{-1} A| - \log|Q| + \log|\Sigma|\]
  - Parameters
    x (tf.Tensor) – Tensor to multiply the solve by
    sigma (brainiak.matnormal.CovBase) – Covariance object implementing solve and logdet
    A (tf.Tensor) – Factor multiplying the variable we condition on
    Q (brainiak.matnormal.CovBase) – Covariance object of the conditioning variable, implementing solve and logdet
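A NumPy sketch of the check (not package code), with A scaled down so that \(\Sigma - AQ^{-1}A^T\) stays positive definite:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 5, 3

def randpd(d):
    W = rng.standard_normal((d, d))
    return W @ W.T + d * np.eye(d)

Sigma, Q = randpd(n), randpd(p)
A = 0.1 * rng.standard_normal((n, p))  # small, so Sigma - A Q^{-1} A^T is PD
X = rng.standard_normal((n, 2))
Si = np.linalg.inv(Sigma)

# matrix inversion lemma
lhs = np.linalg.solve(Sigma - A @ np.linalg.inv(Q) @ A.T, X)
rhs = (Si + Si @ A @ np.linalg.inv(Q - A.T @ Si @ A) @ A.T @ Si) @ X
assert np.allclose(lhs, rhs)

# matrix determinant lemma
ld = np.linalg.slogdet(Sigma - A @ np.linalg.inv(Q) @ A.T)[1]
ld2 = (np.linalg.slogdet(Q - A.T @ Si @ A)[1]
       - np.linalg.slogdet(Q)[1] + np.linalg.slogdet(Sigma)[1])
assert np.isclose(ld, ld2)
```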
- brainiak.matnormal.matnormal_likelihoods.solve_det_marginal(x, sigma, A, Q)¶
  Use the matrix inversion lemma for the solve:
  \[(\Sigma + AQA^T)^{-1} X = (\Sigma^{-1} - \Sigma^{-1} A (Q^{-1} + A^T \Sigma^{-1} A)^{-1} A^T \Sigma^{-1}) X\]
  Use the matrix determinant lemma for the determinant (both lemmas are checked numerically in the sketch after this entry):
  \[\log|\Sigma + AQA^T| = \log|Q^{-1} + A^T \Sigma^{-1} A| + \log|Q| + \log|\Sigma|\]
  - Parameters
    x (tf.Tensor) – Tensor to multiply the solve by
    sigma (brainiak.matnormal.CovBase) – Covariance object implementing solve and logdet
    A (tf.Tensor) – Factor multiplying the variable we marginalized out
    Q (brainiak.matnormal.CovBase) – Covariance object of the marginalized variable, implementing solve and logdet
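The same check for the marginal (Woodbury) case, again as a NumPy sketch rather than package code:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 5, 3

def randpd(d):
    W = rng.standard_normal((d, d))
    return W @ W.T + d * np.eye(d)

Sigma, Q = randpd(n), randpd(p)
A = rng.standard_normal((n, p))
X = rng.standard_normal((n, 2))
Si, Qi = np.linalg.inv(Sigma), np.linalg.inv(Q)

# matrix inversion (Woodbury) lemma
lhs = np.linalg.solve(Sigma + A @ Q @ A.T, X)
rhs = (Si - Si @ A @ np.linalg.inv(Qi + A.T @ Si @ A) @ A.T @ Si) @ X
assert np.allclose(lhs, rhs)

# matrix determinant lemma
ld = np.linalg.slogdet(Sigma + A @ Q @ A.T)[1]
ld2 = (np.linalg.slogdet(Qi + A.T @ Si @ A)[1]
       + np.linalg.slogdet(Q)[1] + np.linalg.slogdet(Sigma)[1])
assert np.isclose(ld, ld2)
```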
brainiak.matnormal.mnrsa module¶
- class brainiak.matnormal.mnrsa.MNRSA(time_cov, space_cov, n_nureg=5, optimizer='L-BFGS-B', optCtrl=None)¶
  Bases: sklearn.base.BaseEstimator
  Matrix-normal version of RSA.
  The goal of this analysis is to find the covariance of the mapping from some design matrix X to the fMRI signal Y. It does so by marginalizing over the actual mapping (i.e., averaging over the uncertainty in it), which happens to correct a bias that structure in the design matrix imposes on the RSA estimate (see Cai et al., NIPS 2016).
  This implementation makes different choices about the residual covariance relative to brainiak.reprsimil.BRSA: here, the noise covariance is assumed to be Kronecker-separable. Informally, this means that all voxels have the same temporal covariance and all time points have the same spatial covariance. This is in contrast to BRSA, which allows a different temporal covariance for each voxel. On the other hand, the computational efficiencies enabled by this choice allow MNRSA to support a richer class of space and time covariances (anything in brainiak.matnormal.covs).
  For users: in general, if you are worried about each voxel having a different temporal noise structure, you should use brainiak.reprsimil.BRSA. If you are worried about between-voxel correlations or temporal covariance structures that BRSA does not support, you should use MNRSA. A usage sketch follows the method listing below.
  \[\begin{aligned}Y &\sim \mathcal{MN}(0, \Sigma_t + XLL^TX^T + X_0X_0^T, \Sigma_s)\\ U &= LL^T\end{aligned}\]
  - Parameters
    time_cov (subclass of CovBase) – Temporal noise covariance class following the CovBase interface.
    space_cov (subclass of CovBase) – Spatial noise covariance class following the CovBase interface.
    n_nureg (int, default: 5) – Number of nuisance regressors (\(X_0\) above).
    optimizer (string, default: 'L-BFGS-B') – Name of the scipy optimizer to use.
    optCtrl (dict, default: None) – Additional arguments to pass to scipy.optimize.minimize.
  - property L¶
    Cholesky factor of the RSA matrix.
  - fit(X, y, naive_init=True)¶
    Estimate the MNRSA model parameters.
    - Parameters
      X (2d array) – Brain data matrix (TRs by voxels); Y in the math above.
      y (2d array or vector) – Behavior data matrix (TRs by behavioral observations); X in the math above.
      naive_init (bool, default=True) – If True, initialize the RSA matrix from the naive estimate.
  - logp(X, Y)¶
    MNRSA log-likelihood.
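A minimal usage sketch with synthetic data (sizes are arbitrary, the noise-only data is a placeholder, and identity covariances are chosen only to keep the example small):

```python
import numpy as np
from brainiak.matnormal.covs import CovIdentity
from brainiak.matnormal.mnrsa import MNRSA

rng = np.random.default_rng(6)
n_tr, n_vox, n_cond = 60, 20, 4
design = rng.standard_normal((n_tr, n_cond))  # X in the math above
brain = rng.standard_normal((n_tr, n_vox))    # Y in the math above

model = MNRSA(time_cov=CovIdentity(size=n_tr),
              space_cov=CovIdentity(size=n_vox))
# per the fit docstring above: brain data first, then the design
model.fit(brain, design)
L = model.L  # Cholesky factor of the RSA matrix, U = L L^T
```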
brainiak.matnormal.regression module¶
- class brainiak.matnormal.regression.MatnormalRegression(time_cov, space_cov, optimizer='L-BFGS-B', optCtrl=None)¶
  Bases: sklearn.base.BaseEstimator
  This analysis allows maximum likelihood estimation of regression models in the presence of both spatial and temporal covariance:
  \[Y \sim \mathcal{MN}(X\beta, \Sigma_t, \Sigma_s)\]
  where \(\Sigma_t\) is time_cov and \(\Sigma_s\) is space_cov. A usage sketch follows the method listing below.
- Parameters
time_cov (subclass of CovBase) – TR noise covariance class following CovBase interface.
space_cov (subclass of CovBase) – Voxel noise covariance class following CovBase interface.
optimizer (string, default="L-BFGS-B") – Scipy optimizer to use. For other options, see “method” argument of scipy.optimize.minimize
optCtrl (dict, default=None) – Additional arguments to pass to scipy.optimize.minimize.
  - calibrate(Y)¶
    Decode the design matrix from an fMRI dataset, based on a previously trained mapping. This method just does naive MLE:
    \[X = Y \Sigma_s^{-1}B^T(B \Sigma_s^{-1} B^T)^{-1}\]
    - Parameters
      Y (np.array, TRs by voxels) – fMRI dataset
  - fit(X, y, naive_init=True)¶
    Compute the regression fit.
    - Parameters
      X (np.array, TRs by conditions) – Design matrix
      y (np.array, TRs by voxels) – fMRI data
  - logp(X, Y)¶
    Log likelihood of the model (internal).
  - predict(X)¶
    Predict the fMRI signal from a design matrix.
    - Parameters
      X (np.array, TRs by conditions) – Design matrix
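A minimal usage sketch with synthetic data (sizes arbitrary; identity covariances keep the example small):

```python
import numpy as np
from brainiak.matnormal.covs import CovIdentity
from brainiak.matnormal.regression import MatnormalRegression

rng = np.random.default_rng(7)
n_tr, n_vox, n_cond = 80, 30, 3
X = rng.standard_normal((n_tr, n_cond))      # design matrix
beta = rng.standard_normal((n_cond, n_vox))  # ground-truth coefficients
Y = X @ beta + 0.1 * rng.standard_normal((n_tr, n_vox))

model = MatnormalRegression(time_cov=CovIdentity(size=n_tr),
                            space_cov=CovIdentity(size=n_vox))
model.fit(X, Y)
Y_hat = model.predict(X)    # predicted signal, TRs by voxels
X_hat = model.calibrate(Y)  # decode the design back from the data
```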
brainiak.matnormal.utils module¶
- brainiak.matnormal.utils.flatten_cholesky_unique(L)¶
  Flattens the nonzero elements of a Cholesky (triangular) factor into a vector, taking the log of the diagonal to make the parameterization unique. Inverse of unflatten_cholesky_unique.
- brainiak.matnormal.utils.make_val_and_grad(lossfn, train_vars)¶
  Makes a function that outputs the loss and gradient in a format compatible with scipy.optimize.minimize.
- brainiak.matnormal.utils.pack_trainable_vars(trainable_vars)¶
  Pack the trainable variables in a model into a single vector that can be passed to scipy.optimize.
- brainiak.matnormal.utils.rmn(rowcov, colcov)¶
  Generate a random draw from a zero-mean matrix-normal distribution.
  - Parameters
    rowcov (np.ndarray) – Row covariance (assumed to be positive definite)
    colcov (np.ndarray) – Column covariance (assumed to be positive definite)
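For example (a sketch assuming rmn returns an array shaped rows-of-rowcov by rows-of-colcov):

```python
import numpy as np
from brainiak.matnormal.utils import rmn

# AR(1)-like row correlations, independent columns
rowcov = 0.9 ** np.abs(np.subtract.outer(np.arange(10), np.arange(10)))
colcov = np.eye(5)
X = rmn(rowcov, colcov)  # expected shape: (10, 5)
```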
- brainiak.matnormal.utils.scaled_I(x, size)¶
  Scaled identity matrix \(x I_{size}\).
  - Parameters
    x (float or coercible to float) – Scale to multiply the identity matrix by
    size (int or otherwise coercible to a size) – Dimension of the scaled identity matrix to return
- brainiak.matnormal.utils.unflatten_cholesky_unique(L_flat)¶
  Converts a vector of elements into a triangular matrix (Cholesky factor), exponentiating the diagonal to make the parameterization unique. Inverse of flatten_cholesky_unique.
- brainiak.matnormal.utils.unpack_trainable_vars(x, trainable_vars)¶
  Unpack trainable variables from a single vector, as used/returned by scipy.optimize.
- brainiak.matnormal.utils.x_tx(x)¶
  Inner product \(x^T x\).
  - Parameters
    x (tf.Tensor) – Input tensor
- brainiak.matnormal.utils.xx_t(x)¶
  Outer product \(xx^T\).
  - Parameters
    x (tf.Tensor) – Input tensor