# brainiak.matnormal package¶

## Some properties of the matrix-variate normal distribution¶

$\DeclareMathOperator{\Tr}{Tr} \newcommand{\trp}{^{T}} % transpose \newcommand{\trace}{\text{Trace}} % trace \newcommand{\inv}{^{-1}} \newcommand{\mb}{\mathbf{b}} \newcommand{\M}{\mathbf{M}} \newcommand{\C}{\mathbf{C}} \newcommand{\G}{\mathbf{G}} \newcommand{\A}{\mathbf{A}} \newcommand{\R}{\mathbf{R}} \renewcommand{\S}{\mathbf{S}} \newcommand{\B}{\mathbf{B}} \newcommand{\Q}{\mathbf{Q}} \newcommand{\mH}{\mathbf{H}} \newcommand{\U}{\mathbf{U}} \newcommand{\mL}{\mathbf{L}} \newcommand{\diag}{\mathrm{diag}} \newcommand{\etr}{\mathrm{etr}} \renewcommand{\H}{\mathbf{H}} \newcommand{\vecop}{\mathrm{vec}} \newcommand{\I}{\mathbf{I}} \newcommand{\X}{\mathbf{X}} \newcommand{\Y}{\mathbf{Y}} \newcommand{\Z}{\mathbf{Z}} \renewcommand{\L}{\mathbf{L}}$

The matrix-variate normal distribution is a generalization to matrices of the normal distribution. Another name for it is the multivariate normal distribution with kronecker separable covariance. The distributional intuition is as follows: if $$X \sim \mathcal{MN}(M,R,C)$$ then $$\mathrm{vec}(X)\sim\mathcal{N}(\mathrm{vec}(M), C \otimes R)$$, where $$\mathrm{vec}(\cdot)$$ is the vectorization operator and $$\otimes$$ is the Kronecker product. If we think of X as a matrix of TRs by voxels in the fMRI setting, then this model assumes that each voxel has the same TR-by-TR covariance structure (represented by the matrix R), and each volume has the same spatial covariance (represented by the matrix C). This assumption allows us to model both covariances separately. We can assume that the spatial covariance itself is kronecker-structured, which implies that the spatial covariance of voxels is the same in the X, Y and Z dimensions.

The log-likelihood for the matrix-normal density is:

$\log p(X\mid \M,\R, \C) = -2\log mn - m \log|\C| - n \log|\R| - \Tr\left[\C\inv(\X-\M)\trp\R\inv(\X-\M)\right]$

Here $$X$$ and $$M$$ are both $$m\times n$$ matrices, $$\R$$ is $$m\times m$$ and $$\C$$ is $$n\times n$$.

The brainiak.matnormal package provides structure to infer models that can be stated in the matrix-normal notation that are useful for fMRI analysis. It provides a few interfaces. MatnormModelBase is intended as a base class for matrix-variate models. It provides a wrapper for the tensorflow optimizer that provides convergence checks based on thresholds on the function value and gradient, and simple verbose outputs. It also provides an interface for noise covariances (CovBase). Any class that follows the interface can be used as a noise covariance in any of the matrix normal models. The package includes a variety of noise covariances to work with.

## Matrix normal marginals¶

Here we extend the multivariate gaussian marginalization identity to matrix normals. This is used in a number of the models in the package. Below, we use lowercase subscripts for sizes to make dimensionalities easier to track. Uppercase subscripts for covariances help keep track where they come from.

$\begin{split}\mathbf{X}_{ij} &\sim \mathcal{MN}(\mathbf{A}_{ij}, \Sigma_{\mathbf{X}i},\Sigma_{\mathbf{X}j})\\ \mathbf{Y}_{jk} &\sim \mathcal{MN}(\mathbf{B}_{jk}, \Sigma_{\mathbf{Y}j},\Sigma_{\mathbf{Y}k})\\ \mathbf{Z}_{ik}\mid\mathbf{X}_{ij},\mathbf{Y}_{jk} &\sim \mathcal{MN}(\mathbf{X}_{ij}\mathbf{Y}_{jk} + \mathbf{C}_{ik}, \Sigma_{\mathbf{Z}_i}, \Sigma_{\mathbf{Z}_k})\\\end{split}$

We vectorize, and convert to a form we recognize as $$y \sim \mathcal{N}(Mx+b, \Sigma)$$.

$\begin{split}\vecop(\mathbf{Z}_{ik})\mid\mathbf{X}_{ij},\mathbf{Y}_{jk} &\sim \mathcal{N}(\vecop(\X_{ij}\mathbf{Y}_{jk}+\mathbf{C}_{ik}), \Sigma_{\mathbf{Z}_k}\otimes\Sigma_{\mathbf{Z}_i})\\ \vecop(\mathbf{Z}_{ik})\mid\mathbf{X}_{ij},\mathbf{Y}_{jk} &\sim \mathcal{N}((\I_k\otimes\X_{ij})\vecop(\mathbf{Y}_{jk}) + \vecop(\mathbf{C}_{ik}), \Sigma_{\mathbf{Z}_k}\otimes\Sigma_{\mathbf{Z}_i})\end{split}$

Now we can use our standard gaussian marginalization identity:

$\vecop(\mathbf{Z}_{ik})\mid\mathbf{X}_{ij} \sim \mathcal{N}((\I_k\otimes\X_{ij})\vecop(\mathbf{B}_{jk}) + \vecop(\mathbf{C}_{ik}), \Sigma_{\mathbf{Z}_k}\otimes\Sigma_{\mathbf{Z}_i} + (\I_k\otimes\X_{ij})(\Sigma_{\mathbf{Y}_k}\otimes \Sigma_{\mathbf{Y}_j})(\I_k\otimes\X_{ij})\trp )$

Collect terms using the mixed-product property of kronecker products:

$\vecop(\mathbf{Z}_{ik})\mid\mathbf{X}_{ij} \sim \mathcal{N}(\vecop(\X_{ij}\mathbf{B}_{jk}) + \vecop(\mathbf{C}_{ik}), \Sigma_{\mathbf{Z}_k}\otimes \Sigma_{\mathbf{Z}_i} + \Sigma_{\mathbf{Y}_k}\otimes \X_{ij}\Sigma_{\mathbf{Y}_j}\X_{ij}\trp)$

Now, we can see that the marginal density is a matrix-variate normal only if $$\Sigma_{\mathbf{Z}_k}= \Sigma_{\mathbf{Y}_k}$$ – that is, the variable we’re marginalizing over has the same covariance in the dimension we’re not marginalizing over as the marginal density. Otherwise the densit is well-defined but the covariance retains its kronecker structure. So we let $$\Sigma_k:=\Sigma_{\mathbf{Z}_k}= \Sigma_{\mathbf{Y}_k}$$, factor, and transform it back into a matrix normal:

$\begin{split}\vecop(\mathbf{Z}_{ik})\mid\mathbf{X}_{ij} &\sim \mathcal{N}(\vecop(\X\mathbf{B}_{jk}) + \vecop(\mathbf{C}_{ik}), \Sigma_{k}\otimes\Sigma_{\mathbf{Z}_i} + \Sigma_{_k}\otimes \X\Sigma_{\mathbf{Y}_j}\X\trp)\\ \vecop(\mathbf{Z}_{ik})\mid\mathbf{X}_{ij} &\sim \mathcal{N}(\vecop(\X\mathbf{B}_{jk}) + \vecop(\mathbf{C}_{ik}), \Sigma_{k}\otimes(\Sigma_{\mathbf{Z}_i} +\X\Sigma_{\mathbf{Y}_j}\X\trp))\\ \mathbf{Z}_{ik}\mid\mathbf{X}_{ij} &\sim \mathcal{MN}(\X\mathbf{B}_{jk} + \mathbf{C}_{ik}, \Sigma_{\mathbf{Z}_i} +\X\Sigma_{\mathbf{Y}_j}\X\trp,\Sigma_{k})\end{split}$

We can do it in the other direction as well, because if $$\X \sim \mathcal{MN}(M, U, V)$$ then $$\X\trp \sim \mathcal{MN}(M\trp, V, U)$$:

$\begin{split}\mathbf{Z\trp}_{ik}\mid\mathbf{X}_{ij},\mathbf{Y}_{jk} &\sim \mathcal{MN}(\mathbf{Y}_{jk}\trp\mathbf{X}_{ij}\trp + \mathbf{C}\trp_{ik}, \Sigma_{\mathbf{Z}_k},\Sigma_{\mathbf{Z}_i})\\ \mbox{let } \Sigma_i := \Sigma_{\mathbf{Z}_i}=\Sigma_{\mathbf{X}_i} \\ \cdots\\ \mathbf{Z\trp}_{ik}\mid\mathbf{Y}_{jk} &\sim \mathcal{MN}(\mathbf{A}_{jk}\trp\mathbf{X}_{ij}\trp + \mathbf{C}\trp_{ik}, \Sigma_{\mathbf{Z}_k} + \Y\trp\Sigma_{\mathbf{Y}_j}\Y,\Sigma_{\mathbf{Z}_i})\\ \mathbf{Z}_{ik}\mid\mathbf{Y}_{jk} &\sim \mathcal{MN}(\mathbf{X}_{ij}\mathbf{A}_{jk}+ \mathbf{C}_{ik},\Sigma_{\mathbf{Z}_i},\Sigma_{\mathbf{Z}_k} + \Y\trp\Sigma_{\mathbf{Y}_j}\Y)\end{split}$

These marginal likelihoods are implemented relatively efficiently in MatnormModelBase.matnorm_logp_marginal_row and MatnormModelBase.matnorm_logp_marginal_col.

## Partitioned matrix normal conditionals¶

Here we extend the multivariate gaussian conditional identity to matrix normals. This is used for prediction in some models. Below, we use lowercase subscripts for sizes to make dimensionalities easier to track. Uppercase subscripts for covariances help keep track where they come from.

Next, we do the same for the partitioned gaussian identity. First two vectorized matrix-normals that form our partition:

$\begin{split}\mathbf{X}_{ij} &\sim \mathcal{MN}(\mathbf{A}_{ij}, \Sigma_{i}, \Sigma_{j}) \rightarrow \vecop[\mathbf{X}_{ij}] \sim \mathcal{N}(\vecop[\mathbf{A}_{ij}], \Sigma_{j}\otimes\Sigma_{i})\\ \mathbf{Y}_{ik} &\sim \mathcal{MN}(\mathbf{B}_{ik}, \Sigma_{i}, \Sigma_{k}) \rightarrow \vecop[\mathbf{Y}_{ik}] \sim \mathcal{N}(\vecop[\mathbf{B}_{ik}], \Sigma_{k}\otimes\Sigma_{i})\\ \begin{bmatrix}\vecop[\mathbf{X}_{ij}] \\ \vecop[\mathbf{Y}_{ik}] \end{bmatrix} & \sim \mathcal{N}\left(\vecop\begin{bmatrix}\mathbf{A}_{ij} \\ \mathbf{B}_{ik} \end{bmatrix} , \begin{bmatrix} \Sigma_{j}\otimes \Sigma_i & \Sigma_{jk} \otimes \Sigma_i \\ \Sigma_{kj}\otimes \Sigma_i & \Sigma_{k} \otimes \Sigma_i\end{bmatrix}\right)\end{split}$

We apply the standard partitioned Gaussian identity and simplify using the properties of the $$\vecop$$ operator and the mixed product property of kronecker products:

$\begin{split}\vecop[\X_{ij}] \mid \vecop[\Y_{ik}]\sim \mathcal{N}(&\vecop[\A_{ij}] + (\Sigma_{jk}\otimes\Sigma_i) (\Sigma_k\inv\otimes\Sigma_i\inv)(\vecop[\Y_{ik}]-\vecop[\B_{ik}]),\\ & \Sigma_j\otimes\Sigma_i - (\Sigma_{jk}\otimes\Sigma_i) (\Sigma_k\inv\otimes\Sigma_i\inv) (\Sigma_{kj}\otimes\Sigma_i))\\ =\mathcal{N}(&\vecop[\A_{ij}] + (\Sigma_{jk}\Sigma_k\inv\otimes\Sigma_i\Sigma_i\inv) (\vecop[\Y_{ik}]-\vecop[\B_{ik}]), \\ & \Sigma_j\otimes\Sigma_i - (\Sigma_{jk}\Sigma_k\inv\Sigma_{kj}\otimes \Sigma_i\Sigma_i\inv \Sigma_i))\\ =\mathcal{N}(&\vecop[\A_{ij}] + (\Sigma_{jk}\Sigma_k\inv\otimes\I) (\vecop[\Y_{ik}]-\vecop[\B_{ik}]), \\ & \Sigma_j\otimes\Sigma_i - (\Sigma_{jk}\Sigma_k\inv\Sigma_{kj}\otimes\Sigma_i)\\ =\mathcal{N}(&\vecop[\A_{ij}] + \vecop[\Y_{ik}-\B_{ik}\Sigma_k\inv\Sigma_{kj}], (\Sigma_j-\Sigma_{jk}\Sigma_k\inv\Sigma_{kj})\otimes\Sigma_i)\end{split}$

Next, we recognize that this multivariate gaussian is equivalent to the following matrix variate gaussian:

$\X_{ij} \mid \Y_{ik}\sim \mathcal{MN}(\A_{ij} + (\Y_{ik}-\B_{ik})\Sigma_k\inv\Sigma_{kj}, \Sigma_i, \Sigma_j-\Sigma_{jk}\Sigma_k\inv\Sigma_{kj})$

The conditional in the other direction can be written by working through the same algebra:

$\Y_{ik} \mid \X_{ij}\sim \mathcal{MN}(\B_{ik} +(\X_{ij}- \A_{ij})\Sigma_j\inv\Sigma_{jk}, \Sigma_i, \Sigma_k-\Sigma_{kj}\Sigma_j\inv\Sigma_{jk})$

Finally, vertical rather than horizontal concatenation (yielding a partitioned row rather than column covariance) can be written by recognizing the behavior of the matrix normal under transposition:

$\begin{split}\X\trp_{ji} \mid \Y\trp_{ki}\sim \mathcal{MN}(&\A\trp_{ji} + \Sigma_{jk}\Sigma_k\inv(\Y\trp_{ki}-\B\trp_{ki}), \Sigma_j-\Sigma_{jk}\Sigma_k\inv\Sigma_{kj}, \Sigma_i)\\ \Y\trp_{ki} \mid \X\trp_{ji}\sim \mathcal{MN}(&\B\trp_{ki} + \Sigma_{kj}\Sigma_j\inv(\X\trp_{ji}-\A\trp_{ji}), \Sigma_k-\Sigma_{kj}\Sigma_j\inv\Sigma_{jk}, \Sigma_i)\end{split}$

These conditional likelihoods are implemented relatively efficiently in MatnormModelBase.matnorm_logp_conditional_row and MatnormModelBase.matnorm_logp_conditional_col.

## brainiak.matnormal.covs module¶

class brainiak.matnormal.covs.CovAR1(size, rho=None, sigma=None, scan_onsets=None)

AR(1) covariance parameterized by autoregressive parameter rho and new noise sigma.

Parameters
• size (int) – size of covariance matrix

• rho (float or None) – initial value of autoregressive parameter (if None, initialize randomly)

• sigma (float or None) – initial value of new noise parameter (if None, initialize randomly)

get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log-determinant of this covariance

solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$

class brainiak.matnormal.covs.CovBase(size)

Bases: abc.ABC

Base metaclass for residual covariances. For more on abstract classes, see https://docs.python.org/3/library/abc.html

Parameters

size (int) – The size of the covariance matrix.

abstract get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log determinant of this covariance

abstract solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$

class brainiak.matnormal.covs.CovDiagonal(size, diag_var=None)

Uncorrelated (diagonal) noise covariance

Parameters
• size (int) – size of covariance matrix

• diag_var (float or None) – initial value of (diagonal) variance vector (if None, initialize randomly)

get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log determinant of this covariance

solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$

Parameters

X (tf.Tensor) – Tensor to multiply by inverse of this covariance

class brainiak.matnormal.covs.CovDiagonalGammaPrior(size, sigma=None, alpha=1.5, beta=1e-10)

Uncorrelated (diagonal) noise covariance

class brainiak.matnormal.covs.CovIdentity(size)

Identity noise covariance.

get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log determinant of this covariance

solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$

class brainiak.matnormal.covs.CovIsotropic(size, var=None)

Scaled identity (isotropic) noise covariance.

Parameters
• size (int) – size of covariance matrix

• var (float or None) – initial value of new variance parameter (if None, initialize randomly)

get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log determinant of this covariance

solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$

Parameters

X (tf.Tensor) – Tensor to multiply by inverse of this covariance

class brainiak.matnormal.covs.CovKroneckerFactored(sizes, Sigmas=None, mask=None)

Kronecker product noise covariance parameterized in terms of its component cholesky factors

property L
get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log|Sigma| using the diagonals of the cholesky factors.

solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$ using traingular solves with the cholesky factors.

Specifically, we solve $$L L^T x = y$$ by solving $$L z = y$$ and $$L^T x = z$$.

Parameters

X (tf.Tensor) – Tensor to multiply by inverse of this covariance

class brainiak.matnormal.covs.CovUnconstrainedCholesky(size=None, Sigma=None)

Unconstrained noise covariance parameterized in terms of its cholesky

property L

Cholesky factor of this covariance

get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log determinant of this covariance

solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$ (using cholesky solve)

Parameters

X (tf.Tensor) – Tensor to multiply by inverse of this covariance

class brainiak.matnormal.covs.CovUnconstrainedCholeskyWishartReg(size, Sigma=None)

Unconstrained noise covariance parameterized in terms of its cholesky factor. Regularized using the trick from Chung et al. 2015 such that as the covariance approaches singularity, the likelihood goes to 0.

References

Chung, Y., Gelman, A., Rabe-Hesketh, S., Liu, J., & Dorie, V. (2015). Weakly Informative Prior for Point Estimation of Covariance Matrices in Hierarchical Models. Journal of Educational and Behavioral Statistics, 40(2), 136–157. https://doi.org/10.3102/1076998615570945

class brainiak.matnormal.covs.CovUnconstrainedInvCholesky(size=None, invSigma=None)

Unconstrained noise covariance parameterized in terms of its precision cholesky. Use this over the regular cholesky unless you have a good reason not to, since this saves a cholesky solve on every step of optimization

property Linv

Inverse of Cholesky factor of this covariance

get_optimize_vars()

Returns a list of tf variables that need to get optimized to fit this covariance

property logdet

log determinant of this covariance

solve(X)

Given this covariance $$\Sigma$$ and some input $$X$$, compute $$\Sigma^{-1}x$$ (using cholesky solve)

Parameters

X (tf.Tensor) – Tensor to multiply by inverse of this covariance

## brainiak.matnormal.matnormal_likelihoods module¶

brainiak.matnormal.matnormal_likelihoods.matnorm_logp(x, row_cov, col_cov)

Log likelihood for centered matrix-variate normal density. Assumes that row_cov and col_cov follow the API defined in CovBase.

Parameters
• x (tf.Tensor) – Observation tensor

• row_cov (CovBase) – Row covariance implementing the CovBase API

• col_cov (CovBase) – Column Covariance implementing the CovBase API

brainiak.matnormal.matnormal_likelihoods.matnorm_logp_conditional_col(x, row_cov, col_cov, cond, cond_cov)

Log likelihood for centered conditional matrix-variate normal density.

Consider the following partitioned matrix-normal density:

$\begin{split}\begin{bmatrix} \operatorname{vec}\left[\mathbf{X}_{i j}\right] \\ \operatorname{vec}\left[\mathbf{Y}_{i k}\right]\end{bmatrix} \sim \mathcal{N}\left(0,\begin{bmatrix} \Sigma_{j} \otimes \Sigma_{i} & \Sigma_{j k} \otimes \Sigma_{i}\\ \Sigma_{k j} \otimes \Sigma_{i} & \Sigma_{k} \otimes \Sigma_{i} \end{bmatrix}\right)\end{split}$

Then we can write the conditional:

$\mathbf{X}_{i j} \mid \mathbf{Y}_{i k} \sim \mathcal{M}\ \mathcal{N}\left(0, \Sigma_{i}, \Sigma_{j}-\Sigma_{j k}\ \Sigma_{k}^{-1} \Sigma_{k j}\right)$

This function efficiently computes the conditionals by unpacking some info in the covariance classes and then dispatching to solve_det_conditional.

Parameters
• x (tf.Tensor) – Observation tensor

• row_cov (CovBase) – Row covariance ($$\Sigma_{i}$$ in the notation above).

• col_cov (CovBase) – Column covariance ($$\Sigma_{j}$$ in the notation above).

• cond (tf.Tensor) – Off-diagonal block of the partitioned covariance ($$\Sigma_{jk}$$ in the notation above).

• cond_cov (CovBase) – Covariance of conditioning variable ($$\Sigma_{k}$$ in the notation above).

brainiak.matnormal.matnormal_likelihoods.matnorm_logp_conditional_row(x, row_cov, col_cov, cond, cond_cov)

Log likelihood for centered conditional matrix-variate normal density.

Consider the following partitioned matrix-normal density:

$\begin{split}\begin{bmatrix} \operatorname{vec}\left[\mathbf{X}_{i j}\right] \\ \operatorname{vec}\left[\mathbf{Y}_{i k}\right]\end{bmatrix} \sim \mathcal{N}\left(0,\begin{bmatrix} \Sigma_{j} \otimes \Sigma_{i} & \Sigma_{j k} \otimes \Sigma_{i}\\ \Sigma_{k j} \otimes \Sigma_{i} & \Sigma_{k} \otimes \Sigma_{i} \end{bmatrix}\right)\end{split}$

Then we can write the conditional:

$\mathbf{X}^T j i \mid \mathbf{Y}_{k i}^T \sim \mathcal{M}\ \mathcal{N}\left(0, \Sigma_{j}-\Sigma_{j k} \Sigma_{k}^{-1} \Sigma_{k j},\ \Sigma_{i}\right)$

This function efficiently computes the conditionals by unpacking some info in the covariance classes and then dispatching to solve_det_conditional.

Parameters
• x (tf.Tensor) – Observation tensor

• row_cov (CovBase) – Row covariance ($$\Sigma_{i}$$ in the notation above).

• col_cov (CovBase) – Column covariance ($$\Sigma_{j}$$ in the notation above).

• cond (tf.Tensor) – Off-diagonal block of the partitioned covariance ($$\Sigma_{jk}$$ in the notation above).

• cond_cov (CovBase) – Covariance of conditioning variable ($$\Sigma_{k}$$ in the notation above).

brainiak.matnormal.matnormal_likelihoods.matnorm_logp_marginal_col(x, row_cov, col_cov, marg, marg_cov)

Log likelihood for centered marginal matrix-variate normal density.

\begin{align}\begin{aligned}X &\sim \mathcal{MN}(0, R, Q)\\\Y \mid \X &\sim \mathcal{MN}(XA, R, C),\\\Y &\sim \mathcal{MN}(0, R, C + A^TQA)\end{aligned}\end{align}

This function efficiently computes the marginals by unpacking some info in the covariance classes and then dispatching to solve_det_marginal.

Parameters
• x (tf.Tensor) – Observation tensor

• row_cov (CovBase) – Row covariance implementing the CovBase API ($$R$$ above).

• col_cov (CovBase) – Column Covariance implementing the CovBase API ($$C$$ above).

• marg (tf.Tensor) – Marginal factor ($$A$$ above).

• marg_cov (CovBase) – Prior covariance implementing the CovBase API ($$Q$$ above).

brainiak.matnormal.matnormal_likelihoods.matnorm_logp_marginal_row(x, row_cov, col_cov, marg, marg_cov)

Log likelihood for marginal centered matrix-variate normal density.

\begin{align}\begin{aligned}X &\sim \mathcal{MN}(0, Q, C)\\\Y \mid \X &\sim \mathcal{MN}(AX, R, C),\\\Y &\sim \mathcal{MN}(0, R + AQA^T, C)\end{aligned}\end{align}

This function efficiently computes the marginals by unpacking some info in the covariance classes and then dispatching to solve_det_marginal.

Parameters
• x (tf.Tensor) – Observation tensor

• row_cov (CovBase) – Row covariance implementing the CovBase API ($$R$$ above).

• col_cov (CovBase) – Column Covariance implementing the CovBase API ($$C$$ above).

• marg (tf.Tensor) – Marginal factor ($$A$$ above).

• marg_cov (CovBase) – Prior covariance implementing the CovBase API ($$Q$$ above).

brainiak.matnormal.matnormal_likelihoods.solve_det_conditional(x, sigma, A, Q)

Use matrix inversion lemma for the solve:

$(\Sigma - AQ^{-1}A^T)^{-1} X =\ (\Sigma^{-1} + \Sigma^{-1} A (Q - A^T \Sigma^{-1} A)^{-1} A^T \Sigma^{-1}) X$

Use matrix determinant lemma for determinant:

$\log|(\Sigma - AQ^{-1}A^T)| = \log|Q - A^T \Sigma^{-1} A| - \log|Q| + \log|\Sigma|$
Parameters
• x (tf.Tensor) – Tensor to multiply the solve by

• sigma (brainiak.matnormal.CovBase) – Covariance object implementing solve and logdet

• A (tf.Tensor) – Factor multiplying the variable we conditioned on

• Q (brainiak.matnormal.CovBase) – Covariance object of conditioning variable, implementing solve and logdet

brainiak.matnormal.matnormal_likelihoods.solve_det_marginal(x, sigma, A, Q)

Use matrix inversion lemma for the solve:

$(\Sigma + AQA^T)^{-1} X =\ (\Sigma^{-1} - \Sigma^{-1} A (Q^{-1} + A^T \Sigma^{-1} A)^{-1} A^T \Sigma^{-1}) X$

Use matrix determinant lemma for determinant:

$\log|(\Sigma + AQA^T)| = \log|Q^{-1} + A^T \Sigma^{-1} A| + \log|Q| + \log|\Sigma|$
Parameters
• x (tf.Tensor) – Tensor to multiply the solve by

• sigma (brainiak.matnormal.CovBase) – Covariance object implementing solve and logdet

• A (tf.Tensor) – Factor multiplying the variable we marginalized out

• Q (brainiak.matnormal.CovBase) – Covariance object of marginalized variable, implementing solve and logdet

## brainiak.matnormal.mnrsa module¶

class brainiak.matnormal.mnrsa.MNRSA(time_cov, space_cov, n_nureg=5, optimizer='L-BFGS-B', optCtrl=None)

Bases: sklearn.base.BaseEstimator

Matrix normal version of RSA.

The goal of this analysis is to find the covariance of the mapping from some design matrix X to the fMRI signal Y. It does so by marginalizing over the actual mapping (i.e. averaging over the uncertainty in it), which happens to correct a bias imposed by structure in the design matrix on the RSA estimate (see Cai et al., NIPS 2016).

This implementation makes different choices about residual covariance relative to brainiak.reprsimil.BRSA: Here, the noise covariance is assumed to be kronecker-separable. Informally, this means that all voxels have the same temporal covariance, and all time points have the same spatial covariance. This is in contrast to BRSA, which allows different temporal covariance for each voxel. On the other hand, computational efficiencies enabled by this choice allow MNRSA to support a richer class of space and time covariances (anything in brainiak.matnormal.covs).

For users: in general, if you are worried about voxels each having different temporal noise structure,you should use brainiak.reprsimil.BRSA. If you are worried about between-voxel correlations or temporal covaraince structures that BRSA does not support, you should use MNRSA.

\begin{align}\begin{aligned}Y &\sim \mathcal{MN}(0, \Sigma_t + XLL^TX^T+ X_0X_0^T, \Sigma_s)\\\U &= LL^T\end{aligned}\end{align}
Parameters
• time_cov (subclass of CovBase) – Temporal noise covariance class following CovBase interface.

• space_cov (subclass of CovBase) – Spatial noise covariance class following CovBase interface.

• optimizer (string, Default :'L-BFGS') – Name of scipy optimizer to use.

• optCtrl (dict, default: None) – Additional arguments to pass to scipy.optimize.minimize.

property L

Cholesky factor of the RSA matrix.

fit(X, y, naive_init=True)

Estimate dimension reduction and cognitive model parameters

Parameters
• X (2d array) – Brain data matrix (TRs by voxels). Y in the math

• y (2d array or vector) – Behavior data matrix (TRs by behavioral obsevations). X in the math

• max_iter (int, default=1000) – Maximum number of iterations to run

• step (int, default=100) – Number of steps between optimizer output

• restart (bool, default=True) – If this is true, optimizer is restarted (e.g. for a new dataset). Otherwise optimizer will continue from where it is now (for example for running more iterations if the initial number was not enough).

logp(X, Y)

MNRSA Log-likelihood

## brainiak.matnormal.regression module¶

class brainiak.matnormal.regression.MatnormalRegression(time_cov, space_cov, optimizer='L-BFGS-B', optCtrl=None)

Bases: sklearn.base.BaseEstimator

This analysis allows maximum likelihood estimation of regression models in the presence of both spatial and temporal covariance.

..math::

Y sim mathcal{MN}(Xeta, time_cov, space_cov)

Parameters
• time_cov (subclass of CovBase) – TR noise covariance class following CovBase interface.

• space_cov (subclass of CovBase) – Voxel noise covariance class following CovBase interface.

• optimizer (string, default="L-BFGS-B") – Scipy optimizer to use. For other options, see “method” argument of scipy.optimize.minimize

• optCtrl (dict, default=None) – Additional arguments to pass to scipy.optimize.minimize.

calibrate(Y)

Decode design matrix from fMRI dataset, based on a previously trained mapping. This method just does naive MLE:

$X = Y \Sigma_s^{-1}B^T(B \Sigma_s^{-1} B^T)^{-1}$
Parameters

Y (np.array, TRs by voxels.) – fMRI dataset

fit(X, y, naive_init=True)

Compute the regression fit.

Parameters
• X (np.array, TRs by conditions.) – Design matrix

• y (np.array, TRs by voxels.) – fMRI data

logp(X, Y)

Log likelihood of model (internal)

predict(X)

Predict fMRI signal from design matrix.

Parameters

X (np.array, TRs by conditions.) – Design matrix

## brainiak.matnormal.utils module¶

brainiak.matnormal.utils.flatten_cholesky_unique(L)

Flattens nonzero-elements Cholesky (triangular) factor into a vector, and logs diagonal to make parameterization unique. Inverse of unflatten_cholesky_unique.

brainiak.matnormal.utils.make_val_and_grad(lossfn, train_vars)

Makes a function that ouptuts the loss and gradient in a format compatible with scipy.optimize.minimize

brainiak.matnormal.utils.pack_trainable_vars(trainable_vars)

Pack trainable vars in a model into a single vector that can be passed to scipy.optimize

brainiak.matnormal.utils.rmn(rowcov, colcov)

Generate random draws from a zero-mean matrix-normal distribution.

Parameters
• rowcov (np.ndarray) – Row covariance (assumed to be positive definite)

• colcov (np.ndarray) – Column covariance (assumed to be positive definite)

brainiak.matnormal.utils.scaled_I(x, size)

Scaled identity matrix $$x I_{size}$$

Parameters
• x (float or coercable to float) – Scale to multiply the identity matrix by

• size (int or otherwise coercable to a size) – Dimension of the scaled identity matrix to return

brainiak.matnormal.utils.unflatten_cholesky_unique(L_flat)

Converts a vector of elements into a triangular matrix (Cholesky factor). Exponentiates diagonal to make parameterization unique. Inverse of flatten_cholesky_unique.

brainiak.matnormal.utils.unpack_trainable_vars(x, trainable_vars)

Unpack trainable vars from a single vector as used/returned by scipy.optimize

brainiak.matnormal.utils.x_tx(x)

Inner product $$x^T x$$

Parameters

x (tf.Tensor) –

brainiak.matnormal.utils.xx_t(x)

Outer product $$xx^T$$

Parameters

x (tf.Tensor) –