brainiak.matnormal package¶
Some properties of the matrix-variate normal distribution¶
The matrix-variate normal distribution is a generalization of the normal distribution to matrices. Another name for it is the multivariate normal distribution with Kronecker-separable covariance. The distributional intuition is as follows: if \(X \sim \mathcal{MN}(M,R,C)\), then \(\mathrm{vec}(X)\sim\mathcal{N}(\mathrm{vec}(M), C \otimes R)\), where \(\mathrm{vec}(\cdot)\) is the vectorization operator and \(\otimes\) is the Kronecker product. If we think of \(X\) as a TRs-by-voxels matrix in the fMRI setting, then this model assumes that every voxel has the same TR-by-TR covariance structure (represented by the matrix \(R\)) and every volume has the same spatial covariance (represented by the matrix \(C\)). This assumption allows us to model the two covariances separately. We can further assume that the spatial covariance is itself Kronecker-structured, which implies that the spatial covariance of voxels is the same in the X, Y and Z dimensions.
The log-likelihood for the matrix-normal density is:
\[\log p(X \mid M, R, C) = -\frac{1}{2}\left[mn \log(2\pi) + n \log|R| + m \log|C| + \operatorname{tr}\left(C^{-1}(X-M)^T R^{-1} (X-M)\right)\right]\]
Here \(X\) and \(M\) are both \(m\times n\) matrices, \(R\) is \(m\times m\) and \(C\) is \(n\times n\).
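As a quick numerical sanity check of the vectorization identity and the log-likelihood above (illustrative only; it uses SciPy rather than anything in brainiak.matnormal):

```python
import numpy as np
from scipy.stats import matrix_normal, multivariate_normal

rng = np.random.default_rng(0)
m, n = 4, 3
M = rng.standard_normal((m, n))
# random positive-definite row (R) and column (C) covariances
A = rng.standard_normal((m, m)); R = A @ A.T + m * np.eye(m)
B = rng.standard_normal((n, n)); C = B @ B.T + n * np.eye(n)
X = rng.standard_normal((m, n))

vec = lambda Z: Z.flatten(order="F")  # stack columns, matching C ⊗ R
lp_matnorm = matrix_normal(mean=M, rowcov=R, colcov=C).logpdf(X)
lp_vec = multivariate_normal(mean=vec(M), cov=np.kron(C, R)).logpdf(vec(X))
assert np.isclose(lp_matnorm, lp_vec)
```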
The brainiak.matnormal package provides infrastructure for fitting models, useful for fMRI analysis, that can be stated in matrix-normal notation. It exposes a few interfaces. MatnormModelBase is intended as a base class for matrix-variate models: it wraps the tensorflow optimizer with convergence checks based on thresholds on the function value and gradient, along with simple verbose output. The package also defines an interface for noise covariances (CovBase). Any class that follows this interface can be used as a noise covariance in any of the matrix-normal models, and the package includes a variety of noise covariances to work with (a sketch of a custom covariance follows this paragraph).
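To illustrate the CovBase interface, here is a minimal sketch of a custom isotropic covariance. The class name is hypothetical, and it assumes CovBase stores its size argument as self.size; consult the CovBase source if that attribute differs.

```python
import tensorflow as tf
from brainiak.matnormal.covs import CovBase

class CovIsotropicSketch(CovBase):
    """Hypothetical sigma^2 * I covariance, written against the CovBase
    interface (get_optimize_vars, logdet, solve) described above."""

    def __init__(self, size):
        super().__init__(size)
        # optimize log(sigma^2) so the variance stays positive
        self.log_var = tf.Variable(0.0, dtype=tf.float64, name="log_var")

    def get_optimize_vars(self):
        # tf variables the optimizer should fit
        return [self.log_var]

    @property
    def logdet(self):
        # log|sigma^2 I| = size * log(sigma^2); assumes self.size is set
        return self.size * self.log_var

    def solve(self, X):
        # Sigma^{-1} X = X / sigma^2
        return X / tf.exp(self.log_var)
```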
Matrix normal marginals¶
Here we extend the multivariate Gaussian marginalization identity to matrix normals. This identity is used in a number of the models in the package. Below, we use lowercase subscripts for sizes to make dimensionalities easier to track; uppercase subscripts on covariances keep track of where they come from.
We vectorize, and convert to a form we recognize as \(y \sim \mathcal{N}(Mx+b, \Sigma)\).
Now we can use our standard Gaussian marginalization identity:
Collect terms using the mixed-product property of Kronecker products:
Now we can see that the marginal density is itself a matrix-variate normal only if \(\Sigma_{\mathbf{Z}_k}= \Sigma_{\mathbf{Y}_k}\) – that is, only if the variable we are marginalizing over has the same covariance as the marginal density in the dimension we are not marginalizing over. Otherwise the density is still well-defined, but its covariance retains the full Kronecker structure. So we let \(\Sigma_k:=\Sigma_{\mathbf{Z}_k}= \Sigma_{\mathbf{Y}_k}\), factor, and transform back into a matrix normal:
We can do this in the other direction as well, because if \(X \sim \mathcal{MN}(M, U, V)\) then \(X^T \sim \mathcal{MN}(M^T, V, U)\):
These marginal likelihoods are implemented relatively efficiently in MatnormModelBase.matnorm_logp_marginal_row and MatnormModelBase.matnorm_logp_marginal_col.
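The key algebraic step above, the mixed-product property, is easy to verify numerically. The following NumPy sketch (not package code) checks that \((I \otimes A)(C \otimes Q)(I \otimes A)^T = C \otimes (AQA^T)\), which is what lets the marginal covariance stay Kronecker-structured:

```python
import numpy as np

rng = np.random.default_rng(1)
t, k, v = 4, 3, 2
A = rng.standard_normal((t, k))
Q = rng.standard_normal((k, k)); Q = Q @ Q.T + k * np.eye(k)
C = rng.standard_normal((v, v)); C = C @ C.T + v * np.eye(v)

# mixed-product property of Kronecker products
lhs = np.kron(np.eye(v), A) @ np.kron(C, Q) @ np.kron(np.eye(v), A).T
rhs = np.kron(C, A @ Q @ A.T)
assert np.allclose(lhs, rhs)
```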
Partitioned matrix normal conditionals¶
Here we extend the multivariate Gaussian conditional identity to matrix normals. This identity is used for prediction in some models. As before, we use lowercase subscripts for sizes and uppercase subscripts on covariances to keep track of where they come from.
First, we write the two vectorized matrix normals that form our partition:
We apply the standard partitioned Gaussian identity and simplify using the properties of the \(\mathrm{vec}(\cdot)\) operator and the mixed-product property of Kronecker products:
Next, we recognize that this multivariate Gaussian is equivalent to the following matrix-variate Gaussian:
The conditional in the other direction can be written by working through the same algebra:
Finally, vertical rather than horizontal concatenation (yielding a partitioned row rather than column covariance) can be handled by recognizing the behavior of the matrix normal under transposition:
These conditional likelihoods are implemented relatively efficiently in MatnormModelBase.matnorm_logp_conditional_row and MatnormModelBase.matnorm_logp_conditional_col.
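The Schur-complement form of the conditional covariance can be checked the same way. This NumPy sketch (not package code) confirms that conditioning the Kronecker-structured joint covariance factors \(\Sigma_i\) back out:

```python
import numpy as np

rng = np.random.default_rng(2)
ni, nj, nk = 3, 2, 2

def randpd(d):
    W = rng.standard_normal((d, d))
    return W @ W.T + d * np.eye(d)

Si = randpd(ni)
S = randpd(nj + nk)  # joint covariance over the [j, k] blocks
Sj, Sjk, Sk = S[:nj, :nj], S[:nj, nj:], S[nj:, nj:]

# partitioned-Gaussian conditional covariance of vec(X) given vec(Y) ...
lhs = (np.kron(Sj, Si)
       - np.kron(Sjk, Si) @ np.linalg.inv(np.kron(Sk, Si)) @ np.kron(Sjk.T, Si))
# ... equals the matrix-normal form (Sj - Sjk Sk^{-1} Skj) ⊗ Si
rhs = np.kron(Sj - Sjk @ np.linalg.inv(Sk) @ Sjk.T, Si)
assert np.allclose(lhs, rhs)
```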
Submodules¶
brainiak.matnormal.covs module¶
- class brainiak.matnormal.covs.CovAR1(size, rho=None, sigma=None, scan_onsets=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  AR(1) covariance, parameterized by the autoregressive parameter rho and the innovation ("new") noise sigma.
  - Parameters
    size (int) – size of the covariance matrix
    rho (float or None) – initial value of the autoregressive parameter (if None, initialize randomly)
    sigma (float or None) – initial value of the innovation noise parameter (if None, initialize randomly)
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
- class brainiak.matnormal.covs.CovBase(size)¶
  Bases: abc.ABC
  Base class for residual covariances. For more on abstract base classes, see https://docs.python.org/3/library/abc.html
  - Parameters
    size (int) – The size of the covariance matrix.
  - abstract get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - abstract solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
- class brainiak.matnormal.covs.CovDiagonal(size, diag_var=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Uncorrelated (diagonal) noise covariance.
  - Parameters
    size (int) – size of the covariance matrix
    diag_var (float or None) – initial value of the (diagonal) variance vector (if None, initialize randomly)
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
- class brainiak.matnormal.covs.CovDiagonalGammaPrior(size, sigma=None, alpha=1.5, beta=1e-10)¶
  Bases: brainiak.matnormal.covs.CovDiagonal
  Uncorrelated (diagonal) noise covariance, regularized by a Gamma(alpha, beta) prior.
- class brainiak.matnormal.covs.CovIdentity(size)¶
  Bases: brainiak.matnormal.covs.CovBase
  Identity noise covariance.
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
- class brainiak.matnormal.covs.CovIsotropic(size, var=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Scaled identity (isotropic) noise covariance.
  - Parameters
    size (int) – size of the covariance matrix
    var (float or None) – initial value of the variance parameter (if None, initialize randomly)
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
- class brainiak.matnormal.covs.CovKroneckerFactored(sizes, Sigmas=None, mask=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Kronecker-product noise covariance, parameterized in terms of the Cholesky factors of its components.
  - property L¶
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    \(\log|\Sigma|\), computed using the diagonals of the Cholesky factors.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\) using triangular solves with the Cholesky factors. Specifically, we solve \(L L^T x = y\) by solving \(L z = y\) and then \(L^T x = z\) (see the sketch after this entry).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
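The two-triangular-solve pattern used by solve is shown below with NumPy/SciPy in place of the package's tensorflow ops (illustrative only):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(3)
W = rng.standard_normal((4, 4))
Sigma = W @ W.T + 4 * np.eye(4)   # positive-definite covariance
L = np.linalg.cholesky(Sigma)     # Sigma = L L^T
y = rng.standard_normal(4)

z = solve_triangular(L, y, lower=True)     # solve L z = y
x = solve_triangular(L.T, z, lower=False)  # solve L^T x = z
assert np.allclose(Sigma @ x, y)           # so x = Sigma^{-1} y
```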
- class brainiak.matnormal.covs.CovUnconstrainedCholesky(size=None, Sigma=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Unconstrained noise covariance, parameterized in terms of its Cholesky factor.
  - property L¶
    Cholesky factor of this covariance.
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\) (using a Cholesky solve).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
- class brainiak.matnormal.covs.CovUnconstrainedCholeskyWishartReg(size, Sigma=None)¶
  Bases: brainiak.matnormal.covs.CovUnconstrainedCholesky
  Unconstrained noise covariance, parameterized in terms of its Cholesky factor and regularized using the trick from Chung et al. (2015), so that the likelihood goes to 0 as the covariance approaches singularity.
  References
  Chung, Y., Gelman, A., Rabe-Hesketh, S., Liu, J., & Dorie, V. (2015). Weakly Informative Prior for Point Estimation of Covariance Matrices in Hierarchical Models. Journal of Educational and Behavioral Statistics, 40(2), 136–157. https://doi.org/10.3102/1076998615570945
- class brainiak.matnormal.covs.CovUnconstrainedInvCholesky(size=None, invSigma=None)¶
  Bases: brainiak.matnormal.covs.CovBase
  Unconstrained noise covariance, parameterized in terms of the Cholesky factor of its precision. Prefer this over the regular Cholesky parameterization unless you have a good reason not to, since it saves a Cholesky solve on every step of optimization.
  - property Linv¶
    Inverse of the Cholesky factor of this covariance.
  - get_optimize_vars()¶
    Returns a list of tf variables that need to be optimized to fit this covariance.
  - property logdet¶
    Log-determinant of this covariance.
  - solve(X)¶
    Given this covariance \(\Sigma\) and an input \(X\), compute \(\Sigma^{-1}X\) (using a Cholesky solve).
    - Parameters
      X (tf.Tensor) – Tensor to multiply by the inverse of this covariance
brainiak.matnormal.matnormal_likelihoods module¶
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp(x, row_cov, col_cov)¶
  Log likelihood for a centered matrix-variate normal density. Assumes that row_cov and col_cov follow the API defined in CovBase.
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_conditional_col(x, row_cov, col_cov, cond, cond_cov)¶
  Log likelihood for a centered conditional matrix-variate normal density.
  Consider the following partitioned matrix-normal density:
  \[\begin{split}\begin{bmatrix} \operatorname{vec}\left[\mathbf{X}_{i j}\right] \\ \operatorname{vec}\left[\mathbf{Y}_{i k}\right]\end{bmatrix} \sim \mathcal{N}\left(0,\begin{bmatrix} \Sigma_{j} \otimes \Sigma_{i} & \Sigma_{j k} \otimes \Sigma_{i}\\ \Sigma_{k j} \otimes \Sigma_{i} & \Sigma_{k} \otimes \Sigma_{i} \end{bmatrix}\right)\end{split}\]
  Then we can write the conditional:
  \[\mathbf{X}_{i j} \mid \mathbf{Y}_{i k} \sim \mathcal{MN}\left(0, \Sigma_{i}, \Sigma_{j}-\Sigma_{j k} \Sigma_{k}^{-1} \Sigma_{k j}\right)\]
  This function efficiently computes the conditional by unpacking some information from the covariance classes and then dispatching to solve_det_conditional.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance (\(\Sigma_{i}\) in the notation above).
    col_cov (CovBase) – Column covariance (\(\Sigma_{j}\) in the notation above).
    cond (tf.Tensor) – Off-diagonal block of the partitioned covariance (\(\Sigma_{jk}\) in the notation above).
    cond_cov (CovBase) – Covariance of the conditioning variable (\(\Sigma_{k}\) in the notation above).
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_conditional_row(x, row_cov, col_cov, cond, cond_cov)¶
  Log likelihood for a centered conditional matrix-variate normal density.
  Consider the following partitioned matrix-normal density:
  \[\begin{split}\begin{bmatrix} \operatorname{vec}\left[\mathbf{X}_{i j}\right] \\ \operatorname{vec}\left[\mathbf{Y}_{i k}\right]\end{bmatrix} \sim \mathcal{N}\left(0,\begin{bmatrix} \Sigma_{j} \otimes \Sigma_{i} & \Sigma_{j k} \otimes \Sigma_{i}\\ \Sigma_{k j} \otimes \Sigma_{i} & \Sigma_{k} \otimes \Sigma_{i} \end{bmatrix}\right)\end{split}\]
  Then we can write the conditional:
  \[\mathbf{X}^T_{j i} \mid \mathbf{Y}^T_{k i} \sim \mathcal{MN}\left(0, \Sigma_{j}-\Sigma_{j k} \Sigma_{k}^{-1} \Sigma_{k j}, \Sigma_{i}\right)\]
  This function efficiently computes the conditional by unpacking some information from the covariance classes and then dispatching to solve_det_conditional.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance (\(\Sigma_{i}\) in the notation above).
    col_cov (CovBase) – Column covariance (\(\Sigma_{j}\) in the notation above).
    cond (tf.Tensor) – Off-diagonal block of the partitioned covariance (\(\Sigma_{jk}\) in the notation above).
    cond_cov (CovBase) – Covariance of the conditioning variable (\(\Sigma_{k}\) in the notation above).
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_marginal_col(x, row_cov, col_cov, marg, marg_cov)¶
  Log likelihood for a centered marginal matrix-variate normal density:
  \[\begin{aligned}X &\sim \mathcal{MN}(0, R, Q)\\ Y \mid X &\sim \mathcal{MN}(XA, R, C)\\ Y &\sim \mathcal{MN}(0, R, C + A^T Q A)\end{aligned}\]
  This function efficiently computes the marginal by unpacking some information from the covariance classes and then dispatching to solve_det_marginal.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance implementing the CovBase API (\(R\) above).
    col_cov (CovBase) – Column covariance implementing the CovBase API (\(C\) above).
    marg (tf.Tensor) – Marginal factor (\(A\) above).
    marg_cov (CovBase) – Prior covariance implementing the CovBase API (\(Q\) above).
- brainiak.matnormal.matnormal_likelihoods.matnorm_logp_marginal_row(x, row_cov, col_cov, marg, marg_cov)¶
  Log likelihood for a centered marginal matrix-variate normal density:
  \[\begin{aligned}X &\sim \mathcal{MN}(0, Q, C)\\ Y \mid X &\sim \mathcal{MN}(AX, R, C)\\ Y &\sim \mathcal{MN}(0, R + A Q A^T, C)\end{aligned}\]
  This function efficiently computes the marginal by unpacking some information from the covariance classes and then dispatching to solve_det_marginal.
  - Parameters
    x (tf.Tensor) – Observation tensor
    row_cov (CovBase) – Row covariance implementing the CovBase API (\(R\) above).
    col_cov (CovBase) – Column covariance implementing the CovBase API (\(C\) above).
    marg (tf.Tensor) – Marginal factor (\(A\) above).
    marg_cov (CovBase) – Prior covariance implementing the CovBase API (\(Q\) above).
- brainiak.matnormal.matnormal_likelihoods.solve_det_conditional(x, sigma, A, Q)¶
  Use the matrix inversion lemma for the solve:
  \[(\Sigma - AQ^{-1}A^T)^{-1} X = (\Sigma^{-1} + \Sigma^{-1} A (Q - A^T \Sigma^{-1} A)^{-1} A^T \Sigma^{-1}) X\]
  Use the matrix determinant lemma for the determinant (both lemmas are checked numerically in the sketch after this entry):
  \[\log|\Sigma - AQ^{-1}A^T| = \log|Q - A^T \Sigma^{-1} A| - \log|Q| + \log|\Sigma|\]
  - Parameters
    x (tf.Tensor) – Tensor to multiply the solve by
    sigma (brainiak.matnormal.CovBase) – Covariance object implementing solve and logdet
    A (tf.Tensor) – Factor multiplying the variable we condition on
    Q (brainiak.matnormal.CovBase) – Covariance object of the conditioning variable, implementing solve and logdet
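A NumPy sketch of the check (not package code), with A scaled down so that \(\Sigma - AQ^{-1}A^T\) stays positive definite:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 5, 3

def randpd(d):
    W = rng.standard_normal((d, d))
    return W @ W.T + d * np.eye(d)

Sigma, Q = randpd(n), randpd(p)
A = 0.1 * rng.standard_normal((n, p))  # small, so Sigma - A Q^{-1} A^T is PD
X = rng.standard_normal((n, 2))
Si = np.linalg.inv(Sigma)

# matrix inversion lemma
lhs = np.linalg.solve(Sigma - A @ np.linalg.inv(Q) @ A.T, X)
rhs = (Si + Si @ A @ np.linalg.inv(Q - A.T @ Si @ A) @ A.T @ Si) @ X
assert np.allclose(lhs, rhs)

# matrix determinant lemma
ld = np.linalg.slogdet(Sigma - A @ np.linalg.inv(Q) @ A.T)[1]
ld2 = (np.linalg.slogdet(Q - A.T @ Si @ A)[1]
       - np.linalg.slogdet(Q)[1] + np.linalg.slogdet(Sigma)[1])
assert np.isclose(ld, ld2)
```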
- brainiak.matnormal.matnormal_likelihoods.solve_det_marginal(x, sigma, A, Q)¶
  Use the matrix inversion lemma for the solve:
  \[(\Sigma + AQA^T)^{-1} X = (\Sigma^{-1} - \Sigma^{-1} A (Q^{-1} + A^T \Sigma^{-1} A)^{-1} A^T \Sigma^{-1}) X\]
  Use the matrix determinant lemma for the determinant (both lemmas are checked numerically in the sketch after this entry):
  \[\log|\Sigma + AQA^T| = \log|Q^{-1} + A^T \Sigma^{-1} A| + \log|Q| + \log|\Sigma|\]
  - Parameters
    x (tf.Tensor) – Tensor to multiply the solve by
    sigma (brainiak.matnormal.CovBase) – Covariance object implementing solve and logdet
    A (tf.Tensor) – Factor multiplying the variable we marginalized out
    Q (brainiak.matnormal.CovBase) – Covariance object of the marginalized variable, implementing solve and logdet
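The same check for the marginal (Woodbury) case, again as a NumPy sketch rather than package code:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 5, 3

def randpd(d):
    W = rng.standard_normal((d, d))
    return W @ W.T + d * np.eye(d)

Sigma, Q = randpd(n), randpd(p)
A = rng.standard_normal((n, p))
X = rng.standard_normal((n, 2))
Si, Qi = np.linalg.inv(Sigma), np.linalg.inv(Q)

# matrix inversion (Woodbury) lemma
lhs = np.linalg.solve(Sigma + A @ Q @ A.T, X)
rhs = (Si - Si @ A @ np.linalg.inv(Qi + A.T @ Si @ A) @ A.T @ Si) @ X
assert np.allclose(lhs, rhs)

# matrix determinant lemma
ld = np.linalg.slogdet(Sigma + A @ Q @ A.T)[1]
ld2 = (np.linalg.slogdet(Qi + A.T @ Si @ A)[1]
       + np.linalg.slogdet(Q)[1] + np.linalg.slogdet(Sigma)[1])
assert np.isclose(ld, ld2)
```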
brainiak.matnormal.mnrsa module¶
- class brainiak.matnormal.mnrsa.MNRSA(time_cov, space_cov, n_nureg=5, optimizer='L-BFGS-B', optCtrl=None)¶
  Bases: sklearn.base.BaseEstimator
  Matrix-normal version of RSA.
  The goal of this analysis is to find the covariance of the mapping from some design matrix X to the fMRI signal Y. It does so by marginalizing over the actual mapping (i.e., averaging over the uncertainty in it), which happens to correct a bias that structure in the design matrix imposes on the RSA estimate (see Cai et al., NIPS 2016).
  This implementation makes different choices about the residual covariance relative to brainiak.reprsimil.BRSA: here, the noise covariance is assumed to be Kronecker-separable. Informally, this means that all voxels have the same temporal covariance and all time points have the same spatial covariance. This is in contrast to BRSA, which allows a different temporal covariance for each voxel. On the other hand, the computational efficiencies enabled by this choice allow MNRSA to support a richer class of space and time covariances (anything in brainiak.matnormal.covs).
  For users: in general, if you are worried about each voxel having a different temporal noise structure, you should use brainiak.reprsimil.BRSA. If you are worried about between-voxel correlations or temporal covariance structures that BRSA does not support, you should use MNRSA. A usage sketch follows the method listing below.
  \[\begin{aligned}Y &\sim \mathcal{MN}(0, \Sigma_t + XLL^TX^T + X_0X_0^T, \Sigma_s)\\ U &= LL^T\end{aligned}\]
  - Parameters
    time_cov (subclass of CovBase) – Temporal noise covariance class following the CovBase interface.
    space_cov (subclass of CovBase) – Spatial noise covariance class following the CovBase interface.
    n_nureg (int, default: 5) – Number of nuisance regressors (\(X_0\) above).
    optimizer (string, default: 'L-BFGS-B') – Name of the scipy optimizer to use.
    optCtrl (dict, default: None) – Additional arguments to pass to scipy.optimize.minimize.
  - property L¶
    Cholesky factor of the RSA matrix.
  - fit(X, y, naive_init=True)¶
    Estimate the MNRSA model parameters.
    - Parameters
      X (2d array) – Brain data matrix (TRs by voxels); Y in the math above.
      y (2d array or vector) – Behavior data matrix (TRs by behavioral observations); X in the math above.
      naive_init (bool, default=True) – If True, initialize the RSA matrix from the naive estimate.
  - logp(X, Y)¶
    MNRSA log-likelihood.
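A minimal usage sketch with synthetic data (sizes are arbitrary, the noise-only data is a placeholder, and identity covariances are chosen only to keep the example small):

```python
import numpy as np
from brainiak.matnormal.covs import CovIdentity
from brainiak.matnormal.mnrsa import MNRSA

rng = np.random.default_rng(6)
n_tr, n_vox, n_cond = 60, 20, 4
design = rng.standard_normal((n_tr, n_cond))  # X in the math above
brain = rng.standard_normal((n_tr, n_vox))    # Y in the math above

model = MNRSA(time_cov=CovIdentity(size=n_tr),
              space_cov=CovIdentity(size=n_vox))
# per the fit docstring above: brain data first, then the design
model.fit(brain, design)
L = model.L  # Cholesky factor of the RSA matrix, U = L L^T
```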
brainiak.matnormal.regression module¶
- class brainiak.matnormal.regression.MatnormalRegression(time_cov, space_cov, optimizer='L-BFGS-B', optCtrl=None)¶
  Bases: sklearn.base.BaseEstimator
  This analysis allows maximum likelihood estimation of regression models in the presence of both spatial and temporal covariance:
  \[Y \sim \mathcal{MN}(X\beta, \Sigma_t, \Sigma_s)\]
  where \(\Sigma_t\) is time_cov and \(\Sigma_s\) is space_cov. A usage sketch follows the method listing below.
- Parameters
time_cov (subclass of CovBase) – TR noise covariance class following CovBase interface.
space_cov (subclass of CovBase) – Voxel noise covariance class following CovBase interface.
optimizer (string, default="L-BFGS-B") – Scipy optimizer to use. For other options, see “method” argument of scipy.optimize.minimize
optCtrl (dict, default=None) – Additional arguments to pass to scipy.optimize.minimize.
  - calibrate(Y)¶
    Decode the design matrix from an fMRI dataset, based on a previously trained mapping. This method just does naive MLE:
    \[X = Y \Sigma_s^{-1}B^T(B \Sigma_s^{-1} B^T)^{-1}\]
    - Parameters
      Y (np.array, TRs by voxels) – fMRI dataset
  - fit(X, y, naive_init=True)¶
    Compute the regression fit.
    - Parameters
      X (np.array, TRs by conditions) – Design matrix
      y (np.array, TRs by voxels) – fMRI data
  - logp(X, Y)¶
    Log likelihood of the model (internal).
  - predict(X)¶
    Predict the fMRI signal from a design matrix.
    - Parameters
      X (np.array, TRs by conditions) – Design matrix
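A minimal usage sketch with synthetic data (sizes arbitrary; identity covariances keep the example small):

```python
import numpy as np
from brainiak.matnormal.covs import CovIdentity
from brainiak.matnormal.regression import MatnormalRegression

rng = np.random.default_rng(7)
n_tr, n_vox, n_cond = 80, 30, 3
X = rng.standard_normal((n_tr, n_cond))      # design matrix
beta = rng.standard_normal((n_cond, n_vox))  # ground-truth coefficients
Y = X @ beta + 0.1 * rng.standard_normal((n_tr, n_vox))

model = MatnormalRegression(time_cov=CovIdentity(size=n_tr),
                            space_cov=CovIdentity(size=n_vox))
model.fit(X, Y)
Y_hat = model.predict(X)    # predicted signal, TRs by voxels
X_hat = model.calibrate(Y)  # decode the design back from the data
```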
brainiak.matnormal.utils module¶
- brainiak.matnormal.utils.flatten_cholesky_unique(L)¶
  Flattens the nonzero elements of a Cholesky (triangular) factor into a vector, taking the log of the diagonal to make the parameterization unique. Inverse of unflatten_cholesky_unique.
- brainiak.matnormal.utils.make_val_and_grad(lossfn, train_vars)¶
  Makes a function that outputs the loss and gradient in a format compatible with scipy.optimize.minimize.
- brainiak.matnormal.utils.pack_trainable_vars(trainable_vars)¶
  Pack the trainable variables in a model into a single vector that can be passed to scipy.optimize.
- brainiak.matnormal.utils.rmn(rowcov, colcov)¶
  Generate a random draw from a zero-mean matrix-normal distribution.
  - Parameters
    rowcov (np.ndarray) – Row covariance (assumed to be positive definite)
    colcov (np.ndarray) – Column covariance (assumed to be positive definite)
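For example (a sketch assuming rmn returns an array shaped rows-of-rowcov by rows-of-colcov):

```python
import numpy as np
from brainiak.matnormal.utils import rmn

# AR(1)-like row correlations, independent columns
rowcov = 0.9 ** np.abs(np.subtract.outer(np.arange(10), np.arange(10)))
colcov = np.eye(5)
X = rmn(rowcov, colcov)  # expected shape: (10, 5)
```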
- brainiak.matnormal.utils.scaled_I(x, size)¶
  Scaled identity matrix \(x I_{size}\).
  - Parameters
    x (float or coercible to float) – Scale to multiply the identity matrix by
    size (int or otherwise coercible to a size) – Dimension of the scaled identity matrix to return
- brainiak.matnormal.utils.unflatten_cholesky_unique(L_flat)¶
  Converts a vector of elements into a triangular matrix (Cholesky factor), exponentiating the diagonal to make the parameterization unique. Inverse of flatten_cholesky_unique.
- brainiak.matnormal.utils.unpack_trainable_vars(x, trainable_vars)¶
  Unpack trainable variables from a single vector, as used/returned by scipy.optimize.
- brainiak.matnormal.utils.x_tx(x)¶
  Inner product \(x^T x\).
  - Parameters
    x (tf.Tensor) – Input tensor
- brainiak.matnormal.utils.xx_t(x)¶
  Outer product \(xx^T\).
  - Parameters
    x (tf.Tensor) – Input tensor