# brainiak.utils package¶

Utilities used by multiple subpackages.

## brainiak.utils.fmrisim module¶

fMRI Simulator

Simulate fMRI data for a single subject.

This code provides a set of functions necessary to produce realistic simulations of fMRI data. There are two main steps: characterizing the signal and generating the noise model, which are then combined to simulate brain data. Tools are included to support the creation of different types of signal, such as region specific differences in univariate activity. To create the noise model the parameters can either be set manually or can be estimated from real fMRI data with reasonable accuracy ( works best when fMRI data has not been preprocessed)

Functions:

generate_signal Create a volume with activity, of a specified shape and either multivariate or univariate pattern, in a specific region to represent the signal in the neural data.

generate_stimfunction Create a timecourse of the signal activation. This can be specified using event onsets and durations from a timing file. This is the time course before convolution and therefore can be at any temporal precision.

export_3_column: Generate a three column timing file that can be used with software like FSL to represent event event onsets and duration

export_epoch_file: Generate an epoch file from the time course which can be used as an input to brainiak functions

convolve_hrf Convolve the signal timecourse with the HRF to model the expected evoked activity

apply_signal Combine the signal volume with the HRF, thus giving the signal the temporal properties of the HRF (such as smoothing and lag)

calc_noise Estimate the noise properties of a given fMRI volume. Prominently, estimate the smoothing and SFNR of the data

generate_noise Create the noise for this run. This creates temporal, spatial task and white noise. Various parameters can be tuned depending on need

mask_brain Create a mask volume that has similar contrast as an fMRI image. Defaults to use an MNI grey matter atlas but any image can be supplied to create an estimate.

compute_signal_change Convert the signal function into useful metric units according to metrics used by others (Welvaert & Rosseel, 2013)

Authors: Cameron Ellis (Princeton & Yale) 2016-2018 Chris Baldassano (Princeton) 2016-2017 Mingbo Cai (Princeton) 2017

brainiak.utils.fmrisim.apply_signal(signal_function, volume_signal)

Combine the signal volume with its timecourse

Apply the convolution of the HRF and stimulus time course to the volume.

Parameters
• signal_function (timepoint by timecourse array, float) – The timecourse of the signal over time. If there is only one column then the same timecourse is applied to all non-zero voxels in volume_signal. If there is more than one column then each column is paired with a non-zero voxel in the volume_signal (a 3d numpy array generated in generate_signal).

• volume_signal (multi dimensional array, float) – The volume containing the signal to be convolved with the same dimensions as the output volume. The elements in volume_signal indicate how strong each signal in signal_function are modulated by in the output volume

Returns

signal – The convolved signal volume with the same 3d as volume signal and the same 4th dimension as signal_function

Return type

multidimensional array, float

brainiak.utils.fmrisim.calc_noise(volume, mask, template, noise_dict=None)

Calculates the noise properties of the volume supplied. This estimates what noise properties the volume has. For instance it determines the spatial smoothness, the autoregressive noise, system noise etc. Read the doc string for generate_noise to understand how these different types of noise interact.

Parameters
• volume (4d numpy array, float) – Take in a functional volume (either the file name or the numpy array) to be used to estimate the noise properties of this

• mask (3d numpy array, binary) – A binary mask of the brain, the same size as the volume

• template (3d array, float) – A continuous (0 -> 1) volume describing the likelihood a voxel is in the brain. This can be used to contrast the brain and non brain.

• noise_dict (dict) – The initialized dictionary of the calculated noise parameters of the provided dataset (usually it is only the voxel size)

Returns

noise_dict – Return a dictionary of the calculated noise parameters of the provided dataset

Return type

dict

brainiak.utils.fmrisim.compute_signal_change(signal_function, noise_function, noise_dict, magnitude, method='PSC')

Rescale the signal to be a given magnitude, based on a specified metric (e.g. percent signal change). Metrics are heavily inspired by Welvaert & Rosseel (2013). The rescaling is based on the maximal activity in the timecourse. Importantly, all values within the signal_function are scaled to have a min of -1 or max of 1, meaning that the voxel value will be the same as the magnitude.

Parameters
• signal_function (timepoint by voxel array) – The signal time course to be altered. This can have multiple time courses specified as different columns in this array. Conceivably you could use the output of generate_stimfunction as the input but the temporal variance will be incorrect. Critically, different values across voxels are considered relative to each other, not independently. E.g., if the voxel has a peak signal twice as high as another voxel’s, then this means that the signal after these transformations will still be twice as high (according to the metric) in the first voxel relative to the second

• noise_function (timepoint by voxel numpy array) – The time course of noise (a voxel created from generate_noise) for each voxel specified in signal_function. This is necessary for computing the mean evoked activity and the noise variability

• noise_dict (dict) –

A dictionary specifying the types of noise in this experiment. The noise types interact in important ways. First, autoregressive, physiological and task-based noise types are mixed together in _generate_temporal_noise. The parameter values for ‘auto_reg_sigma’, ‘physiological_sigma’ and ‘task_sigma’ describe the proportion of mixing of these elements, respectively. However critically, ‘SFNR’ is the parameter that controls how much noise these components contribute to the brain. ‘auto_reg_rho’ and ‘ma_rho’ set parameters for the autoregressive noise being simulated. Second, drift noise is added to this, according to the size of ‘drift_sigma’. Thirdly, system noise is added based on the ‘SNR’ parameter. Finally, ‘fwhm’ is used to estimate the smoothness of the noise being inserted. If ‘matched’ is set to true, then it will fit the parameters to match the participant as best as possible.

Variables defined as follows:

snr [float]: Ratio of MR signal to the spatial noise sfnr [float]: Ratio of the MR signal to the temporal noise. This is the total variability that the following sigmas ‘sum’ to:

task_sigma [float]: Size of the variance of task specific noise auto_reg_sigma [float]: Size of the variance of autoregressive noise. This is an ARMA process where the AR and MA components can be separately specified physiological_sigma [float]: Size of the variance of physiological noise

drift_sigma [float]: Size of the variance of drift noise

auto_reg_rho [list]: The coefficients of the autoregressive components you are modeling ma_rho [list]:The coefficients of the moving average components you are modeling max_activity [float]: The max value of the averaged brain in order to reference the template voxel_size [list]: The mm size of the voxels fwhm [float]: The gaussian smoothing kernel size (mm) matched [bool]: Specify whether you are fitting the noise parameters

The volumes of brain noise that are generated have smoothness specified by ‘fwhm’

• magnitude (list of floats) – This specifies the size, in terms of the metric choosen below, of the signal being generated. This can be a single number, and thus apply to all signal timecourses, or it can be array and thus different for each voxel.

• method (str) – Select the procedure used to calculate the signal magnitude, some of which are based on the definitions outlined in Welvaert & Rosseel (2013): - ‘SFNR’: Change proportional to the temporal variability, as represented by the (desired) SFNR - ‘CNR_Amp/Noise-SD’: Signal magnitude relative to the temporal noise - ‘CNR_Amp2/Noise-Var_dB’: Same as above but converted to decibels - ‘CNR_Signal-SD/Noise-SD’: Standard deviation in signal relative to standard deviation in noise - ‘CNR_Signal-Var/Noise-Var_dB’: Same as above but converted to decibels - ‘PSC’: Calculate the percent signal change based on the average activity of the noise (mean / 100 * magnitude)

Returns

signal_function_scaled – The new signal volume with the appropriately set signal change

Return type

4d numpy array

brainiak.utils.fmrisim.convolve_hrf(stimfunction, tr_duration, hrf_type='double_gamma', scale_function=True, temporal_resolution=100.0)

Convolve the specified hrf with the timecourse. The output of this is a downsampled convolution of the stimfunction and the HRF function. If temporal_resolution is 1 / tr_duration then the output will be the same length as stimfunction. This time course assumes that slice time correction has occurred and all slices have been aligned to the middle time point in the TR.

Be aware that if scaling is on and event durations are less than the duration of a TR then the hrf may or may not come out as anticipated. This is because very short events would evoke a small absolute response after convolution but if there are only short events and you scale then this will look similar to a convolution with longer events. In general scaling is useful, which is why it is the default, but be aware of this edge case and if it is a concern, set the scale_function to false.

Parameters
• stimfunction (timepoint by feature array) – What is the time course of events to be modelled in this experiment. This can specify one or more timecourses of events. The events can be weighted or binary

• tr_duration (float) – How long (in s) between each volume onset

• hrf_type (str or list) – Takes in a string describing the hrf that ought to be created. Can instead take in a vector describing the HRF as it was specified by any function. The default is ‘double_gamma’ in which an initial rise and an undershoot are modelled.

• scale_function (bool) – Do you want to scale the function to a range of 1

• temporal_resolution (float) – How many elements per second are you modeling for the stimfunction

Returns

signal_function – The time course of the HRF convolved with the stimulus function. This can have multiple time courses specified as different columns in this array.

Return type

timepoint by timecourse array

brainiak.utils.fmrisim.export_3_column(stimfunction, filename, temporal_resolution=100.0)

Output a tab separated three column timing file

This produces a three column tab separated text file, with the three columns representing onset time (s), event duration (s) and weight, respectively. Useful if you want to run the simulated data through FEAT analyses. In a way, this is the reverse of generate_stimfunction

Parameters
• stimfunction (timepoint by 1 array) – The stimulus function describing the time course of events. For instance output from generate_stimfunction.

• filename (str) – The name of the three column text file to be output

• temporal_resolution (float) – How many elements per second are you modeling with the stimfunction?

brainiak.utils.fmrisim.export_epoch_file(stimfunction, filename, tr_duration, temporal_resolution=100.0)

Output an epoch file, necessary for some inputs into brainiak

This takes in the time course of stimulus events and outputs the epoch file used in Brainiak. The epoch file is a way to structure the timing information in fMRI that allows you to flexibly input different stimulus sequences. This is a list with each entry a 3d matrix corresponding to a participant. The dimensions of the 3d matrix are condition by epoch by time. For the i-th condition, if its k-th epoch spans time points t_m to t_n-1, then [i, k, t_m:t_n] are 1 in the epoch file.

Parameters
• stimfunction (list of timepoint by condition arrays) – The stimulus function describing the time course of events. Each list entry is from a different participant, each row is a different timepoint (with the given temporal precision), each column is a different condition. export_epoch_file is looking for differences in the value of stimfunction to identify the start and end of an epoch. If epochs in stimfunction are coded with the same weight and there is no time between blocks then export_epoch_file won’t be able to label them as different epochs

• filename (str) – The name of the epoch file to be output

• tr_duration (float) – How long is each TR in seconds

• temporal_resolution (float) – How many elements per second are you modeling with the stimfunction?

brainiak.utils.fmrisim.generate_1d_gaussian_rfs(n_voxels, feature_resolution, feature_range, rf_size=15, random_tuning=True, rf_noise=0.0)

Creates a numpy matrix of Gaussian-shaped voxel receptive fields (RFs) along one dimension. Can specify whether they are evenly tiled or randomly tuned along the axis. RF range will be between 0 and 1.

Parameters
• n_voxels (int) – Number of voxel RFs to create.

• feature_resolution (int) – Number of points along the feature axis.

• feature_range (tuple (numeric)) – A tuple indicating the start and end values of the feature range. e.g. (0, 359) for motion directions.

• rf_size (numeric) – Width of the Gaussian receptive field. Should be given in units of the feature dimension. e.g., 15 degrees wide in motion direction space.

• random_tuning (boolean [default True]) – Indicates whether or not the voxels are randomly tuned along the 1D feature axis or whether tuning is evenly spaced.

• rf_noise (float [default 0.]) – Amount of uniform noise to add to the Gaussian RF. This will cause the generated responses to be distorted by the same uniform noise for a given voxel.

Returns

• voxel_rfs (2d numpy array (float)) – The receptive fields in feature space. Dimensions are n_voxels by feature_resolution.

• voxel_tuning (1d numpy array (float)) – The centers of the voxel RFs, in feature space.

brainiak.utils.fmrisim.generate_1d_rf_responses(rfs, trial_list, feature_resolution, feature_range, trial_noise=0.25)

Generates trial-wise data for a given set of receptive fields (RFs) and a 1d array of features presented across trials.

Parameters
• voxel_rfs (2d numpy array (float)) – The receptive fields in feature space. Dimensions must be n_voxels by feature_resolution.

• trial_list (1d numpy array (numeric)) – The feature value of the stimulus presented on individual trials. Array size be n_trials.

• feature_resolution (int) – Number of points along the feature axis.

• feature_range (tuple (numeric)) – A tuple indicating the start and end values of the feature range. e.g. (0, 359) for motion directions.

• trial_noise (float [default 0.25]) – Amount of uniform noise to inject into the synthetic data. This is generated independently for every trial and voxel.

Returns

trial_data – The synthetic data for each voxel and trial. Dimensions are n_voxels by n_trials.

Return type

2d numpy array (float)

brainiak.utils.fmrisim.generate_noise(dimensions, stimfunction_tr, tr_duration, template, mask=None, noise_dict=None, temporal_proportion=0.5, iterations=None, fit_thresh=0.05, fit_delta=0.5)

Generate the noise to be added to the signal. Default noise parameters will create a noise volume with a standard deviation of 0.1 (where the signal defaults to a value of 1). This has built into estimates of how different types of noise mix. All noise values can be set by the user or estimated with calc_noise.

Parameters
• dimensions (nd array) – What is the shape of the volume to be generated

• stimfunction_tr (Iterable, list) – When do the stimuli events occur. Each element is a TR

• tr_duration (float) – What is the duration, in seconds, of each TR?

• template (3d array, float) – A continuous (0 -> 1) volume describing the likelihood a voxel is in the brain. This can be used to contrast the brain and non brain.

• mask (3d array, binary) – The mask of the brain volume, distinguishing brain from non-brain

• noise_dict (dict) –

A dictionary specifying the types of noise in this experiment. The noise types interact in important ways. First, autoregressive, physiological and task-based noise types are mixed together in _generate_temporal_noise. The parameter values for ‘auto_reg_sigma’, ‘physiological_sigma’ and ‘task_sigma’ describe the proportion of mixing of these elements, respectively. However critically, ‘SFNR’ is the parameter that controls how much noise these components contribute to the brain. ‘auto_reg_rho’ and ‘ma_rho’ set parameters for the autoregressive noise being simulated. Second, drift noise is added to this, according to the size of ‘drift_sigma’. Thirdly, system noise is added based on the ‘SNR’ parameter. Finally, ‘fwhm’ is used to estimate the smoothness of the noise being inserted. If ‘matched’ is set to true, then it will fit the parameters to match the participant as best as possible.

Variables defined as follows:

snr [float]: Ratio of MR signal to the spatial noise sfnr [float]: Ratio of the MR signal to the temporal noise. This is the total variability that the following sigmas ‘sum’ to:

task_sigma [float]: Size of the variance of task specific noise auto_reg_sigma [float]: Size of the variance of autoregressive noise. This is an ARMA process where the AR and MA components can be separately specified physiological_sigma [float]: Size of the variance of physiological noise

drift_sigma [float]: Size of the variance of drift noise

auto_reg_rho [list]: The coefficients of the autoregressive components you are modeling ma_rho [list]:The coefficients of the moving average components you are modeling max_activity [float]: The max value of the averaged brain in order to reference the template voxel_size [list]: The mm size of the voxels fwhm [float]: The gaussian smoothing kernel size (mm) matched [bool]: Specify whether you are fitting the noise parameters

The volumes of brain noise that are generated have smoothness specified by ‘fwhm’

• temporal_proportion – What is the proportion of the temporal variance (as specified by the SFNR noise parameter) that is accounted for by the system noise. If this number is high then all of the temporal variability is due to system noise, if it is low then all of the temporal variability is due to brain variability.

• float – What is the proportion of the temporal variance (as specified by the SFNR noise parameter) that is accounted for by the system noise. If this number is high then all of the temporal variability is due to system noise, if it is low then all of the temporal variability is due to brain variability.

• iterations (list, int) – The first element is how many steps of fitting the SFNR and SNR values will be performed. Usually converges after < 5. The second element is the number of iterations for the AR fitting. This is much more time consuming (has to make a new timecourse on each iteration) so be careful about setting this appropriately.

• fit_thresh (float) – What proportion of the target parameter value is sufficient error to warrant finishing fit search.

• fit_delta (float) – How much are the parameters attenuated during the fitting process, in terms of the proportion of difference between the target parameter and the actual parameter

Returns

noise – Generates the noise volume for these parameters

Return type

multidimensional array, float

brainiak.utils.fmrisim.generate_signal(dimensions, feature_coordinates, feature_size, feature_type, signal_magnitude=[1], signal_constant=1)

Generate volume containing signal

Generate signal, of a specific shape in specific regions, for a single volume. This will then be convolved with the HRF across time

Parameters
• dimensions (1d array, ndarray) – What are the dimensions of the volume you wish to create

• feature_coordinates (multidimensional array) – What are the feature_coordinates of the signal being created. Be aware of clipping: features far from the centre of the brain will be clipped. If you wish to have multiple features then list these as a features x 3 array. To create a feature of a unique shape then supply all the individual feature_coordinates of the shape and set the feature_size to 1.

• feature_size (list, int) – How big is the signal. If feature_coordinates=1 then only one value is accepted, if feature_coordinates>1 then either one value must be supplied or m values

• feature_type (list, string) – What feature_type of signal is being inserted? Options are cube, loop, cavity, sphere. If feature_coordinates = 1 then only one value is accepted, if feature_coordinates > 1 then either one value must be supplied or m values

• signal_magnitude (list, float) – What is the (average) magnitude of the signal being generated? A value of 1 means that the signal is one standard deviation from the noise

• signal_constant (list, bool) – Is the signal constant across the feature (for univariate activity) or is it a random pattern of a given magnitude across the feature (for multivariate activity)

Returns

volume_signal – Creates a single volume containing the signal

Return type

3 dimensional array, float

brainiak.utils.fmrisim.generate_stimfunction(onsets, event_durations, total_time, weights=[1], timing_file=None, temporal_resolution=100.0)

Return the function for the timecourse events

When do stimuli onset, how long for and to what extent should you resolve the fMRI time course. There are two ways to create this, either by supplying onset, duration and weight information or by supplying a timing file (in the three column format used by FSL).

Parameters
• onsets (list, int) – What are the timestamps (in s) for when an event you want to generate onsets?

• event_durations (list, int) – What are the durations (in s) of the events you want to generate? If there is only one value then this will be assigned to all onsets

• total_time (int) – How long (in s) is the experiment in total.

• weights (list, float) – What is the weight for each event (how high is the box car)? If there is only one value then this will be assigned to all onsets

• timing_file (string) – The filename (with path) to a three column timing file (FSL) to make the events. Still requires total_time to work

• temporal_resolution (float) – How many elements per second are you modeling for the timecourse. This is useful when you want to model the HRF at an arbitrarily high resolution (and then downsample to your TR later).

Returns

stim_function – The time course of stimulus evoked activation. This has a temporal resolution of temporal resolution / 1.0 elements per second

Return type

1 by timepoint array, float

brainiak.utils.fmrisim.mask_brain(volume, template_name=None, mask_threshold=None, mask_self=True)

Mask the simulated volume This creates a mask specifying the approximate likelihood that a voxel is part of the brain. All values are bounded to the range of 0 to 1. An appropriate threshold to isolate brain voxels is >0.2. Critically, the data that should be used to create a template shouldn’t already be masked/skull stripped. If it is then it will give in accurate estimates of non-brain noise and corrupt estimations of SNR.

Parameters
• volume (multidimensional array) – Either numpy array of a volume or a tuple describing the dimensions of the mask to be created

• template_name (str) – What is the path to the template to be loaded? If empty then it defaults to an MNI152 grey matter mask. This is ignored if mask_self is True.

• mask_threshold (float) – What is the threshold (0 -> 1) for including a voxel in the mask? If None then the program will try and identify the last wide peak in a histogram of the template (assumed to be the brain voxels) and takes the minima before that peak as the threshold. Won’t work when the data is not bimodal.

• mask_self (bool or None) – If set to true then it makes a mask from the volume supplied (by averaging across time points and changing the range). If it is set to false then it will use the template_name as an input.

Returns

• mask (3 dimensional array, binary) – The masked brain, thresholded to distinguish brain and non-brain

• template (3 dimensional array, float) – A continuous (0 -> 1) volume describing the likelihood a voxel is in the brain. This can be used to contrast the brain and non brain.

## brainiak.utils.fmrisim_real_time_generator module¶

This code can be run as a function in python or from the command line: python fmrisim_real-time_generator –outputDir data

The input arguments are: Required: outputDir - Specify output data dir where the data should be saved

Optional (can be modified by flags from the command line): data_dict contains: numTRs - Specify the number of time points multivariate_patterns - Is the difference between conditions univariate (0) or multivariate (1) different_ROIs - Are there different ROIs for each condition (1) or is it in the same ROI (0). If it is the same ROI and you are using univariate differences, the second condition will have a smaller evoked response than the other. event_duration - How long, in seconds, is each event scale_percentage - What is the percent signal change trDuration - How many seconds per volume save_dicom - Do you want to save data as a dicom (1) or numpy (0) save_realtime - Do you want to save the data in real time (1) or as fast as possible (0)? isi - What is the time between each event (in seconds) burn_in - How long before the first event (in seconds)

brainiak.utils.fmrisim_real_time_generator.generate_data(outputDir, user_settings)

Generate simulated fMRI data Use a few parameters that might be relevant for real time analysis

Parameters
• outputDir (str) – Specify output data dir where the data should be saved

• user_settings (dict) – A dictionary to specify the parameters used for making data, specifying the following keys numTRs - int - Specify the number of time points multivariate_patterns - bool - Is the difference between conditions univariate (0) or multivariate (1) different_ROIs - bool - Are there different ROIs for each condition ( 1) or is it in the same ROI (0). If it is the same ROI and you are using univariate differences, the second condition will have a smaller evoked response than the other. event_duration - int - How long, in seconds, is each event scale_percentage - float - What is the percent signal change trDuration - float - How many seconds per volume save_dicom - bool - Save to data as a dicom (1) or numpy (0) save_realtime - bool - Do you want to save the data in real time (1) or as fast as possible (0)? isi - float - What is the time between each event (in seconds) burn_in - int - How long before the first event (in seconds)

## brainiak.utils.kronecker_solvers module¶

brainiak.utils.kronecker_solvers.tf_kron_mult(L, x)

Tensorflow multiply with kronecker product matrix Returns kron(L[0], L[1] …) * x

Parameters
• L (list of 2-D tensors) – Each element of the list must be a tensorflow tensor and must be a square matrix of dimension n_i x n_i

• x (1-D or 2-D tensor) – Dimension (n_0*n_1*..n_(m-1)) x p

Returns

y – Dimension (n_0*n_1*..n_(m-1)) x p

Return type

1-D or 2-D tensor

brainiak.utils.kronecker_solvers.tf_masked_triangular_solve(L, y, mask, lower=True, adjoint=False)

Tensorflow function to solve L x = y where L is a lower triangular matrix with a mask

Parameters
• L (2-D tensor) – Must be a tensorflow tensor and must be a triangular matrix of dimension n x n

• y (1-D or 2-D tensor) – Dimension n x p

• mask (1-D tensor) – Dimension n x 1, should be 1 if element is valid, 0 if invalid

• lower (boolean (default : True)) – True if L is lower triangular, False if upper triangular

• adjoint (boolean (default : False)) – True if solving for L^T x = y, False if solving for Lx = y

Returns

x – Dimension n x p, values at rows for which mask == 0 are set to zero

Return type

1-D or 2-D tensor

## brainiak.utils.utils module¶

class brainiak.utils.utils.ReadDesign(fname=None, include_orth=True, include_pols=True)

Bases: object

A class which has the ability of reading in design matrix in .1D file,

generated by AFNI’s 3dDeconvolve.

Parameters

• include_orth (Boollean, whether to include "orthogonal" regressors in) – the nuisance regressors which are usually head motion parameters. All the columns of design matrix are still going to be read in, but the attribute cols_used will reflect whether these orthogonal regressors are to be included for furhter analysis. Note that these are not entered into design_task attribute which include only regressors related to task conditions.

• include_pols (Boollean, whether to include polynomial regressors in) – the nuisance regressors which are used to capture slow drift of signals.

design
Type

2d array. The design matrix read in from the csv file.

design_task

Type

2d array. The part of design matrix corresponding to

n_col
Type

number of total columns in the design matrix.

column_types

0 for orthogonal regressors (usually head motion parameters), -1 for polynomial basis (capturing slow drift of signals), values > 0 for stimulus conditions

Type

1d array. the types of each column in the design matrix.

n_basis
Type

scalar. The number of polynomial bases in the designn matrix.

n_stim
Type

scalar. The number of stimulus conditions.

n_orth

motions)

Type

scalar. The number of orthogoanal regressors (usually head

StimLabels
Type

list. The names of each column in the design matrix.

read_afni(fname)
brainiak.utils.utils.array_correlation(x, y, axis=0)

Column- or row-wise Pearson correlation between two arrays

Computes sample Pearson correlation between two 1D or 2D arrays (e.g., two n_TRs by n_voxels arrays). For 2D arrays, computes correlation between each corresponding column (axis=0) or row (axis=1) where axis indexes observations. If axis=0 (default), each column is considered to be a variable and each row is an observation; if axis=1, each row is a variable and each column is an observation (equivalent to transposing the input arrays). Input arrays must be the same shape with corresponding variables and observations. This is intended to be an efficient method for computing correlations between two corresponding arrays with many variables (e.g., many voxels).

Parameters
• x (1D or 2D ndarray) – Array of observations for one or more variables

• y (1D or 2D ndarray) – Array of observations for one or more variables (same shape as x)

• axis (int (0 or 1), default: 0) – Correlation between columns (axis=0) or rows (axis=1)

Returns

r – Pearson correlation values for input variables

Return type

float or 1D ndarray

brainiak.utils.utils.center_mass_exp(interval, scale=1.0)
Calculate the center of mass of negative exponential distribution

p(x) = exp(-x / scale) / scale in the interval of (interval_left, interval_right). scale is the same scale parameter as scipy.stats.expon.pdf

Parameters
• interval (size 2 tuple, float) – interval must be in the form of (interval_left, interval_right), where interval_left/interval_right is the starting/end point of the interval in which the center of mass is calculated for exponential distribution. Note that interval_left must be non-negative, since exponential is not supported in the negative domain, and interval_right must be bigger than interval_left (thus positive) to form a well-defined interval.

• scale (float, positive) – The scale parameter of the exponential distribution. See above.

Returns

m – The center of mass in the interval of (interval_left, interval_right) for exponential distribution.

Return type

float

brainiak.utils.utils.circ_dist(x, y)

Computes the pairwise circular distance between two arrays of points (in radians).

Parameters
• x (numpy vector of positions on a circle, in radians.) –

• y (numpy vector of positions on a circle, in radians.) –

Returns

r

Return type

numpy vector of distances between inputs.

brainiak.utils.utils.concatenate_not_none(data, axis=0)

Construct a numpy array by stacking not-None arrays in a list

Parameters
• data (list of arrays) – The list of arrays to be concatenated. Arrays have same shape in all but one dimension or are None, in which case they are ignored.

• axis (int, default = 0) – Axis for the concatenation

Returns

data_stacked – The resulting concatenated array.

Return type

array

brainiak.utils.utils.cov2corr(cov)
Calculate the correlation matrix based on a

covariance matrix

Parameters

cov (2D array) –

Returns

corr – correlation converted from the covarince matrix

Return type

2D array

brainiak.utils.utils.from_sym_2_tri(symm)
convert a 2D symmetric matrix to an upper

triangular matrix in 1D format

Parameters

symm (2D array) – Symmetric matrix

Returns

tri – Contains elements of upper triangular matrix

Return type

1D array

brainiak.utils.utils.from_tri_2_sym(tri, dim)
convert a upper triangular matrix in 1D format

to 2D symmetric matrix

Parameters
• tri (1D array) – Contains elements of upper triangular matrix

• dim (int) – The dimension of target matrix.

Returns

symm – Symmetric matrix in shape=[dim, dim]

Return type

2D array

brainiak.utils.utils.gen_design(stimtime_files, scan_duration, TR, style='FSL', temp_res=0.01, hrf_para={'response_delay': 6, 'response_dispersion': 0.9, 'undershoot_delay': 12, 'undershoot_dispersion': 0.9, 'undershoot_scale': 0.035})
Generate design matrix based on a list of names of stimulus

timing files. The function will read each file, and generate a numpy array of size [time_points * condition], where time_points equals duration / TR, and condition is the size of stimtime_filenames. Each column is the hypothetical fMRI response based on the stimulus timing in the corresponding file of stimtime_files. This function uses generate_stimfunction and double_gamma_hrf of brainiak.utils.fmrisim.

Parameters
• stimtime_files (a string or a list of string.) – Each string is the name of the file storing the stimulus timing information of one task condition. The contents in the files will be interpretated based on the style parameter. Details are explained under the style parameter.

• scan_duration (float or a list (or a 1D numpy array) of numbers.) – Total duration of each fMRI scan, in unit of seconds. If there are multiple runs, the duration should be a list (or 1-d numpy array) of numbers. If it is a list, then each number in the list represents the duration of the corresponding scan in the stimtime_files. If only a number is provided, it is assumed that there is only one fMRI scan lasting for scan_duration.

• TR (float.) – The sampling period of fMRI, in unit of seconds.

• style (string, default: 'FSL') –

Acceptable inputs: ‘FSL’, ‘AFNI’ The formating style of the stimtime_files. ‘FSL’ style has one line for each event of the same condition. Each line contains three numbers. The first number is the onset of the event relative to the onset of the first scan, in units of seconds. (Multiple scans should be treated as a concatenated long scan for the purpose of calculating onsets. However, the design matrix from one scan won’t leak into the next). The second number is the duration of the event, in unit of seconds. The third number is the amplitude modulation (or weight) of the response. It is acceptable to not provide the weight, or not provide both duration and weight. In such cases, these parameters will default to 1.0. This code will accept timing files with only 1 or 2 columns for convenience but please note that the FSL package does not allow this

’AFNI’ style has one line for each scan (run). Each line has a few triplets in the format of stim_onsets*weight:duration (or simpler, see below), separated by spaces. For example, 3.2*2.0:1.5 means that one event starts at 3.2s, modulated by weight of 2.0 and lasts for 1.5s. If some run does not include a single event of a condition (stimulus type), then you can put *, or a negative number, or a very large number in that line. Either duration or weight can be neglected. In such cases, they will default to 1.0. For example, 3.0, 3.0*1.0, 3.0:1.0 and 3.0*1.0:1.0 all means an event starting at 3.0s, lasting for 1.0s, with amplitude modulation of 1.0.

• temp_res (float, default: 0.01) – Temporal resolution of fMRI, in second.

• hrf_para (dictionary) – The parameters of the double-Gamma hemodynamic response function. To set different parameters, supply a dictionary with the same set of keys as the default, and replace the corresponding values with the new values.

Returns

design – design matrix. Each time row represents one TR (fMRI sampling time point) and each column represents one experiment condition, in the order in stimtime_files

Return type

2D numpy array

brainiak.utils.utils.p_from_null(observed, distribution, side='two-sided', exact=False, axis=None)

Compute p-value from null distribution

Returns the p-value for an observed test statistic given a null distribution. Performs either a ‘two-sided’ (i.e., two-tailed) test (default) or a one-sided (i.e., one-tailed) test for either the ‘left’ or ‘right’ side. For an exact test (exact=True), does not adjust for the observed test statistic; otherwise, adjusts for observed test statistic (prevents p-values of zero). If a multidimensional distribution is provided, use axis argument to specify which axis indexes resampling iterations.

The implementation is based on the work in [PhipsonSmyth2010].

PhipsonSmyth2010

“Permutation p-values should never be zero: calculating exact p-values when permutations are randomly drawn.”, B. Phipson, G. K., Smyth, 2010, Statistical Applications in Genetics and Molecular Biology, 9, 1544-6115. https://doi.org/10.2202/1544-6115.1585

Parameters
• observed (float) – Observed test statistic

• distribution (ndarray) – Null distribution of test statistic

• side (str, default: 'two-sided') – Perform one-sided (‘left’ or ‘right’) or ‘two-sided’ test

• axis (None or int, default: None) – Axis indicating resampling iterations in input distribution

Returns

p – p-value for observed test statistic based on null distribution

Return type

float

brainiak.utils.utils.phase_randomize(data, voxelwise=False, random_state=None)

Randomize phase of time series across subjects

For each subject, apply Fourier transform to voxel time series and then randomly shift the phase of each frequency before inverting back into the time domain. This yields time series with the same power spectrum (and thus the same autocorrelation) as the original time series but will remove any meaningful temporal relationships among time series across subjects. By default (voxelwise=False), the same phase shift is applied across all voxels; however if voxelwise=True, different random phase shifts are applied to each voxel. The typical input is a time by voxels by subjects ndarray. The first dimension is assumed to be the time dimension and will be phase randomized. If a 2-dimensional ndarray is provided, the last dimension is assumed to be subjects, and different phase randomizations will be applied to each subject.

The implementation is based on the work in [Lerner2011] and [Simony2016].

Parameters
• data (ndarray (n_TRs x n_voxels x n_subjects)) – Data to be phase randomized (per subject)

• voxelwise (bool, default: False) – Apply same (False) or different (True) randomizations across voxels

• random_state (RandomState or an int seed (0 by default)) – A random number generator instance to define the state of the random permutations generator.

Returns

shifted_data – Phase-randomized time series

Return type

ndarray (n_TRs x n_voxels x n_subjects)

brainiak.utils.utils.sumexp_stable(data)

Compute the sum of exponents for a list of samples

Parameters

data (array, shape=[features, samples]) – A data array containing samples.

Returns

• result_sum (array, shape=[samples,]) – The sum of exponents for each sample divided by the exponent of the maximum feature value in the sample.

• max_value (array, shape=[samples,]) – The maximum feature value for each sample.

• result_exp (array, shape=[features, samples]) – The exponent of each element in each sample divided by the exponent of the maximum feature value in the sample.

Note

This function is more stable than computing the sum(exp(v)). It useful for computing the softmax_i(v)=exp(v_i)/sum(exp(v)) function.

brainiak.utils.utils.usable_cpu_count()

Get number of CPUs usable by the current process.

Takes into consideration cpusets restrictions.

Returns

Return type

int