Example script to demonstrate fmrisim functionality. This generates data for a two condition, event-related design in which each condition evokes different activity within the same voxels. It then runs simple univariate and multivariate analyses on the data.

If you would like a script that can be executed from the command line and scaffolds the simulation process, refer to the `fmrisim_real_time_generator.py`

script in the `brainiak/utils`

folder.

Ellis, C. T., Baldassano, C., Schapiro, A. C., Cai, M. B., Cohen, J. D. (2020). Facilitating open-science with realistic fMRI simulation: validation and application.

*PeerJ*8:e8564`link`

*Describes and validates the fmrisim method. Applies it to a dataset to test alternative design parameters and evaluate how these parameters influence the effect size*Ellis, C. T., Lesnick, M., Henselman-Petrusek, G., Keller, B., & Cohen, J. D. (2019). Feasibility of topological data analysis for event-related fMRI, Network Neuroscience, 1-12

`link`

*Example of using fmrisim to evaluate the plausibility of an analysis procedure under different signal parameters and design constraints*Kumar, S., Ellis, C., O'Connell, T. P., Chun, M. M., & Turk-Browne, N. B. (in press). Searching through functional space reveals distributed visual, auditory, and semantic coding in the human brain.

*PLoS Computational Biology*`link`

*Example of using fmrisim to test different possible neural bases for an observed effect in real data*Welvaert, M., et al. (2011) neuRosim: An R package for generating fMRI data.

*Journal of Statistical Software*44, 1-18`link`

*A package in R for simulating fMRI data that was an inspiration for fmrisim*

- 1.1 Import necessary Python packages
- 1.2 Load participant data
- 1.3 Specify participant dimensions and resolution
- 1.4 Generate an activity template and a mask
- 1.5 Determine noise parameters

- 2.1 Create temporal noise
- 2.2 Create system noise
- 2.3 Combine noise and template
- 2.4 Fit the data to the noise parameters

- 3.1 Specify which voxels in the brain contain signal
- 3.2 Characterize signal for voxels
- 3.3 Generate event time course
- 3.4 Export stimulus time course for analysis
- 3.5 Estimate the voxel weight for each event
- 3.6 Convolve each voxel’s time course with the Hemodynamic Response Function
- 3.7 Establish signal magnitude
- 3.8 Multiply the convolved response with the signal voxels
- 3.9 Combine signal and noise

- 4.1 Pull out data for each trial
- 4.2 Represent the data
- 4.3 Test for univariate effect
- 4.4 Test for a multivariate effect

Authors: Cameron Ellis (Yale) 2018

It is necessary to set various parameters that describe how the signal and the noise will be generated.

In [1]:

```
%matplotlib notebook
from pathlib import Path
from brainiak.utils import fmrisim
import nibabel
import numpy as np
import matplotlib.pyplot as plt
import scipy.ndimage as ndimage
import scipy.spatial.distance as sp_distance
import sklearn.manifold as manifold
import scipy.stats as stats
import sklearn.model_selection
import sklearn.svm
```

Any 4 dimensional fMRI data that is readible by nibabel can be used as input to this pipeline. For this example, data is taken from the open access repository DataSpace: http://arks.princeton.edu/ark:/88435/dsp01dn39x4181. This file is unzipped and placed in the home directory with the name Corr_MVPA

In [2]:

```
home = str(Path.home())
nii = nibabel.load(home + '/Corr_MVPA/Participant_01_rest_run01.nii')
volume = nii.get_data()
```

In [3]:

```
dim = volume.shape # What is the size of the volume
dimsize = nii.header.get_zooms() # Get voxel dimensions from the nifti header
tr = dimsize[3]
if tr > 100: # If high then these values are likely in ms and so fix it
tr /= 1000
print('Volume dimensions:', dim)
print('TR duration: %0.2fs' % tr)
```

Volume dimensions: (64, 64, 27, 294) TR duration: 1.50s

*1.4 Generate an activity template and a mask*

Functions in fmrisim require a continuous map that describes the appropriate average MR value for each voxel in the brain and a mask which specifies voxels in the brain versus voxels outside of the brain. One way to generate both of these volumes is the mask_brain function. At a minimum, this takes as an input the fMRI volume to be simulated. To create the template this volume is averaged over time and bounded to a range from 0 to 1. In other words, voxels with a high value in the template have high activity over time. To create a mask, the template is thresholded. This threshold can be set manually or instead an appropriate value can be determined by looking for the minima between the two first peaks in the histogram of voxel values. If you would prefer, you could use the compute_epi_mask function in nilearn which uses a similar method.

In [4]:

```
mask, template = fmrisim.mask_brain(volume=volume,
mask_self=True,
)
```

*1.5 Determine noise parameters*

A critical step in the fmrisim toolbox is determining the noise parameters of the volume to be created. Many noise parameters are available for specification and if any are not set then they will default to reasonable values. As mentioned before, it is instead possible to provide raw fMRI data that will be used to estimate these noise parameters. The goal of the noise estimation is to calculate general descriptive statistics about the noise in the brain that are thought to be important. The simulations are then useful for understanding how signals will survive analyses when embedded in realistic neural noise.

Now the disclaimers: the values here are only an estimate and will depend on noise properties combining in the ways assumed. In addition, because of the non-linearity and stochasticity of this simulation, this estimation is not fully invertible: if you generate a dataset with a set of noise parameters it will have similar but not the same noise parameters as a result. Moreover, complex interactions between brain regions that likely better describe brain noise are not modelled here: this toolbox pays no attention to regions of the brain or their interactions. Finally, for best results use raw fMRI because if the data has been preprocessed then assumptions this algorithm makes are likely to be erroneous. For instance, if the brain has been masked then this will eliminate variance in non-brain voxels which will mean that calculations of noise dependent on those voxels as a reference will fail.

To ameliorate some of these concerns, it is possible to fit the spatial and temporal noise properties of the data. This iterates over the noise generation process and tunes parameters in order to match those that are provided. This is time consuming (especially for fitting the temporal noise) but is helpful in matching the specified noise properties.

This toolbox separates noise in two: spatial noise and temporal noise. To estimate spatial noise both the smoothness and the amount of non-brain noise of the data must be quantified. For smoothness, the Full Width Half Max (FWHM) of the volume is averaged for the X, Y and Z dimension and then averaged across a sample of time points. To calculate the Signal to Noise Ratio (SNR) the mean activity in brain voxels for the middle time point is divided by the standard deviation in activity across non-brain voxels for that time point. For temporal noise an auto-regressive and moving average (ARMA) process is estimated, along with the overall size of temporal variability. A sample of brain voxels is used to estimate the first AR component and the first MA component of each voxel's activity over time using the statsmodels package. The Signal to Fluctuation Noise Ratio (SFNR) is calculated by dividing the average activity of voxels in the brain with that voxel’s noise (Friedman & Glover, 2006). That noise is calculated by taking the standard deviation of that voxel over time after it has been detrended with a second order polynomial. The SFNR then controls the amount of functional variability. Other types of noise can be generated, such as physiological noise, but are not estimated by this function.

In [5]:

```
# Calculate the noise parameters from the data. Set it up to be matched.
noise_dict = {'voxel_size': [dimsize[0], dimsize[1], dimsize[2]], 'matched': 1}
noise_dict = fmrisim.calc_noise(volume=volume,
mask=mask,
template=template,
noise_dict=noise_dict,
)
```

In [6]:

```
print('Noise parameters of the data were estimated as follows:')
print('SNR: ' + str(noise_dict['snr']))
print('SFNR: ' + str(noise_dict['sfnr']))
print('FWHM: ' + str(noise_dict['fwhm']))
```

fmrisim can generate realistic fMRI noise when supplied with the appropriate inputs. A single function receives these inputs and deals with generating the noise. The necessary code to run this is in the next cell. For clarity, we walk through the steps of how this simulation is performed.

In [7]:

```
# Calculate the noise given the parameters
noise = fmrisim.generate_noise(dimensions=dim[0:3],
tr_duration=int(tr),
stimfunction_tr=[0] * dim[3],
mask=mask,
template=template,
noise_dict=noise_dict,
)
```

In [8]:

```
# Plot a slice through the noise brain
plt.figure()
plt.title('Axial slice from template used for generating noise')
plt.imshow(noise[:, :, int(dim[2] / 2), 0], cmap=plt.cm.gray)
plt.axis('off')
txt="This template is generated by averaging the real functional data in time"
plt.figtext(0.5, 0.01, txt, wrap=True, horizontalalignment='center', fontsize=12);
```

The temporal noise of fMRI data is comprised of multiple components: drift, autoregression, task related motion and physiological noise. To estimate drift, cosine basis functions are combined, with longer runs being comprised of more basis functions (Welvaert, et al., 2011). This drift is then multiplied by a three-dimensional volume of Gaussian random fields of a specific FWHM. Autoregression noise is estimated by initializing with a brain shaped volume of gaussian random fields and then multiplying then creating an ARMA time course by adding additional volumes of noise. Physiological noise is modeled by sine waves comprised of heart rate (1.17Hz) and respiration rate (0.2Hz) (Biswal, et al., 1996) with random phase. This time course is also multiplied by brain shaped spatial noise. Finally, task related noise is simulated by adding Gaussian or Rician noise to time points where there are events (according to the event time course) and in turn this is multiplied by a brain shaped spatial noise volume. These four noise components are then mixed together in proportion to the size of their corresponding sigma values. This aggregated volume is then Z scored and the SFNR is used to estimate the appropriate standard deviation of these values across time.

In [9]:

```
# Plot spatial noise
low_spatial = fmrisim._generate_noise_spatial(dim[0:3],
fwhm=4.0,
)
high_spatial = fmrisim._generate_noise_spatial(dim[0:3],
fwhm=1.0,
)
plt.figure()
plt.subplot(1,2,1)
plt.title('FWHM = 4.0')
plt.imshow(low_spatial[:, :, 12])
plt.axis('off')
plt.subplot(1,2,2)
plt.title('FWHM = 1.0')
plt.imshow(high_spatial[:, :, 12])
plt.axis('off')
txt="Slices through the volume for different amounts of spatial noise. FWHM stands for full width half maximum, referencing the width of the Gaussian distribution used to simulate the data"
plt.figtext(0.5, 0.01, txt, wrap=True, horizontalalignment='center', fontsize=12);
```

In [10]:

```
# Create the different types of noise
total_time = 500
timepoints = list(range(0, total_time, int(tr)))
drift = fmrisim._generate_noise_temporal_drift(total_time,
int(tr),
)
mini_dim = np.array([2, 2, 2])
autoreg = fmrisim._generate_noise_temporal_autoregression(timepoints,
noise_dict,
mini_dim,
np.ones(mini_dim),
)
phys = fmrisim._generate_noise_temporal_phys(timepoints,
)
stimfunc = np.zeros((int(total_time / tr), 1))
stimfunc[np.random.randint(0, int(total_time / tr), 50)] = 1
task = fmrisim._generate_noise_temporal_task(stimfunc,
)
```

In [11]:

```
# Plot the different noise types
plt.figure()
plt.title('Noise types')
def clean_axis(ax):
# Remove the borders and ticks of the plots (but different than plt.axis('off') since you can set a label)
ax.spines["top"].set_visible(False)
ax.spines["bottom"].set_visible(False)
ax.spines["left"].set_visible(False)
ax.spines["right"].set_visible(False)
plt.xticks([])
plt.yticks([])
ax = plt.subplot(4, 1, 1)
plt.plot(drift)
clean_axis(ax)
plt.ylabel('Drift')
ax = plt.subplot(4, 1, 2)
plt.plot(autoreg[0, 0, 0, :])
clean_axis(ax)
plt.ylabel('AR')
ax = plt.subplot(4, 1, 3)
plt.plot(phys)
clean_axis(ax)
plt.ylabel('Physio')
ax = plt.subplot(4, 1, 4)
plt.plot(task)
clean_axis(ax)
plt.ylabel('Task')
txt="Time course of different noise properties that are usable in fmrisim."
plt.figtext(0.5, 0.01, txt, wrap=True, horizontalalignment='center', fontsize=12);
```