The functional connectivity methods that we used in previous notebooks compared time series of BOLD activity between voxels within a participant to infer how different regions of the brain were interacting. However, BOLD activity contains multiple components (Figure a):
1. stimulus-evoked signal that is shared across participants experiencing the same stimulus
2. intrinsic fluctuations that are idiosyncratic to each participant
3. noise (e.g., from the scanner or physiology)
In this notebook, we will consider methods that combine data across participants to eliminate #2 and #3 when calculating fMRI reliability (intersubject correlation, ISC; Hasson et al., 2004) and connectivity (intersubject functional correlation, ISFC; Simony et al., 2016). ISC and ISFC help isolate #1 because it is the only component that ought to be shared across participants.
Figure b,c show how ISC differs from functional connectivity: rather than correlating brain regions within a participant, which preserves participant-specific activity and noise, ISC correlates activity between the brains of different participants, capturing only the activity that is shared. In ISC, this correlation is computed for every voxel in one brain against the matching voxel in other brains, producing a whole-brain map. Figure e shows this as the diagonal of a correlation matrix, where each cell corresponds to a voxel in subject X correlated with the same anatomical voxel in subject Y. In practice, to simplify the computation and the interpretation, it is typical for ISC to compare each individual participant with the average of all other participants.
Figure d shows ISFC: the correlation of every voxel in one participant with every other voxel in another participant (or the average of the other participants). This is like FCMA, except it is computed between participants rather than within participants. In fact, these analyses use the same computational tricks. ISFC is valuable because it allows us to identify coupled activity in voxels that are not anatomically aligned across participants: the off-diagonal cells in Figure e represent correlations between voxels in different parts of the brain.
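To make the distinction concrete, here is a minimal numpy sketch on toy data (not the BrainIAK implementation used later in this notebook). It computes leave-one-out ISC and ISFC for a (TRs, voxels, subjects) array, which is also the format BrainIAK expects; the diagonal of each subject's ISFC matrix equals that subject's ISC values.

import numpy as np

# Toy data: (n_TRs, n_voxels, n_subjects)
n_trs, n_voxels, n_subjects = 100, 5, 10
data = np.random.randn(n_trs, n_voxels, n_subjects)

isc_loo = np.zeros((n_subjects, n_voxels))              # ISC: one value per voxel per subject
isfc_loo = np.zeros((n_subjects, n_voxels, n_voxels))   # ISFC: full voxel-by-voxel matrix per subject

for s in range(n_subjects):
    subj = data[:, :, s]                                              # left-out subject
    others = data[:, :, np.arange(n_subjects) != s].mean(axis=2)      # average of remaining subjects
    for v in range(n_voxels):
        # ISC: same voxel in the subject vs. the group average (diagonal in Figure e)
        isc_loo[s, v] = np.corrcoef(subj[:, v], others[:, v])[0, 1]
        for w in range(n_voxels):
            # ISFC: voxel v in the subject vs. voxel w in the group average (off-diagonal cells)
            isfc_loo[s, v, w] = np.corrcoef(subj[:, v], others[:, w])[0, 1]
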
We will use ISC and ISFC to identify brain regions that respond preferentially to narrative stories, rather than to a random assortment of words (replicating Simony et al., 2016). Furthermore, seed-based connectivity analysis does not show differences between resting state, random words, and intact narratives, but ISFC does distinguish between these conditions (Simony et al., 2016). Thus, ISFC shows greater sensitivity to the task than seed-based functional connectivity.
1. To run intersubject correlation (ISC).
2. To run intersubject functional correlation (ISFC).
3. To use ISFC to examine a network of brain regions that responds to narrative stimuli.
2.1 The "Pieman" data
2.2 Data file preparation
2.3 Compute ISC
2.4 ISC with statistical tests
The following sequence of steps is recommended for successfully running ISC and ISFC using BrainIAK; a minimal sketch of the core calls appears after this list.
Data Preparation: Organize a data directory with the fMRI subject data that you want to process. All subjects must be in the same anatomical space for analysis. You also need to create a whole-brain mask. The outcome of this step is an array of anatomically and temporally aligned brain data.
Compute ISC: The ISC function computes correlations across subjects for corresponding voxels in the mask. It uses the compute_correlation function in BrainIAK, which is optimized for fast execution (and was used in FCMA).
Permutation Test for ISC: Perform statistical analysis to determine significant correlation values for ISC.
Compute ISFC: The ISFC function computes the correlation of every voxel in one subject with every voxel in the average of the other subjects.
Cluster the ISFC results: Create clusters based on the correlation values.
Perform ISFC permutation: Perform permutation tests to determine the significance of the results.
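As a preview, here is a minimal sketch of the core BrainIAK calls for these steps. The random `data` array below is only a stand-in for the (TRs, voxels, subjects) array produced by the data-preparation step; the real loading code, arguments, and group assignments are worked through in the rest of the notebook.

import numpy as np
from brainiak.isc import isc, isfc, permutation_isc

# Placeholder for the prepared data: (n_TRs, n_voxels, n_subjects)
data = np.random.randn(200, 50, 18)

# Step 2: leave-one-out ISC -> (subjects, voxels)
isc_maps = isc(data, pairwise=False)

# Step 3: permutation test on the ISC values (one-sample form shown here)
observed, p, distribution = permutation_isc(
    isc_maps, pairwise=False, summary_statistic='mean', n_permutations=1000)

# Step 4: ISFC -> (voxels, voxels) matrix summarized across subjects
isfc_maps = isfc(data, summary_statistic='median', vectorize_isfcs=False)
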
import warnings
import sys
if not sys.warnoptions:
    warnings.simplefilter("ignore")
import os
import glob
import time
from copy import deepcopy
import numpy as np
import pandas as pd
from nilearn import datasets
from nilearn import surface
from nilearn import plotting
from nilearn.input_data import NiftiMasker, NiftiLabelsMasker
import nibabel as nib
from brainiak import image, io
from brainiak.isc import isc, isfc, permutation_isc
import matplotlib.pyplot as plt
import seaborn as sns
%autosave 5
%matplotlib inline
sns.set(style = 'white', context='talk', font_scale=1, rc={"lines.linewidth": 2})
For this script we will use the "Pieman" dataset from Simony et al. (2016). A description of the dataset is as follows:
18 native English speakers were scanned (15 females, ages: 18–31), corresponding to the replication dataset from the Pieman study.
Stimuli for the experiment were generated from a 7 min real life story ("Pie Man", Jim O'Grady) recorded at a live storytelling performance ("The Moth" storytelling event, New York City). Subjects listened to the story from beginning to end (intact condition). In addition, subjects listened to scrambled versions of the story, which were generated by dividing the original stimulus into segments of different timescales (paragraphs and words) and then permuting the order of these segments. To generate the scrambled stimuli, the story was segmented manually by identifying the end points of each word and paragraph. Two adjacent short words were assigned to a single segment in cases where we could not separate them. Following segmentation, the intact story was scrambled at two timescales: short—‘words’ (W; 608 words, 0.7±0.5 s each) and long—‘paragraphs’ (P; 11 paragraphs, 38.1±17.6 s each). Laughter and applause were classified as single word events (4.4% of the words). Twelve seconds of neutral music and 3 s of silence preceded, and 15 s of silence followed, each playback in all conditions. These music and silence periods were discarded from all analyses.
More details about the experiment may be accessed in the methods section of the paper.
Loading and preparing the data:
BrainIAK has methods to efficiently load data. We have used some of these functions in previous notebooks; a short sketch of how they chain together follows this list.
load_images: reads data from all subjects in a list that you provide. This is like the function load_images_from_dir but here we specify the names manually.
load_boolean_mask: Create a binary mask from a brain volume
mask_images: Loads the brain images and masks them with the mask provided
image.MaskedMultiSubjectData.from_masked_images: Stacks the masked images into a single array with one slice per subject (dimensions: TRs x voxels x subjects). This data format is accepted by the BrainIAK ISC and ISFC functions.
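A minimal sketch of how these calls chain together, assuming `fnames` is a list of functional NIfTI paths for one condition and `mask_name` is the path to a brain mask (both are defined in the cells below):

from brainiak import image, io

# Hypothetical inputs: fnames (list of NIfTI paths) and mask_name (mask path)
brain_mask = io.load_boolean_mask(mask_name)                  # boolean mask volume
images = io.load_images(fnames)                               # iterator over subject images
masked = image.mask_images(images, brain_mask)                # iterator over (TR x voxel) arrays
bold = image.MaskedMultiSubjectData.from_masked_images(masked, len(fnames))
print(bold.shape)   # (n_TRs, n_voxels, n_subjects)
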
# Set up experiment metadata
from utils import pieman2_dir, results_path
print('Data directory is: %s' % pieman2_dir)
dir_mask = os.path.join(pieman2_dir, 'masks/')
mask_name = os.path.join(dir_mask, 'avg152T1_gray_3mm.nii.gz')
all_task_names = ['word', 'intact1']
all_task_des = ['word level scramble', 'intact story']
n_subjs_total = 18
group_assignment_dict = {task_name: i for i, task_name in enumerate(all_task_names)}
# Where do you want to store the data
dir_out = results_path + 'isc/'
if not os.path.exists(dir_out):
    os.makedirs(dir_out)
    print('Dir %s created ' % dir_out)
We provide helper functions to load the data.
# Reduce the number of subjects per condition to make this notebook faster
upper_limit_n_subjs = 18
def get_file_names(data_dir_, task_name_, verbose=False):
    """
    Get all the participant file names

    Parameters
    ----------
    data_dir_ [str]: the data root dir
    task_name_ [str]: the name of the task

    Return
    ----------
    fnames_ [list]: file names for all subjs
    """
    c_ = 0
    fnames_ = []
    # Collect all file names
    for subj in range(1, n_subjs_total + 1):
        fname = os.path.join(
            data_dir_, 'sub-%.3d/func/sub-%.3d-task-%s.nii.gz' % (subj, subj, task_name_))
        # If the file exists
        if os.path.exists(fname):
            # Add to the list of file names
            fnames_.append(fname)
            if verbose:
                print(fname)
            c_ += 1
        # Stop once the per-condition subject limit is reached
        if c_ >= upper_limit_n_subjs:
            break
    return fnames_
"""load brain template"""
# Load the brain mask
brain_mask = io.load_boolean_mask(mask_name)
# Get the list of nonzero voxel coordinates
coords = np.where(brain_mask)
# Load the brain nii image
brain_nii = nib.load(mask_name)
"""load bold data"""
# load the functional data
fnames = {}
images = {}
masked_images = {}
bold = {}
group_assignment = []
n_subjs = {}
for task_name in all_task_names:
    fnames[task_name] = get_file_names(pieman2_dir, task_name)
    images[task_name] = io.load_images(fnames[task_name])
    masked_images[task_name] = image.mask_images(images[task_name], brain_mask)
    # Concatenate all of the masked images across participants
    bold[task_name] = image.MaskedMultiSubjectData.from_masked_images(
        masked_images[task_name], len(fnames[task_name])
    )
    # Convert nans into zeros
    bold[task_name][np.isnan(bold[task_name])] = 0
    # compute the group assignment label
    n_subjs_this_task = np.shape(bold[task_name])[-1]
    group_assignment += list(
        np.repeat(group_assignment_dict[task_name], n_subjs_this_task)
    )
    n_subjs[task_name] = np.shape(bold[task_name])[-1]
    print('Data loaded: {} \t shape: {}'.format(task_name, np.shape(bold[task_name])))
Exercise 1: Inspect the data and report on the following details.
- What are brain_nii and brain_mask?
- Visualize brain_nii and brain_mask by plotting the 30th slice along the Z dimension.
- What does coords refer to? Visualize coords with a 3D plot. For this, only plot every 10th point, otherwise the plot will be slow to load.
- Inspect bold. How many subjects do we have for each task condition? Do different subjects have the same number of TRs/voxels?
# Insert code below
ISC is the correlation of each voxel's time series for a participant with the corresponding (anatomically aligned) voxel time series in the average of the other participants' brains. BrainIAK has functions for computing ISC by feeding in the concatenated participant data.
This will take about 10 minutes to complete.
# run ISC, loop over conditions
isc_maps = {}
for task_name in all_task_names:
    isc_maps[task_name] = isc(bold[task_name], pairwise=False)
    print('Shape of %s condition:' % task_name, np.shape(isc_maps[task_name]))
The output of ISC is a participant-by-voxel matrix (showing the result of each individual correlated with the rest of the group). Below we will map the ISC values for one participant and condition back onto the brain to see where activity is correlated between participants.
# set params
subj_id = 0
task_name = 'intact1'
save_data = False
# Make the ISC output a volume
isc_vol = np.zeros(brain_nii.shape)
# Map the ISC data for the first participant into brain space
isc_vol[coords] = isc_maps[task_name][subj_id, :]
# make a nii image of the isc map
isc_nifti = nib.Nifti1Image(isc_vol, brain_nii.affine, brain_nii.header)
# Save the ISC data as a volume
if save_data:
    isc_map_path = os.path.join(dir_out, 'ISC_%s_sub%.2d.nii.gz' % (task_name, subj_id))
    nib.save(isc_nifti, isc_map_path)
# Plot the data as a statmap
threshold = .2
f, ax = plt.subplots(1,1, figsize = (12, 5))
plotting.plot_stat_map(
isc_nifti,
threshold=threshold,
axes=ax
)
ax.set_title('ISC map for subject {}, task = {}' .format(subj_id,task_name))
# Insert code here
This analysis was performed in volumetric space; however, nilearn makes it easy to visualize these data in surface space (assuming the alignment to the MNI template is good). Here's an example of a surface plot.
# set some plotting params
subj_id = 0
task_name = 'intact1'
threshold = .2
view = 'medial'
# get a surface
fsaverage = datasets.fetch_surf_fsaverage5()
# Make the ISC output a volume
isc_vol = np.zeros(brain_nii.shape)
# Map the ISC data for the first participant into brain space
isc_vol[coords] = isc_maps[task_name][subj_id, :]
# make a nii image of the isc map
isc_intact_1subj = nib.Nifti1Image(isc_vol, brain_nii.affine, brain_nii.header)
# make "texture"
texture = surface.vol_to_surf(isc_intact_1subj, fsaverage.pial_left)
# plot
title_text = ('Avg ISC map, {} for one participant'.format(task_name))
surf_map = plotting.plot_surf_stat_map(
fsaverage.infl_left, texture,
hemi='left', view=view,
title= title_text,
threshold=threshold, cmap='RdYlBu_r',
colorbar=True,
bg_map=fsaverage.sulc_left)
# Insert code here
Exercise 4: Compare the averaged ISC map for the two task conditions. What are some brain regions showing stronger correlation in the intact story condition (vs. the word-level scramble condition)? What does this tell us about the processing of language?
Hint: The paper this work is based on (Simony et al., 2016) will help.
A:
BrainIAK provides several nonparametric statistical tests for ISC analysis (Nastase et al., 2019). Nonparametric tests are preferred due to the inherent correlation structure across ISC values—each subject contributes to the ISC of other subjects, violating assumptions of independence required for standard parametric tests (e.g., t-test, ANOVA). We will use the permutation test below.
Permutation tests are used to compute a null distribution of values. We have used permutation tests in previous notebooks, and the steps outlined here are similar, with one small change: incorporating the group assignment of subjects when computing the ISC.
1. Prepare the data. Here we have two conditions (intact and word_scramble), so we compute ISC for both conditions and concatenate the results across all subjects.
2. We use leave-one-subject-out (pairwise=False) to compute ISC and then use these correlations (isc_maps_all_tasks) to compute statistics.
3. We are going to permute the condition label for each subject to simulate the randomization of conditions. To do this, we first need to assign subjects to the experimental conditions that they were actually in. We prepared such a list of assignments when we loaded the data and stored it in the variable group_assignment.
4. The next steps are executed internally in BrainIAK by the function permutation_isc:
   - For each permutation iteration:
     - BrainIAK permutes the group assignment for each subject.
     - A mean of the ISC values is computed for this shuffled group for each condition.
     - A difference of group means is computed between the conditions.
   - The difference values for all iterations are collected and form the null distribution.
5. The observed group difference is then compared against this null distribution; it is considered significant if it falls in the most extreme p percent of the distribution. permutation_isc returns the actual observed ISC values, p-values, and optionally the resampling distribution.
A minimal hand-rolled sketch of this permutation logic appears after this list.
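The sketch below writes out the same group-permutation logic by hand for a single voxel (shuffle condition labels, recompute the difference of condition means, collect a null distribution). It is illustrative only; permutation_isc handles this internally, along with the one-sample (sign-flipping) case.

import numpy as np

def manual_group_permutation(isc_values, group_assignment, n_permutations=1000, seed=0):
    """Illustrative null distribution for a two-group ISC difference at one voxel.

    isc_values: (n_subjects,) ISC values for a single voxel, both groups stacked
    group_assignment: (n_subjects,) array of 0/1 condition labels
    """
    rng = np.random.RandomState(seed)
    isc_values = np.asarray(isc_values)
    groups = np.asarray(group_assignment)
    # Observed difference of condition means
    observed = np.mean(isc_values[groups == 1]) - np.mean(isc_values[groups == 0])
    null = np.zeros(n_permutations)
    for i in range(n_permutations):
        shuffled = rng.permutation(groups)   # permute condition labels across subjects
        null[i] = np.mean(isc_values[shuffled == 1]) - np.mean(isc_values[shuffled == 0])
    # Two-tailed p: proportion of null differences at least as extreme as observed
    p = np.mean(np.abs(null) >= np.abs(observed))
    return observed, p, null
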
# Concatenate ISCs from both tasks
isc_maps_all_tasks = np.vstack([isc_maps[task_name] for task_name in all_task_names])
print('group_assignment: {}'.format(group_assignment))
print('isc_maps_all_tasks: {}' .format(np.shape(isc_maps_all_tasks)))
# permutation testing
n_permutations = 1000
summary_statistic='mean'
observed, p, distribution = permutation_isc(
isc_maps_all_tasks,
pairwise=False,
group_assignment=group_assignment,
summary_statistic=summary_statistic,
n_permutations=n_permutations
)
p = p.ravel()
observed = observed.ravel()
print('observed:{}'.format(np.shape(observed)))
print('p:{}'.format(np.shape(p)))
print('distribution: {}'.format(np.shape(distribution)))
Exercise 5: Interpret the results from the permutation test. What do observed, p, and distribution represent, and what do their shapes tell you?
A:
The goal of ISFC is to find coupling between brain regions across participants. For example, the angular gyrus in subject 1 could be correlated with the prefrontal cortex in subject 2 if they share some cognitive state. If the cognitive states of these two subjects were completely unrelated, the correlation should be zero. ISFC helps us identify such commonalities across subjects.
In this section, we will compare functional connectivity vs. ISFC on the Pieman data. Whereas FC is computed within individuals, ISFC is computed between individuals. Hence the only correlations that should be robust in ISFC are those that are present across individuals. At the end of the exercises, you will qualitatively replicate Simony et al. (2016), showing that ISFC is sensitive to the cognitive state of the participants.
ISFC in voxel space is very computationally intensive, so for this notebook we will divide the brain into a smaller number of parcels. We are going to use predefined ROI masks to select the voxels.
# load a parcel
atlas = datasets.fetch_atlas_harvard_oxford('cort-maxprob-thr25-2mm', symmetric_split=True)
plotting.plot_roi(atlas.maps, title='the harvard-oxford parcel')
n_regions = len(atlas.labels)-1 # rm background region
n_TRs = np.shape(bold[task_name])[0]
print('number of voxels:\t {}'.format(np.shape(bold[task_name])[1]))
print('number of parcels:\t {}'.format(n_regions))
Convert the bold data into ROI parcels
# Get a masker for the atlas
masker_ho = NiftiLabelsMasker(labels_img=atlas.maps)
# Transform the data to the parcel space
bold_ho = {
task_name:np.zeros((n_TRs, n_regions, n_subjs[task_name]))
for task_name in all_task_names}
# Collect all data
row_has_nan = np.zeros(shape=(n_regions,), dtype=bool)
for task_name in all_task_names:
    for subj_id in range(n_subjs[task_name]):
        # get the data for task t, subject s
        nii_t_s = nib.load(fnames[task_name][subj_id])
        bold_ho[task_name][:, :, subj_id] = masker_ho.fit_transform(nii_t_s)
        # figure out missing rois
        row_has_nan_ = np.any(np.isnan(bold_ho[task_name][:, :, subj_id]), axis=0)
        row_has_nan[row_has_nan_] = True

# Figure out which ROI has missing values
roi_select = np.logical_not(row_has_nan)
n_roi_select = np.sum(roi_select)
rois_filtered = np.array(atlas.labels[1:])[roi_select]
bold_ho_filtered = {
    task_name: np.zeros((n_TRs, n_roi_select, n_subjs[task_name]))
    for task_name in all_task_names
}

# Remove ROIs with missing values
for task_name in all_task_names:
    for subj_id in range(n_subjs[task_name]):
        bold_ho_filtered[task_name][:, :, subj_id] = bold_ho[task_name][:, roi_select, subj_id]
print('ROI selected\n {}'.format(rois_filtered))
print('ROI removed due to missing values :( \n {}'.format(np.array(atlas.labels[1:])[row_has_nan]))
Here we compute FC and ISFC on the parcellated data.
# Compute FC
fc_maps = {
task_name:np.zeros((n_roi_select,n_roi_select))
for task_name in all_task_names
}
for task_name in all_task_names:
    for subj_id in range(n_subjs[task_name]):
        fc_maps[task_name] += np.corrcoef(
            bold_ho_filtered[task_name][:, :, subj_id].T
        ) / n_subjs[task_name]
    np.fill_diagonal(fc_maps[task_name], np.nan)

# Compute ISFC
isfc_maps_ho = {}
for task_name in all_task_names:
    isfc_maps_ho[task_name] = isfc(data=bold_ho_filtered[task_name],
                                   summary_statistic='median',
                                   vectorize_isfcs=False)
# Insert code here
Exercise 7: Visualize FC/ISFC connectivity strength on glass brains.
- Use plot_connectome to visualize the 4 correlation matrices: (FC vs. ISFC) x (word-level scrambled vs. intact story).
- Use the same edge_threshold for all plots.
Hint: The plot_connectome function takes as input a correlation matrix (such as the FC one plotted above) and a set of XYZ coordinates corresponding to each column/row of the correlation matrix. To get the coordinates of these ROIs, we recommend using: plotting.find_parcellation_cut_coords(atlas.maps)[roi_select]
# Insert code here
Exercise 8: Do the FC maps look different across conditions? How about the ISFC maps? Why? Hint: consult Simony et al. (2016).
A:
We can apply the idea of intersubject analysis to spatial patterns of activity (akin to RSA). So far, ISC has been computed between anatomically aligned pairs of voxels across time points; this is commonly referred to as temporal ISC. However, we could instead correlate between aligned pairs of time points across voxels: how does the pattern of activity across voxels at one time point in one participant correlate with the average pattern of the other participants at that time point? By doing this for each time point, we can generate a time course of these correlations to observe the ebb and flow of coupling in brain activity across participants. This requires only a simple transposition of the voxel and time dimensions: if the data are in the format (TRs, voxels, subjects), we can use data.transpose(1, 0, 2), where the indices refer to the dimensions of the array, to obtain an array in the format (voxels, TRs, subjects).
One way to compare the intact and word_scramble conditions is to plot the correlation values, by TR, for each condition. Again, we can use the same ISC functions from above after transposing the data matrices.
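Here is a minimal sketch of that transposition on toy data. The array `bold_roi_task` is a hypothetical (TRs, voxels, subjects) array for one ROI and one condition; the notebook builds the real arrays in the cells below.

import numpy as np
from brainiak.isc import isc

# Hypothetical (TRs, voxels, subjects) array for one ROI / condition
bold_roi_task = np.random.randn(300, 50, 18)

# Swap TRs and voxels -> (voxels, TRs, subjects), so isc() correlates spatial
# patterns across subjects at each TR instead of time series at each voxel
spatial_isc = isc(np.transpose(bold_roi_task, [1, 0, 2]), pairwise=False)
print(spatial_isc.shape)   # (subjects, TRs): one correlation per time point per left-out subject
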
# Get a list of ROIs.
roi_mask_path = os.path.join(pieman2_dir,'masks','rois')
all_roi_fpaths = glob.glob(os.path.join(roi_mask_path, '*.nii.gz'))
# Collect all ROIs
all_roi_names = []
all_roi_nii = {}
all_roi_masker = {}
for roi_fpath in all_roi_fpaths:
    # Compute ROI name
    roi_fname = os.path.basename(roi_fpath)
    roi_name = roi_fname.split('.')[0]
    all_roi_names.append(roi_name)
    # Load roi nii file
    roi_nii = nib.load(roi_fpath)
    all_roi_nii[roi_name] = roi_nii
    # Make roi maskers
    all_roi_masker[roi_name] = NiftiMasker(mask_img=roi_nii)
print('Path to all roi masks: {}'.format(roi_mask_path))
print('Here are all ROIs:\n{}'.format(all_roi_names))
# Make a function to load data for one ROI
def load_roi_data(roi_name):
    # Pick a roi masker
    roi_masker = all_roi_masker[roi_name]
    # Preallocate
    bold_roi = {task_name: [] for task_name in all_task_names}
    # Gather data
    for task_name in all_task_names:
        for subj_id in range(n_subjs[task_name]):
            # Get the data for task t, subject s
            nii_t_s = nib.load(fnames[task_name][subj_id])
            bold_roi[task_name].append(roi_masker.fit_transform(nii_t_s))
        # Reformat the data to std form
        bold_roi[task_name] = np.transpose(np.array(bold_roi[task_name]), [1, 2, 0])
    return bold_roi
Compute spatial ISC on some ROIs.
roi_selected = ['dPCC', 'vPCUN', 'V1']
roi_selected_names = ['dorsal posterior cingulate cortex', 'ventral precuneus', 'primary visual cortex']
# compute sISC for all ROIs
iscs_roi_selected = []
for j, roi_name in enumerate(roi_selected):
    print(j, roi_name)
    # Load data
    bold_roi = load_roi_data(roi_name)
    # Compute isc
    iscs_roi = {}
    for task_name in all_task_names:
        iscs_roi[task_name] = isc(np.transpose(bold_roi[task_name], [1, 0, 2]))
    iscs_roi_selected.append(iscs_roi)
# Plot the spatial ISC over time
col_pal = sns.color_palette(palette='colorblind', n_colors=len(all_task_names))
ci = 95
f, axes = plt.subplots(len(roi_selected), 1, figsize=(14, 5 * len(roi_selected)), sharex=True)
# For each ROI
for j, roi_name in enumerate(roi_selected):
    # For each task
    for i, task_name in enumerate(all_task_names):
        sns.tsplot(
            iscs_roi_selected[j][task_name],
            color=col_pal[i], ci=ci,
            ax=axes[j]
        )
f.legend(all_task_des)
sns.despine()

# Label the plot
for j, roi_name in enumerate(roi_selected):
    axes[j].axhline(0, color='black', linestyle='--', alpha=.3)
    axes[j].set_ylabel('Linear correlation')
    axes[j].set_title('Spatial inter-subject correlation, {}'.format(roi_selected_names[j]))
axes[-1].set_xlabel('TRs')
A:
Novel contribution: be creative and make one new discovery by adding an analysis, visualization, or optimization.
Here are some ideas:
E. Simony and U. Hasson for providing data
C. Baldassano and C. Chen provided initial code
M. Kumar, C. Ellis and N. Turk-Browne produced the initial notebook 4/4/18
S. Nastase enhanced the ISC brainiak module; added the section on statistical testing
Q. Lu added solutions; switched to S. Nastase's ISC module; replicated Lerner et al 2011 & Simony et al. 2016; added spatial ISC.
M. Kumar edits to section introductions and explanation on permutation test.
K.A. Norman provided suggestions on the overall content and made edits to this notebook.
C. Ellis incorporated edits from cmhn-s19