Inverted Encoding Model

Overview

We provide examples of how to use the inverted encoding model (IEM) module in BrainIAK to reconstruct features of stimuli presented to human subjects. First, a forward encoding model is estimated, mapping a set of stimulus features to the accompanying fMRI response in a population of voxels. Then, the model is inverted to allow the user to feed in new fMRI responses and predict the accompanying stimulus features.

The BrainIAK implementation lets users specify and fit an encoding model for stimulus features that are represented in either a 1-dimensional circular space or a 2-dimensional space. We include examples of each from real fMRI studies of working memory and attention.

While decoding methods such as support vector machines (SVMs) can also take fMRI responses and predict stimulus features, they rely on general-purpose algorithms to learn how features are represented in the data. When the encoding model describes the data better than a generic decoding algorithm, it is a more efficient way to estimate the response-to-stimulus mapping. Our last example simulates responses from 1D receptive fields and uses either an SVM or an IEM to predict the stimulus feature. The IEM achieves higher accuracy with less data.

Annotated Bibliography

  1. Brouwer, G.J., and Heeger, D.J. (2009). Decoding and reconstructing color from responses in human visual cortex. Journal of Neuroscience 29, 13992–14003. Uses an inverted encoding model to reconstruct color in a continuous space, demonstrating how color is represented across a hierarchy of visual regions.

  2. Naselaris, T., Kay, K.N., Nishimoto, S., and Gallant, J.L. (2011). Encoding and decoding in fMRI. NeuroImage 56, 400–410. A review article distinguishing between the different uses of encoding and decoding approaches for fMRI.

  3. Serences, J.T., and Saproo, S. (2012). Computational advances towards linking BOLD and behavior. Neuropsychologia 50, 435–446. Describes the differences between encoding and decoding approaches and emphasizes how these approaches can test linking hypotheses between fMRI and behavior.

  4. Sprague, T.C., Adam, K.C.S., Foster, J.J., Rahmati, M., Sutterer, D.W., and Vo, V.A. (2018). Inverted encoding models assay population-level stimulus representations, not single-unit neural tuning. eNeuro 5, 1–5. Argues that inverted encoding models are most useful when population-level stimulus representations are compared across experimental manipulations to test specific psychological theories.

  5. Sprague, T.C., Boynton, G.M., and Serences, J.T. (2019). The importance of considering model choices when interpreting results in computational neuroimaging. eNeuro 6, 1–11. Describes the encoding model approach in the broader scope of computational models and acknowledges some important limitations.

The data associated with these examples are originally derived from these OSF repositories, but have been sorted and cleaned for easier use.

Example 1: Reconstructing items from working memory

In this study, Rademaker et al. trained the IEM on an independent dataset where the participants viewed orientation gratings.

In the test data, the participants viewed a target orientation and held it in working memory for 12 seconds. During this delay period, a distractor grating appeared on a portion of the trials. The orientation of the distractor was randomized relative to the remembered orientation.

Using the fMRI data from the delay period, we will reconstruct both the orientation held in WM and the distractor orientation that was simultaneously being viewed. This sample data is from visual area V1.

The quality and interpretability of your stimulus reconstructions depend on how you set up the channels, or basis functions, in the model. To reconstruct stimuli accurately anywhere in the range where stimuli were presented, space the basis functions evenly across that range so that their sum is constant across the feature space. You will also likely want some overlap between neighboring basis functions.
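As a minimal sketch of this idea (plain NumPy, not the BrainIAK API), the snippet below builds evenly spaced channels over a 180-degree orientation space using a raised-cosine shape. The channel count, the exponent, and the names `make_basis`, `domain`, and `coverage` are assumptions chosen for illustration.

```python
import numpy as np

def make_basis(n_channels=8, exponent=6, n_points=180):
    """Return (domain, basis); basis has shape (n_channels, n_points)."""
    domain = np.arange(n_points) * (180.0 / n_points)       # orientations in degrees
    centers = np.arange(n_channels) * (180.0 / n_channels)  # evenly spaced centers
    # Offset of every orientation from every channel center, in radians.
    # With an even exponent, cos(offset)**exponent repeats every 180 degrees.
    offsets = np.deg2rad(domain[None, :] - centers[:, None])
    basis = np.cos(offsets) ** exponent
    return domain, basis

domain, basis = make_basis()
coverage = basis.sum(axis=0)   # summed channel response at each orientation
print("coverage range:", coverage.min(), coverage.max())
```

With these particular settings the summed response is flat across the whole space. With too few channels relative to the exponent (for example, 3 channels with an exponent of 6), the sum develops ripples, which means some orientations would be represented more strongly than others.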

Now that we have trained the IEM, we can test it on our two different conditions: holding an orientation in working memory, or viewing it as a distractor. Let's first look at the memory condition.

To collapse across trials with different orientations, we align all the reconstructions to a common center point. To do this, we circularly shift each trial's reconstruction based on the target orientation presented on that trial. We can then plot the average reconstruction across trials.
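The circular shift can be sketched with `np.roll`. This is an illustrative example on toy data; the function name `recenter`, the array layout (one reconstruction per row), and the choice of 90 as the common center are assumptions, not the notebook's own code.

```python
import numpy as np

def recenter(recons, target_idx, center_idx):
    """Circularly shift each trial so its target lands at center_idx.

    recons:     array (n_trials, n_points), one reconstruction per row
    target_idx: integer array (n_trials,), index of the presented orientation
    """
    shifted = np.empty_like(recons)
    for t, (row, idx) in enumerate(zip(recons, target_idx)):
        # np.roll moves the value at position idx to position center_idx.
        shifted[t] = np.roll(row, center_idx - idx)
    return shifted

# Toy data: each trial has a bump peaking at its own target orientation.
n_points = 180
domain = np.arange(n_points)
targets = np.array([10, 75, 140, 179])
dist = np.abs(domain[None, :] - targets[:, None])
dist = np.minimum(dist, n_points - dist)      # circular distance in degrees
recons = np.exp(-0.5 * (dist / 15.0) ** 2)

aligned = recenter(recons, targets, center_idx=90)
mean_recon = aligned.mean(axis=0)             # average reconstruction across trials
```

After alignment, every trial peaks at the same index, so the trial average is not blurred by differences in the presented orientation.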

We can see that there is a robust representation of the remembered orientation in the data.

Next, we look at the reconstruction of the distractor orientation during the working memory delay period. Recall that this is the orientation that is being visually presented to the participant.

We see that the distractor orientation is simultaneously represented in the same data!

Read more about these data in the full paper.

Rademaker, R., Chunharas, C., Serences, J.T. 2019. Coexisting representations of sensory and mnemonic information in human visual cortex. Nature Neuroscience 22:8.

Example 2: Reconstructing 2D spatial representations with attention and contrast changes

Itthipuripat et al. 2019 collected data from participants as they viewed flickering checkerboard stimuli presented at a range of contrasts (0-70%, logarithmically spaced) on either the left or right of the screen. They either attended to an occasional change in contrast at the stimulus position ("attend stimulus") or at the central fixation point ("attend fixation"). These contrast change trials are excluded from the data, so the sensory input is equated across conditions.

The IEM will be trained on an independent dataset, in which participants viewed checkerboards as they appeared at many locations across the screen. Then we will test the IEM, i.e. reconstruct the stimulus, under the different contrast and attention conditions.

The test data have different conditions than the training data. There are four independent variables in these data based on the values in the following columns:

The quality and interpretability of your stimulus reconstructions depend on how you set up the channels, or basis functions, in the model. To reconstruct stimuli accurately anywhere in the area where stimuli were presented, space the basis functions evenly across that region. You will also likely want some overlap between neighboring basis functions.

There are two pre-built functions for creating a 2D grid of basis functions: one lays them out on a rectangular grid and the other on a triangular grid. A triangular grid is more space-efficient, so let's use that.
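To illustrate what a triangular layout looks like, here is a small NumPy sketch that generates channel centers on a triangular lattice. It is not the BrainIAK function; the name `triangular_grid` and its parameters are hypothetical.

```python
import numpy as np

def triangular_grid(n_rows, n_cols, spacing):
    """Return an (n_rows * n_cols, 2) array of x, y centers on a triangular lattice."""
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    x = cols * spacing + (rows % 2) * spacing / 2.0   # shift odd rows by half a step
    y = rows * spacing * np.sqrt(3) / 2.0             # rows sit closer together than columns
    return np.column_stack([x.ravel(), y.ravel()])

centers = triangular_grid(n_rows=5, n_cols=6, spacing=1.0)
```

In this layout every center is exactly one `spacing` away from its nearest neighbors, and interior centers have six such neighbors, so the same area is covered evenly with fewer channels than a square grid with the same nearest-neighbor distance would need.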

Note that you must define these basis functions before you fit the model; otherwise, fitting will throw an error.

To visualize these, you will need to reshape the second dimension into the 2D pixel space where the stimuli are represented.

To check how well the basis functions cover the stimulus domain, we can sum across all the basis functions.
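The reshape and coverage check can be sketched as follows. This example uses plain NumPy with Gaussian channels on a small square grid; the pixel grid size, the channel centers, and `sigma` are all assumptions for illustration rather than values from the dataset.

```python
import numpy as np

# Pixel grid covering the stimulus domain (hypothetical size and units).
n_x, n_y = 40, 30
xs = np.linspace(-8, 8, n_x)
ys = np.linspace(-6, 6, n_y)
xx, yy = np.meshgrid(xs, ys)                           # each has shape (n_y, n_x)
pixels = np.column_stack([xx.ravel(), yy.ravel()])     # (n_pixels, 2)

# Channel centers on a simple square grid, with overlapping Gaussian profiles.
cx, cy = np.meshgrid(np.linspace(-6, 6, 5), np.linspace(-4, 4, 4))
centers = np.column_stack([cx.ravel(), cy.ravel()])    # (n_channels, 2)
sigma = 2.0

sq_dist = ((pixels[None, :, :] - centers[:, None, :]) ** 2).sum(axis=-1)
basis = np.exp(-sq_dist / (2 * sigma ** 2))            # (n_channels, n_pixels)

# Reshape the flattened pixel dimension back into 2D for plotting.
basis_images = basis.reshape(len(centers), n_y, n_x)
coverage = basis_images.sum(axis=0)                    # (n_y, n_x) coverage map
```

Plotting `coverage` as an image shows where the channels overlap well and where the summed response falls off, typically toward the edges of the grid.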

Next, we want to map channel responses for each voxel. To do this, we fit a standard general linear model (GLM), where the design matrix is the channel activations for each trial. Below, you can see the design matrix of these trial activations in the channel domain (x-axis: trials, y-axis: channels, color: activations).

Whenever you run the fit() function, the trial-wise channel activations will be created automatically, and the GLM will be fit on the training data and feature labels. Using this, we can then predict the feature responses on a set of test data.
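One common way to write this procedure is B = C W, where B holds the voxel responses (trials x voxels), C the channel activations (trials x channels), and W the voxel weights. The sketch below fits W by least squares and then inverts it with a pseudoinverse. It is a NumPy illustration on simulated data under those assumptions, not the internals of the BrainIAK fit() and predict() methods.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_voxels, n_train, n_test = 6, 50, 120, 20

# Hypothetical training set: channel activations (design matrix) and voxel data.
C_train = rng.random((n_train, n_channels))            # trials x channels
W_true = rng.normal(size=(n_channels, n_voxels))       # channels x voxels
B_train = C_train @ W_true + 0.01 * rng.normal(size=(n_train, n_voxels))

# Forward model: solve B_train ~= C_train @ W for the voxel weights W.
W_hat, *_ = np.linalg.lstsq(C_train, B_train, rcond=None)

# Inversion: map new voxel responses back into channel space.
C_test = rng.random((n_test, n_channels))
B_test = C_test @ W_true
C_hat = B_test @ np.linalg.pinv(W_hat)                 # trials x channels

print("max channel error:", np.abs(C_hat - C_test).max())
```

Multiplying the estimated channel responses by the basis functions then gives a reconstruction in the stimulus space. The inversion is only well posed when there are at least as many voxels as channels.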

Average feature reconstructions across trials

In this experiment, we are not specifically interested in separating trials by whether stimuli were on the left or the right. Instead, we're interested in how the activation in the model-based reconstruction varies with the experimental manipulation of contrast and attended location. For the sake of visualization and quantification, we can simply average across the trials of interest. Below, we separate the trials by contrast and attention location, but average across trials where the stimulus appeared on the left side of the screen and the target was not present (to ensure that overall contrast is identical across averaged trials).
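The averaging step can be sketched with boolean masks. The label arrays below (`contrast`, `attend_stim`, `stim_left`, `target_present`) are placeholder names filled with random values; they stand in for the condition columns of the real dataset and are not its actual column names.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_pixels = 960, 100

# Hypothetical trial labels (placeholders, not the dataset's columns).
contrast = rng.integers(0, 6, size=n_trials)         # six contrast levels, 0-5
attend_stim = rng.integers(0, 2, size=n_trials)      # 1 = attend stimulus, 0 = fixation
stim_left = rng.integers(0, 2, size=n_trials)        # 1 = stimulus on the left
target_present = rng.integers(0, 2, size=n_trials)   # 1 = contrast-change trial
recons = rng.normal(size=(n_trials, n_pixels))       # one flattened reconstruction per trial

# Keep left-side trials without a target, then average within each
# attention x contrast cell.
keep = (stim_left == 1) & (target_present == 0)
mean_recons = np.full((2, 6, n_pixels), np.nan)
for a in range(2):
    for c in range(6):
        mask = keep & (attend_stim == a) & (contrast == c)
        if mask.any():
            mean_recons[a, c] = recons[mask].mean(axis=0)
```

Each entry `mean_recons[a, c]` can be reshaped into the 2D pixel space and plotted as one panel in an attention-by-contrast grid.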

Finally, we plot the data as a function of 1) whether subjects were attending to the stimulus or fixation, and 2) the contrast of the stimulus (across six levels).

These data suggest that increasing the stimulus contrast leads to stronger activation at the stimulus location in the reconstruction. They also suggest that the effect of attention is greatest at low contrast levels -- e.g., at contrast level 3, we see a clear enhancement when the participant is attending to the stimulus compared to when they are attending to fixation.

However, since this is single-participant data, these effects should be quantified across a group of subjects.

Read the full results in the paper.

Itthipuripat, S., Sprague, T.C., Serences, J.T. 2019. Functional MRI and EEG Index Complementary Attentional Modulations. J. Neurosci. 39(31):6162-6179.

Example 3: Comparing SVM and IEM with simulated data and low trial numbers

If the assumptions of the forward encoding model are a good match to the data, it can outperform common decoding algorithms such as support vector machine (SVM) classifiers. Here, we simulate fMRI responses from a set of 1D receptive fields with Gaussian-like tuning along an orientation space.

Then, we train an IEM and SVM on the same data: simulated responses from these voxel receptive fields to 7 orientation stimuli (stim_vals). We then score the accuracy of each model by calculating the coefficient of determination, R2. Note that although the IEM can provide a continuous stimulus reconstruction, we are simply taking the feature value with the maximal response to be the predicted orientation. The SVM is a 7-way classifier.
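Reading a point prediction off a continuous reconstruction amounts to an argmax over the feature domain. A minimal sketch on toy reconstructions follows; the function name `predict_from_recon` and the array layout are assumptions for illustration.

```python
import numpy as np

def predict_from_recon(recons, domain):
    """For each trial, return the feature value where the reconstruction peaks.

    recons: array (n_trials, n_points); domain: array (n_points,) of feature values.
    """
    return domain[np.argmax(recons, axis=1)]

domain = np.arange(0, 180)                     # orientations in degrees
true_oris = np.array([0, 45, 120])
# Toy reconstructions: cosine bumps (180-degree period) centered on each orientation.
recons = np.cos(np.deg2rad(domain[None, :] - true_oris[:, None])) ** 2
predicted = predict_from_recon(recons, domain)
```

This discards the shape of the reconstruction and keeps only its peak, which puts the IEM on the same footing as a classifier that outputs a single label per trial.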

We need to define a function which can calculate R2 in a circular space. A version of this function is included with the 1D IEM module, but we need it here to score the SVM.
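One way such a function could look is sketched below. It is an illustrative stand-in, not the version shipped with the BrainIAK 1D IEM module: the name `circ_r2`, the `period` parameter, and the use of a circular mean for the total sum of squares are assumptions.

```python
import numpy as np

def circ_r2(y_true, y_pred, period=180.0):
    """Coefficient of determination for circular features (e.g. orientation).

    Values are in the same units as `period`; errors wrap around the circle.
    """
    scale = 2 * np.pi / period
    a = np.asarray(y_true, dtype=float) * scale
    b = np.asarray(y_pred, dtype=float) * scale

    def circ_dist(u, v):
        # Signed angular difference wrapped into (-pi, pi].
        return np.angle(np.exp(1j * (u - v)))

    circ_mean = np.angle(np.mean(np.exp(1j * a)))   # circular mean of the true values
    ss_res = np.sum(circ_dist(a, b) ** 2)           # residual sum of squares
    ss_tot = np.sum(circ_dist(a, circ_mean) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

y = np.array([10.0, 20.0, 30.0, 40.0])
print(circ_r2(y, y))                   # perfect prediction
print(circ_r2(y, np.full(4, 25.0)))    # always predicting the circular mean
```

Because errors are wrapped, a prediction of 179 degrees for a true orientation of 1 degree counts as a 2-degree error rather than a 178-degree one. Note that the circular mean is poorly defined when the true values are spread evenly around the whole circle.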

We simulate two sources of uniform noise: consistent noise in the receptive field responses (rf_noise) and trial-wise noise. This relies on some simple functions packaged with BrainIAK that let the user test how the IEM behaves with known inputs.
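A rough sketch of such a simulation is shown below. It is a stand-in written in plain NumPy, not the helper functions packaged with BrainIAK; the function name `simulate_voxels`, the raised-cosine tuning shape, and all default values are assumptions, and only the name `rf_noise` comes from the text above.

```python
import numpy as np

def simulate_voxels(stim_vals, n_voxels=30, rf_noise=0.1, trial_noise=0.2,
                    exponent=6, seed=0):
    """Simulate voxel responses to orientations (degrees, 180-degree period).

    Returns an array of shape (n_trials, n_voxels).
    """
    rng = np.random.default_rng(seed)
    stim_vals = np.asarray(stim_vals, dtype=float)
    centers = rng.uniform(0, 180, size=n_voxels)          # preferred orientations

    # Noiseless tuning of every voxel to every unique stimulus value.
    uniq, inverse = np.unique(stim_vals, return_inverse=True)
    offsets = np.deg2rad(uniq[:, None] - centers[None, :])
    tuning = np.cos(offsets) ** exponent

    # Noise source 1: a fixed perturbation of each receptive field,
    # shared by every repeat of the same stimulus.
    tuning = tuning + rng.uniform(-rf_noise, rf_noise, size=tuning.shape)

    # Noise source 2: independent noise on every trial.
    responses = tuning[inverse]
    return responses + rng.uniform(-trial_noise, trial_noise, size=responses.shape)

stim_vals = np.linspace(0, 180, 7, endpoint=False)        # 7 orientations (assumed values)
trial_stims = np.repeat(stim_vals, 4)                     # 4 trials per orientation
data = simulate_voxels(trial_stims)
```

Separating the two noise sources makes it possible to vary the number of trials and the receptive field noise independently.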

The data simulations are repeated 25 times, and the R2 values are averaged across these repeats to account for any random variability.

We can see that both models improve with more trials, but the IEM outperforms the SVM at every trial count tested.

Next, we'll hold the number of trials constant and vary how much noise is added to the receptive fields.

Again, we see that both models perform worse with higher noise, but the IEM consistently outperforms the SVM.

Summary

The IEM allows us to build a decoder that relies on a specific hypothesis about how a stimulus is encoded. In cases where this hypothesis is a better characterization of how the stimulus evokes fMRI responses, the IEM can outperform standard decoding approaches like the SVM. In addition, the specific form of the IEM allows for a natural way to visualize a decoded stimulus, i.e. to reconstruct the stimulus that must have been seen on any given trial.