Comparing RDMs

To compare RDMs, we need a measure of their similarity. Several measures are available, and the choice depends on which aspects of the data the model RDMs are meant to capture. The strictest approach would be to compute a distance between a model RDM and the reference RDMs directly. In virtually all cases, however, we cannot predict the exact magnitude of the distances, because the signal-to-noise ratio varies between subjects and measurement sessions. Allowing an overall scaling of the RDMs leads to the cosine similarity. If we additionally drop the assumption that a predicted dissimilarity of 0 corresponds to a measured dissimilarity of 0, we can use a correlation between RDMs. For both the cosine similarity and the correlation, whitened variants take the correlations between the different entries of the RDM into account. Finally, we can drop the assumption of a linear relationship between RDMs by using rank correlations such as Kendall’s tau or Spearman’s rho. For this lowest bar for a relationship, Kendall’s \(\tau_a\) or Spearman’s \(\rho_a\) with random tiebreaking are usually preferred over the standard Spearman’s \(\rho\) or Kendall’s \(\tau_b\) and \(\tau_c\), which all favor RDMs with tied ranks. As discussed below, there is a direct formula for the expected Spearman’s rho under random tiebreaking, which we now prefer for reasons of computational efficiency.

All comparison methods are implemented in rsatoolbox.rdm. Each can be accessed by passing a method argument to rsatoolbox.rdm.compare or by calling the specific function rsatoolbox.rdm.compare_[comparison]. The comparison functions each take two RDMs objects as input and return a matrix of all pairwise comparisons.

Cosine similarity

The most stringent similarity measure for RDMs is the cosine similarity. For two vectorized RDMs \(\mathbf{r}_1\) and \(\mathbf{r}_2\) it is defined as:

\[\frac{\mathbf{r}_1^T \mathbf{r}_2}{\sqrt{\mathbf{r}_1^T\mathbf{r}_1\,\mathbf{r}_2^T\mathbf{r}_2}}\]

This comparison measure can be accessed using method='cosine' or using rsatoolbox.rdm.compare_cosine.
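The formula above can be sketched directly in NumPy. This is a minimal illustration, not the toolbox implementation: the function name `cosine_similarity` and the toy vectors are assumptions made for this example. Like the comparison functions described earlier, it returns a matrix with one entry per pair of RDMs.

```python
import numpy as np


def cosine_similarity(rdms1, rdms2):
    """Cosine similarity between every pair of vectorized RDMs.

    rdms1: array of shape (n1, n_pairs), one vectorized RDM per row
    rdms2: array of shape (n2, n_pairs)
    returns: array of shape (n1, n2)
    """
    r1 = np.atleast_2d(np.asarray(rdms1, dtype=float))
    r2 = np.atleast_2d(np.asarray(rdms2, dtype=float))
    # Numerator: inner products r1^T r2 for all pairs of rows.
    inner = r1 @ r2.T
    # Denominator: product of the Euclidean norms of the two vectors.
    norms1 = np.sqrt(np.sum(r1 ** 2, axis=1))
    norms2 = np.sqrt(np.sum(r2 ** 2, axis=1))
    return inner / np.outer(norms1, norms2)


# Hypothetical toy data: 3 conditions give 3 pairwise dissimilarities.
model = np.array([[1.0, 2.0, 3.0]])
data = np.array([[2.0, 4.0, 6.0],   # the model scaled by 2
                 [3.0, 2.0, 1.0]])  # a different geometry
similarities = cosine_similarity(model, data)
print(similarities)
```

The scaled copy of the model reaches a similarity of 1, which reflects the invariance to overall scaling. Adding a constant to all dissimilarities, in contrast, lowers the cosine similarity, because a dissimilarity of 0 is treated as meaningful.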

Pearson correlation

When a dissimilarity of 0 cannot be interpreted as indistinguishable, the average dissimilarity can be removed by using the Pearson correlation as a similarity measure. It is defined as:

\[\frac{(\mathbf{r}_1- \bar{\mathbf{r}}_1)^T (\mathbf{r}_2- \bar{\mathbf{r}}_2)}{\sqrt{(\mathbf{r}_1- \bar{\mathbf{r}}_1)^T (\mathbf{r}_1- \bar{\mathbf{r}}_1)\,(\mathbf{r}_2 -\bar{\mathbf{r}}_2)^T (\mathbf{r}_2- \bar{\mathbf{r}}_2)}},\]

where the bar indicates the mean of the vector.

This comparison measure can be accessed using method='corr' or using rsatoolbox.rdm.compare_correlation.
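As a rough sketch of the definition above, the Pearson correlation is the cosine similarity of the mean-centered vectors. The function name `pearson_similarity` and the toy values are assumptions for illustration only and are not part of the toolbox.

```python
import numpy as np


def pearson_similarity(r1, r2):
    """Pearson correlation between two vectorized RDMs."""
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    # Remove the mean dissimilarity from each RDM vector.
    c1 = r1 - r1.mean()
    c2 = r2 - r2.mean()
    # Cosine similarity of the centered vectors.
    return float(c1 @ c2 / np.sqrt((c1 @ c1) * (c2 @ c2)))


# Hypothetical toy data: the second RDM is a scaled and shifted copy.
model = np.array([1.0, 2.0, 3.0, 5.0])
shifted = 2.0 * model + 10.0
corr_shifted = pearson_similarity(model, shifted)
corr_numpy = np.corrcoef(model, shifted)[0, 1]  # reference value from NumPy
print(corr_shifted, corr_numpy)
```

Because the mean is removed, both the scaling and the added offset leave the correlation unchanged, which is the property that distinguishes it from the cosine similarity.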

Whitened comparison measures

We recently derived a formula for the covariance of RDM entries, which arises because all dissimilarities of a single condition are based on the same measurements of that condition (see Diedrichsen_2021). Based on a simplified estimate of this covariance \(V\) we can then compute a whitened cosine similarity as:

\[\frac{\mathbf{r}_1^T V^{-1} \mathbf{r}_2}{\sqrt{\mathbf{r}_1^TV^{-1}\mathbf{r}_1\,\mathbf{r}_2^TV^{-1}\mathbf{r}_2}}\]

and a whitened correlation as:

\[\frac{(\mathbf{r}_1- \bar{\mathbf{r}}_1)^T V^{-1}(\mathbf{r}_2- \bar{\mathbf{r}}_2)}{\sqrt{(\mathbf{r}_1-\bar{\mathbf{r}}_1)^T V^{-1}(\mathbf{r}_1-\bar{\mathbf{r}}_1)(\mathbf{r}_2-\bar{\mathbf{r}}_2)^T V^{-1}(\mathbf{r}_2-\bar{\mathbf{r}}_2)}}\]

The whitened cosine similarity is exactly equivalent to a linear centered kernel alignment (CKA), and the whitened correlation is equivalent to the whitened cosine similarity after removing the mean. This equivalent formulation can be computed faster, as it avoids the inversion of \(V\). Our implementation therefore uses this equivalent formulation in the background.

These comparison measures can be accessed using method='corr_cov' and method='cosine_cov' or using rsatoolbox.rdm.compare_correlation_cov_weighted and rsatoolbox.rdm.compare_cosine_cov_weighted.
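The two whitened formulas can be written out literally as follows. This is only a sketch under stated assumptions: the function names are hypothetical, the matrix `V_toy` is an arbitrary positive definite placeholder rather than the covariance estimate the toolbox derives, and the code solves linear systems with \(V\) directly instead of using the faster equivalent formulation mentioned above.

```python
import numpy as np


def whitened_cosine(r1, r2, V):
    """Cosine similarity under the inner product defined by V^{-1}."""
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    # Solve V z = r rather than forming V^{-1} explicitly.
    v_inv_r1 = np.linalg.solve(V, r1)
    v_inv_r2 = np.linalg.solve(V, r2)
    return float(r1 @ v_inv_r2 / np.sqrt((r1 @ v_inv_r1) * (r2 @ v_inv_r2)))


def whitened_correlation(r1, r2, V):
    """Whitened cosine similarity of the mean-centered RDM vectors."""
    r1 = np.asarray(r1, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    return whitened_cosine(r1 - r1.mean(), r2 - r2.mean(), V)


# Hypothetical toy data for 3 conditions (3 RDM entries).
r1 = np.array([1.0, 2.0, 4.0])
r2 = np.array([3.0, 1.0, 2.0])
# Arbitrary symmetric, diagonally dominant (hence positive definite) matrix.
V_toy = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
wcos = whitened_cosine(r1, r2, V_toy)
wcorr = whitened_correlation(r1, r2, V_toy)
print(wcos, wcorr)
```

With \(V\) set to the identity matrix, both functions reduce to the unwhitened cosine similarity and Pearson correlation, which is a convenient sanity check.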

Kendall’s tau

Kendall’s \(\tau_a\) is implemented mainly so that results can be compared with earlier analyses. It is a rank correlation that does not favor RDMs with tied ranks. Consider Spearman’s \(\rho_a\) as a faster alternative.

This comparison measure can be accessed using method='tau-a' or using rsatoolbox.rdm.compare_kendall_tau.
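For reference, \(\tau_a\) is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs \(n(n-1)/2\), where tied pairs count as neither. A minimal sketch of this textbook definition follows; the function name `kendall_tau_a` and the toy vectors are assumptions for illustration, and the quadratic pairwise comparison is not how the toolbox computes it.

```python
import numpy as np


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / (n * (n - 1) / 2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Sign of every pairwise difference; 0 marks a tied pair.
    sign_x = np.sign(x[:, None] - x[None, :])
    sign_y = np.sign(y[:, None] - y[None, :])
    # Each unordered pair appears twice in the full matrix, hence the /2.
    concordant_minus_discordant = np.sum(sign_x * sign_y) / 2
    return float(concordant_minus_discordant / (n * (n - 1) / 2))


# Hypothetical toy data.
data = np.array([1.0, 2.0, 3.0, 4.0])
tau_identical = kendall_tau_a(data, data)
tau_swapped = kendall_tau_a(data, [1.0, 3.0, 2.0, 4.0])  # one discordant pair
tau_tied = kendall_tau_a(data, [1.0, 1.0, 2.0, 2.0])     # two tied pairs
print(tau_identical, tau_swapped, tau_tied)
```

Because tied pairs contribute nothing to the numerator but still count in the denominator, a prediction with ties cannot reach a value of 1 here, so ties bring no advantage.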

Spearman’s rho

Spearman’s rank correlation in its original form is higher for predictions with tied ranks, which introduces an unwanted bias into analyses. To remove this problem, earlier versions recommended the use of Kendall’s \(\tau_a\). The problem can also be solved by using the expected Spearman’s \(\rho\) under random tiebreaking as the evaluation criterion. This coefficient was called \(\rho_a\) by Kendall. For this expectation there is a direct formula based on the rank-transformed entries of the two RDMs, \(\mathbf{x}\) and \(\mathbf{y}\):

\[\begin{split}\rho_a(\mathbf{x},\mathbf{y}) &= \mathop{\mathbb{E}_{\substack{ \tilde{\mathbf{a}}=\tilde{\mathbf{x}}-\frac{1}{n}\sum_{i=1}^{n}{i},\ \tilde{\mathbf{x}} \sim \mathrm{Rae}(\mathbf{x})\\ \tilde{\mathbf{b}}=\tilde{\mathbf{y}}-\frac{1}{n}\sum_{i=1}^{n}{i},\ \tilde{\mathbf{y}} \sim \mathrm{Rae}(\mathbf{y})}} \biggl[ \frac{ \tilde{\mathbf{a}}^\top\tilde{\mathbf{b}}} {\|\tilde{\mathbf{a}}\|_2\|\tilde{\mathbf{b}}\|_2} \biggr]}\\ &= \frac{12}{n^3-n}\mathop{\mathbb{E}_{\tilde{\mathbf{a}}} [ \tilde{\mathbf{a}}]^\top} \mathop{\mathbb{E}_{\tilde{\mathbf{b}}} [ \tilde{\mathbf{b}}] }\\ &= \frac{12\mathbf{x}^\top\mathbf{y}}{n^3-n} - \frac{3(n+1)}{n-1}\end{split}\]

Here \(\tilde{\mathbf{x}} \sim \mathrm{Rae}(\mathbf{x})\) denotes a rank vector in which ties in \(\mathbf{x}\) are broken at random, and \(n\) is the number of RDM entries.

\(\rho_a\) has two important advantages: it is much faster to compute, and the best average RDM for a set of data RDMs is easily computed. Thus, we now generally recommend using \(\rho_a\).

This comparison measure can be accessed using method='rho-a' or using rsatoolbox.rdm.compare_rho_a.
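The closed-form expression for \(\rho_a\) can be checked on a small example by enumerating every possible tiebreaking and averaging the resulting Spearman correlations. The sketch below assumes that the rank transform assigns tied entries their average rank, which equals their expected rank under uniformly random tiebreaking. All function names are hypothetical and the brute-force enumeration is only feasible for very short vectors; it is not how the toolbox computes \(\rho_a\).

```python
import itertools

import numpy as np


def average_ranks(values):
    """Ranks starting at 1, with tied entries sharing their average rank."""
    values = np.asarray(values, dtype=float)
    n_smaller = np.sum(values[None, :] < values[:, None], axis=1)
    n_equal = np.sum(values[None, :] == values[:, None], axis=1)
    return n_smaller + (n_equal + 1) / 2


def rho_a(x, y):
    """Closed-form expected Spearman correlation under random tiebreaking."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = rx.size
    return float(12 * (rx @ ry) / (n ** 3 - n) - 3 * (n + 1) / (n - 1))


def tie_breakings(values):
    """All rank vectors 1..n that respect the strict order of `values`."""
    n = len(values)
    ordered = [(i, j) for i in range(n) for j in range(n)
               if values[i] < values[j]]
    return [np.array(p) for p in itertools.permutations(range(1, n + 1))
            if all(p[i] < p[j] for i, j in ordered)]


def expected_spearman(x, y):
    """Average Spearman correlation over all tiebreakings of x and y."""
    rhos = [np.corrcoef(rx, ry)[0, 1]
            for rx in tie_breakings(x) for ry in tie_breakings(y)]
    return float(np.mean(rhos))


# Hypothetical toy data: x contains one tie, y has none.
x = [0.1, 0.5, 0.5, 0.9]
y = [1.0, 2.0, 3.0, 4.0]
rho_formula = rho_a(x, y)
rho_enumerated = expected_spearman(x, y)
print(rho_formula, rho_enumerated)
```

Both routes give 0.9 for this example: the two ways of breaking the tie yield Spearman correlations of 1 and 0.8, and their mean matches the direct formula.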