Model Evaluation - Target Shuffling

Target Shuffling

Simulations can also be used to evaluate models. One of the most popular examples is target shuffling. Target shuffling has been around for a long time but was recently brought back to popularity by Dr. John Elder of Elder Research. In target shuffling, you randomly reorder the target variable's values among the observations in the sample while keeping the predictor variable values fixed. An example is shown below:

Target Shuffling Example

In the above picture, we are using two variables, age and gender, to predict the target variable: whether you are going to buy a product. We have also shuffled the values of the target variable (same number of observations in each category, just in a different order) to create the columns \(Y_1\), \(Y_2\), etc.
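The shuffle itself is a one-line operation. The sketch below is a minimal illustration in Python with NumPy; the six rows of `age`, `gender`, and `buy` are made-up toy values (not the data from the picture), and the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, only for reproducibility

# Hypothetical toy sample: the predictors are never reordered.
age = np.array([23, 35, 41, 52, 29, 64])
gender = np.array(["F", "M", "F", "M", "M", "F"])
buy = np.array([1, 0, 1, 1, 0, 0])  # the real target Y

# Each shuffled target is a random reordering of the same target values.
n_shuffles = 3
shuffled_targets = {
    f"Y_{k}": rng.permutation(buy) for k in range(1, n_shuffles + 1)
}

for name, values in shuffled_targets.items():
    print(name, values)
```

Every shuffled column contains exactly the same values as `buy` (three 1s and three 0s here); only their pairing with the rows of `age` and `gender` changes.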

The goal of target shuffling is to build a model for each of the target variables: the one real target and all of the shuffled ones. We then record the same model evaluation metric from every model to see how well the model built on the real target performs compared to the models built on random data. The shuffling should remove the pattern from the data, but some pattern may still appear in some of the shuffles purely by chance. Comparing the real metric to the distribution of shuffled metrics shows how much better than random our model did and estimates the probability that we arrived at our results just by chance.

The following sections outline two examples.

Fake Data

We are going to randomly generate 8 variables that follow a normal distribution with a mean of 0 and standard deviation of 8. However, only two of these variables will be related to our target variable:

\[ Y = 5 + 2X_2 - 3X_8 + \varepsilon \]

We are going to perform a target shuffle on our target variable \(Y\) and look at two things: first, the adjusted \(R^2\) value from our model compared to those of the target shuffled (random data) models; second, the number of significant variables in the target shuffled (random data) models. Remember, there should be no relationship once we shuffle our target, so the adjusted \(R^2\) should be close to zero in the target shuffled models and there should be few, if any, significant variables in them.

Let’s see this example in each of our software packages!
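As a rough sketch of the fake data example, the Python code below uses only NumPy. The sample size of 100, the noise standard deviation of 1, the 1,000 shuffles, and the seed are all assumptions that the text does not specify, and the helper `adjusted_r2` is a hypothetical name introduced here. The sketch covers only the adjusted \(R^2\) comparison, not the count of significant variables.

```python
import numpy as np

rng = np.random.default_rng(seed=2024)  # arbitrary seed

n, p = 100, 8  # n = 100 is an assumption; the text only fixes 8 predictors
X = rng.normal(loc=0.0, scale=8.0, size=(n, p))
eps = rng.normal(loc=0.0, scale=1.0, size=n)  # noise scale is an assumption
y = 5 + 2 * X[:, 1] - 3 * X[:, 7] + eps  # X_2 and X_8 are columns 1 and 7


def adjusted_r2(X, y):
    """Fit ordinary least squares with an intercept; return adjusted R^2."""
    n_obs, n_pred = X.shape
    design = np.column_stack([np.ones(n_obs), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = resid @ resid
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_pred - 1)


real_score = adjusted_r2(X, y)

# Refit the same model on shuffled targets; the predictors stay fixed.
n_shuffles = 1000
shuffled_scores = np.array(
    [adjusted_r2(X, rng.permutation(y)) for _ in range(n_shuffles)]
)

# Share of orderings (the real one included) that score at least as well.
p_value = (1 + np.sum(shuffled_scores >= real_score)) / (1 + n_shuffles)

print(f"real adjusted R^2:          {real_score:.3f}")
print(f"mean shuffled adjusted R^2: {shuffled_scores.mean():.3f}")
print(f"estimated chance of a result this good: {p_value:.4f}")
```

Counting the real ordering in both the numerator and the denominator keeps the estimated probability above zero even when no shuffle matches the real model.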

Student Grade Analogy

Target shuffling is essentially a simulation version of permutation analysis. This is easy to see with an example using student grades. Imagine there were 4 students in a class. One studied 1 hour for an exam, another 2 hours, another 3 hours, and the last one 4 hours. Their grades on the exam were 75, 87, 85, and 95, respectively. If you were to plot these grades on a scatter plot and run a linear regression, you would have the following:

This model has an \(R^2 = 0.83\). What if the professor randomly shuffled these grades among the four students? How many different ways can four students get the grades of 75, 87, 85, and 95? There are 24 possible ways, all of which are listed below:

Of these 24 orderings of the grades, there are 3 others that produce a linear regression with an \(R^2\) value that is greater than or equal to that of our actual data. Two of these are plotted below (the third is the reverse of our above plot):

Therefore, counting the actual data, there are 4 orderings of grades that produce a regression with an \(R^2\) that is greater than or equal to that of our actual data, a chance of \(4/24 = 1/6 \approx 16.67\%\). This problem didn’t need to be target shuffled because it was easy to calculate all possible orderings of the target variable, \(4! = 24\). However, what if we had 40 students instead of 4? The number of permutations of the target variable increases tremendously (\(40! \approx 8.16 \times 10^{47}\)). This is the value of target shuffling: it is essentially a simulated sampling of permutation analysis.
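The count for the four students can be checked by brute force. The following Python sketch enumerates every ordering of the grades and compares each \(R^2\) to the actual one; the helper name `r_squared` is hypothetical, and it relies on the fact that in a simple linear regression \(R^2\) equals the squared correlation.

```python
from itertools import permutations
from math import factorial

import numpy as np

hours = np.array([1, 2, 3, 4])
grades = (75, 87, 85, 95)


def r_squared(x, y):
    # In simple linear regression, R^2 is the squared Pearson correlation.
    return np.corrcoef(x, np.asarray(y))[0, 1] ** 2


actual = r_squared(hours, grades)
all_orders = list(permutations(grades))

# A small tolerance guards against floating-point ties with the actual value.
at_least_as_good = [
    g for g in all_orders if r_squared(hours, g) >= actual - 1e-12
]

print(round(actual, 2))                          # 0.83
print(len(at_least_as_good), len(all_orders))    # 4 24
print(f"{factorial(40):.2e}")                    # 8.16e+47
```

With 40 students the same loop would have to visit about \(8.16 \times 10^{47}\) orderings, which is why target shuffling samples random orderings instead of listing them all.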