Apr 12, 01:40 PM
There are a lot of game companies out there that collect important data on their games, but then aggregate that data straight out of the gate into a min/max/avg trio. For instance, something like this:
Average Damage Per Second (DPS) by Class
A lot of people are happy with this kind of data! This makes me sad, and I’ll show you why.
Histograms Are Your Friend
Let’s take a look at the data for the Fighter. All that we know from the table is that nobody has fewer than 101 DPS, nobody has more than 220 DPS, and the average DPS is 134. However, we have no idea how this data is distributed! If you recorded the DPS achieved by each individual player to capture the entire distribution, you could make histograms of the data by charting, for each DPS value along the X axis, the number of players who had that DPS value on the Y axis.
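As a rough illustration of that charting step, here is a minimal Python sketch that counts players per DPS value and prints a text histogram. The function names, the optional `bin_width` parameter, and the sample numbers are all made up for this example; they are not real game data.

```python
from collections import Counter

def dps_histogram(dps_values, bin_width=1):
    """Count players per DPS bin; each key is the lower edge of a bin.

    With bin_width=1 this is exactly "number of players per DPS value".
    """
    counts = Counter((int(v) // bin_width) * bin_width for v in dps_values)
    return dict(sorted(counts.items()))

def print_histogram(hist):
    """Print one row per bin: DPS on the left, one '#' per player."""
    for low, n in hist.items():
        print(f"{low:>4} | {'#' * n}")

# Hypothetical per-player DPS recordings, for illustration only.
sample_dps = [101, 118, 127, 129, 131, 133, 134, 134, 136, 138, 141, 152, 220]
print_histogram(dps_histogram(sample_dps, bin_width=10))
```

Grouping values into bins of 10 is just a readability choice here; with a lot of players you could chart every DPS value individually.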
For example, the histogram of recorded DPS values for the Fighter class could look like this:
This is the graph that most people think of in their heads when they see the min/max/avg figures. It conforms to a basic normal distribution where most people are near the average and the min and max are outliers.
However, the distribution could just as easily look like this:
This graph still conforms to the min/max/avg, but it tells a very different story, where almost everybody is clustered around the average and anyone off by more than a few percentage points is an exceptional case. In this case, you would look at the graph and say, “Oh, 99% of people who play Fighters have a DPS of about 134. Everybody else should be considered an outlier and we should figure out how they managed to get that kind of performance.”
Scarier still would be this kind of distribution:
Aaaarrgh! This is a scary, scary distribution which completely invalidates the average figure. While the average is still technically 134, there are actually two separate distributions, one clustered close to the min and one clustered close to the max. In this case, you might say, “Hmmm. Maybe the Fighters with a DPS of close to 200 all managed to get a special item or power that has a bug which gives them +70 to their DPS.”
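To make the point concrete, here is a small Python sketch with two invented data sets of 100 Fighters each. Both are constructed by hand so that they produce the same 101/220/134 figures as the Fighter example, yet one is a tight spike around the average and the other is two separate clusters. The numbers and the helper names (`min_max_avg`, `share_near_avg`) are assumptions for illustration, not measurements.

```python
from statistics import mean

def min_max_avg(values):
    """The usual aggregated trio."""
    return min(values), max(values), mean(values)

def share_near_avg(values, window=10):
    """Fraction of players whose DPS is within `window` of the average."""
    avg = mean(values)
    return sum(1 for v in values if abs(v - avg) <= window) / len(values)

# Invented data: nearly everyone sits right at the average.
spiky = [101, 220] + [133] * 53 + [134] * 45

# Invented data: one cluster near the min, one near the max.
bimodal = [101, 220] + [104] * 11 + [105] * 59 + [205] * 28

for name, data in (("spiky", spiky), ("bimodal", bimodal)):
    print(name, min_max_avg(data), share_near_avg(data))
```

Both data sets report a min of 101, a max of 220, and an average of 134, but 98% of the spiky players are within 10 DPS of the average while none of the bimodal players are.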
Now, if you absolutely can’t get individual data points for your measurements, I would at least encourage you to record min/max/avg/mode/median/#points. Basically, what the mode and the median will do is provide a sanity check: if the average is roughly equal to the mode and the median, then you can say with some confidence that you have a roughly symmetric, single-peaked distribution (something like a normal distribution), and you can sleep easier at night. If they disagree, as they would for the two-cluster case above, the average is not telling you much. And recording the number of sample points is always important. A min/max/avg table is useless if you only have a few data points!
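If you want to see what recording those six figures might look like, here is a minimal sketch using Python's standard `statistics` module. The `summary_stats` and `looks_consistent` helpers, and the 5% tolerance, are hypothetical choices for this example; pick whatever threshold makes sense for your game.

```python
from statistics import mean, median, multimode

def summary_stats(dps_values):
    """min/max/avg plus the median, mode(s), and number of sample points."""
    return {
        "count": len(dps_values),
        "min": min(dps_values),
        "max": max(dps_values),
        "avg": mean(dps_values),
        "median": median(dps_values),
        # multimode returns every most-common value, so ties are not hidden.
        "modes": multimode(dps_values),
    }

def looks_consistent(summary, tolerance=0.05):
    """Sanity check: are the median and mode(s) within `tolerance` of the average?"""
    avg = summary["avg"]
    centers = [summary["median"], *summary["modes"]]
    return all(abs(c - avg) <= tolerance * abs(avg) for c in centers)

# Made-up samples, for illustration only.
clustered = [130, 134, 134, 134, 138]
two_groups = [105] * 7 + [205] * 3

print(summary_stats(clustered), looks_consistent(summary_stats(clustered)))
print(summary_stats(two_groups), looks_consistent(summary_stats(two_groups)))
```

A passing check is only a hint that the average is meaningful, and a small `count` should make you distrust the whole row no matter what the other numbers say.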