
More than just average

By special reader request, here is a quick discussion of averages, since I usually whiz past them pretty quickly. If your school-teachers were on the ball, you’ve probably heard all of this before, but a short refresher is always nice for those of us who don’t work with them routinely – or if you’re trying to tell how badly a political candidate might be lying to you to score your vote.

There are actually three things called an ‘average’. They are the mean, the median and the mode.

Warning: Contains traces of simple arithmetic

While they’re all technically averages, they differ in how they’re calculated and what they’re useful for. Usually, when anyone says the word ‘average’, they’re talking about the ‘mean’.

Let’s invent an example. Some kind of hat. Within a city, the hats are on sale for various prices. We’ll make a list of those prices:

($10, $11, $11, $11, $12, $12, $20)

Okay, so let’s look at the mean first. We get that by simply adding the values together and dividing by the number of values. That’s 87/7, which comes to about 12.43.

So, most folks using this method would say that the average price of this hat is $12.43.
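If you'd rather let a computer do the adding up, here is a minimal sketch of the same calculation in Python. The variable names are just ones I picked for illustration; the prices are the hat list from above.

```python
# The hat prices from the example, in dollars.
prices = [10, 11, 11, 11, 12, 12, 20]

# Mean: add the values together, then divide by how many there are.
total = sum(prices)                # 87
mean_price = total / len(prices)   # 87 / 7

print(round(mean_price, 2))        # 12.43
```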

But looking at the numbers, that doesn’t seem quite right, does it? The problem is that $20 on the end. This outlying value pushes the mean upwards. Likewise, if there was a single much lower value, it would skew the mean down.

Let’s look at the median instead. The median is the middle value of a set, where as many values are below it as are above it.

In this example, it would be $11, the fourth of the seven values: ($10, $11, $11, $11, $12, $12, $20)

In the case where you have an even number of values, you take the mean of the two middle values: for (1, 2, 3, 4), the median is (2 + 3) / 2 = 2.5.

So, the median is $11. That’s a pretty good number. It gives you a pretty good idea of how much you’d expect to pay for a hat if you walked into one of the stores at random, and whether a given price is perhaps a bit more or less expensive. Using a median instead of a mean places less emphasis on unusually large or unusually small samples in your data set.
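For the curious, Python's standard `statistics` module has a `median` function that handles both the odd and the even case. A quick sketch using the hat prices and the (1, 2, 3, 4) example, with variable names of my own choosing:

```python
import statistics

prices = [10, 11, 11, 11, 12, 12, 20]

# Odd number of values: the median is the single middle value.
hat_median = statistics.median(prices)

# Even number of values: the median is the mean of the two middle values.
even_median = statistics.median([1, 2, 3, 4])

print(hat_median)   # 11
print(even_median)  # 2.5
```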

The last one, and perhaps the least used, is the ‘mode’. The mode is simply the most commonly recurring value in the set of samples. In this case it is also 11. There are three of those, and only two 12s. So, the mode is $11, which tells us that $11 is the most common price-tag for this kind of hat.
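Counting how often each price turns up is easy to sketch in Python. This is only an illustration: `collections.Counter` does the tallying, and `statistics.mode` gives the same answer in one call.

```python
from collections import Counter
import statistics

prices = [10, 11, 11, 11, 12, 12, 20]

# Tally how many times each price appears, then take the most frequent one.
counts = Counter(prices)
most_common_price, occurrences = counts.most_common(1)[0]

print(most_common_price, occurrences)  # 11 3
print(counts[12])                      # 2
print(statistics.mode(prices))         # 11
```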

For both these kinds of usage, it’s clear that the median and the mode tell us more about the price of hats than the mean does.

So, why is the mean so commonly used?

Well, it’s easier to calculate. Before we had digital computers and programmable calculators, figuring out the median was a real nuisance. First you had to sort all of your values into order, then find the mid-point of the set (if it was an odd number of values) or take the mean of the middle two values (if it was an even number of values).

Or, you could just add them all together, divide by the number of values and get the mean, and that got you in roughly the right area, sort of.
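The by-hand median procedure described above (sort, then pick the middle, or average the middle two) can be sketched like this. `median_by_hand` is a name I made up for illustration, not a library function, and the first input list is just the hat prices in scrambled order.

```python
def median_by_hand(values):
    """Median the old-fashioned way: sort first, then find the middle."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the middle value is the median.
        return ordered[middle]
    # Even count: take the mean of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


# The hat prices again, deliberately out of order.
shuffled_prices = [12, 20, 11, 10, 11, 12, 11]

print(median_by_hand(shuffled_prices))  # 11
print(median_by_hand([4, 1, 3, 2]))     # 2.5
```

The sorting step is the nuisance part when you are working with pencil and paper; the mean needs no sorting at all.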

My maths teachers in school all those decades ago used to call the mean “the lazy person’s average” and say that “a median is never any worse than a mean, and almost always better.”

Having a mean actually isn’t such a dreadful thing, since comparing the behaviour of a mean and a median can tell you quite a bit, in some circumstances.

A real-world example

Since the 1980s, in the USA, mean wages have been on the rise. Actually, they’ve been on the rise for a lot longer than that, but since the 1980s something else interesting has been happening: median wages have been rising much more slowly.

What does that mean?

Well, think about it. Remember our hat example? The mean was about 12.43, but the median was 11, because of a large figure in the set. If that figure were larger, the median would stay the same, but the mean would rise.

So, in the USA, wages up near the top end (where they are already large) have been rising faster than wages near the middle and lower. This causes the median wage to move upwards very slowly, while the mean wages figure goes up much faster.

In other words, if you were focused on the mean US wage, you’d be thinking that wages were rising for most people far faster than is actually the case. Congratulations! You’ve just used math to get the wrong idea!
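You can watch the same effect happen with the hat prices. In this sketch the $50 and $100 top prices are numbers I invented purely for illustration (they are not wage data); everything else is the original hat list.

```python
import statistics

# The six ordinary hat prices; only the most expensive hat will change.
base_prices = [10, 11, 11, 11, 12, 12]

results = {}
for top_price in (20, 50, 100):   # 50 and 100 are made-up values
    prices = base_prices + [top_price]
    results[top_price] = (
        round(statistics.mean(prices), 2),
        statistics.median(prices),
    )

for top_price, (mean_price, median_price) in results.items():
    print(top_price, mean_price, median_price)
# 20 12.43 11
# 50 16.71 11
# 100 23.86 11
```

The mean climbs every time the top price does, while the median doesn't budge.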

Here’s another example: In one census survey, the mean number of children in a family was 2.36. The mode, however, was just 1.

So, the most common number of children for a family to have there was one. Other numbers (2, 3 or more) were less common.

Summary

The mean, the median and the mode are each different tools, and the type of data you’re applying them to makes a difference.

The mean, for example, doesn’t give you very good results for data sets like user-concurrency, prices or wages. Medians probably give you the best results there. Modes, on the other hand, work best when the number of possible values is relatively small, like the number of children, cars or televisions a family might have.

Now when someone mouths off about averages (probably means), means, medians and modes – you’re prepared!

Categories: Math, Science, Statistics.


