In making any measurement, it's nice to be able to describe the quality of that measurement. What we're interested in here is precision, the ability of a measurement technique to be reproduced with only small errors.
These graphs represent simulated measurements of a quantity (maybe a length or a mass) with a "true value" of 6. The green circles are instances of each measurement. In the top graph, four measurements produced a value of 3, three produced a value of 4, and so on. The average of all 25 measurements is 6.12, pretty close to our true value. The number after the ± sign is known as the standard deviation, and it represents the precision of those measurements – how spread out they are. More on this later, but it will turn out that about 68% of all of the measurements fall in the range between 6.12 - 2.09 and 6.12 + 2.09.
The 25 measurements in the lower graph are more tightly clustered about the true value. They are more precise. This is represented mathematically by the smaller standard deviation, which shows that about 68% of all measurements in this set lie between 6.08 - 0.81 and 6.08 + 0.81, a narrower range than for the other set.
On this page we'll learn how to calculate the standard deviation (and related variance) of sets of data like these.
The standard deviation is a measure of the width of the underlying distribution of measurements. The narrower the distribution, the more tightly clustered the measurements will be about the mean and the more precise those measurements are.
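Mathematically, the standard deviation, usually signified by the Greek letter sigma (σ), is a kind of average of the distances of the measurements from the mean: the square root of the average squared distance. With x̄ standing for the mean of the measurements, it looks like this:

$$\sigma = \sqrt{\frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})^2}$$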
You should study that equation a bit. It isn't as intimidating as it looks. If you don't know about summation notation, the Greek capital letter sigma ( Σ ) simply means to add up N things starting from the first one (we count them with the variable n, starting at n = 1) and ending at the Nth. There are N total measurements.
What we're summing here is the set of differences or distances between each measurement and the mean value of those measurements. Actually, it's the squares of those differences, so that every term is positive. That sum, divided by the number of measurements N, is called the variance, σ²:

$$\sigma^2 = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})^2$$

The standard deviation is the square root of the variance:

$$\sigma = \sqrt{\sigma^2}$$
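The definition above can be sketched in a few lines of Python. This is only an illustration: the function names `population_variance` and `population_std` and the eight data values are made up for the example, and the divisor is N, as in the definition just given.

```python
import math

def population_variance(measurements):
    """Average of the squared distances from the mean (divide by N)."""
    n_total = len(measurements)
    mean = sum(measurements) / n_total
    return sum((x - mean) ** 2 for x in measurements) / n_total

def population_std(measurements):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(population_variance(measurements))

# Hypothetical data, chosen so the arithmetic is easy to check by hand:
# the mean is 5 and the squared distances add up to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0
```

Checking by hand: 32 divided by N = 8 gives a variance of 4, and its square root gives σ = 2.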
In calculating σ, we're interested in the distance of each measurement from the mean value, x̄. Imagine a case where we had three measurements, one at the mean and one each at a distance x below and above it, that is, at x̄ - x and x̄ + x.
If we didn't square the differences, one would be -x and the other +x. Adding them to zero (the distance between the middle measurement and the mean) would give us σ = 0, which is wrong: the measurements clearly are spread out. Squaring the differences ensures that none of them is negative, so this kind of misleading cancellation can't take place. Taking the square root to get σ gets us back to the scale of the measurements in the end.
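Here is a small numerical check of that cancellation argument. The values are assumptions picked for illustration: a mean of 5 and a distance x of 2, so the three measurements sit at 3, 5 and 7.

```python
import math

# Hypothetical measurements: one at the mean, one each at mean - x and mean + x.
measurements = [3.0, 5.0, 7.0]
mean = sum(measurements) / len(measurements)

# Signed distances from the mean cancel exactly.
signed = [m - mean for m in measurements]
print(signed)       # [-2.0, 0.0, 2.0]
print(sum(signed))  # 0.0

# Squared distances cannot cancel, so the spread survives.
squared = [d ** 2 for d in signed]
variance = sum(squared) / len(measurements)
sigma = math.sqrt(variance)
print(squared)      # [4.0, 0.0, 4.0]
print(sigma)        # nonzero, about 1.63
```

The signed distances always sum to zero, whatever the data, because that is how the mean is defined. That is why they can't serve as a measure of spread.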
Let's say we have a set of ten measurements, x₁ through x₁₀, like this:
In order to calculate the standard deviation (and to have it mean something) we need the mean of those measurements, x̄. That's found by adding them up and dividing by the number of measurements, N = 10. In summation notation that looks like:

$$\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n$$
It's good practice to keep some extra digits while doing the calculation, and to pare them back to match the number of significant digits in the data once the calculation is complete. That way we avoid some round-off error. Now recall that the variance (the square of σ) is

$$\sigma^2 = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})^2$$
To help us do the calculation, it's convenient to make a table:
The sum Σ (xₙ - x̄)² is just the sum of the right-most column of the table. Then the variance, σ², is just that sum divided by the number of measurements:
The standard deviation, σ, is the square root of the variance:
And finally, we can report the average and standard deviation like this, rounding to get back to the same number of digits we had in the data: x̄ ± σ = 2.9 ± 0.9.
Graphically, the data (green circles), the mean and the standard deviation look like this.
The standard deviation tells us that, assuming the distribution would turn out to be normal or Gaussian if enough data were collected (see Central limit theorem), about 2/3 of the measurements would fall in the range 2.9 - 0.9 to 2.9 + 0.9.
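The table-based procedure of this example can be sketched in Python as follows. The ten values below are hypothetical, invented for the sketch (they are not the measurements of the example above), and `deviation_table` is a made-up helper name. The steps are the same: tabulate the differences and their squares, sum the right-most column, divide by N, take the square root, and round only at the end.

```python
import math

# Hypothetical measurements, made up for illustration.
measurements = [4.1, 5.3, 4.8, 5.9, 4.4, 5.0, 4.6, 5.7, 4.2, 5.2]

def deviation_table(values):
    """Return the mean and one row (n, x_n, x_n - mean, squared) per value."""
    mean = sum(values) / len(values)
    rows = [(n, x, x - mean, (x - mean) ** 2)
            for n, x in enumerate(values, start=1)]
    return mean, rows

mean, rows = deviation_table(measurements)

# Print the table, keeping extra digits during the calculation.
print(f"{'n':>2} {'x_n':>6} {'x_n - mean':>11} {'squared':>9}")
for n, x, diff, sq in rows:
    print(f"{n:>2} {x:>6.2f} {diff:>11.3f} {sq:>9.4f}")

sum_of_squares = sum(row[3] for row in rows)   # total of the right-most column
variance = sum_of_squares / len(measurements)  # divide by N
sigma = math.sqrt(variance)

print(f"mean = {mean:.4f}, variance = {variance:.4f}, sigma = {sigma:.4f}")
# Round only now, to match the one decimal place of the data.
print(f"result: {mean:.1f} +/- {sigma:.1f}")
```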
The standard deviation is the most commonly accepted way of describing the precision of measurements.
We must accept that when the number of measurements is small, both an average and its standard deviation have diminished meaning. If, for example, I want to measure the average height of American females, and I do that by measuring the heights of two women and averaging the results, I hope you wouldn't take that seriously as the average height of ALL women in America. The same is true of the standard deviation derived from those two measurements.
When the number of measurements is small OR when the sample does not represent an entire population, we customarily divide the sum of squares of xₙ - x̄ not by N, but by N-1.
The so-called sample variance, σ², is

$$\sigma^2 = \frac{1}{N-1} \sum_{n=1}^{N} (x_n - \bar{x})^2$$
... and the sample standard deviation is the square root of that variance.
The N-1 in the denominator increases the variance and standard deviation just a little. It accounts for the loss of a degree of freedom in the data. That is, the data were already used once to determine one unknown, the mean. Therefore, we only have N-1 new and independent measurements left to determine the standard deviation. The sample standard deviation accounts for that.
Now when N is large, N ≈ N-1, so the difference is negligible, and when N is small, N-1 is the better choice. Therefore, it's good practice to always use N-1. In our example 1 above, that would increase σ to 0.91, not a significant difference given the precision of our data, but in some situations it can be.
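To see how the two divisors compare, here is a short sketch using Python's standard `statistics` module, where `pstdev` divides by N and `stdev` divides by N-1. The data values and the helper name `divisor_ratio` are assumptions made for the illustration.

```python
import math
import statistics

def divisor_ratio(n_total):
    """Ratio of the N-1 standard deviation to the N standard deviation."""
    return math.sqrt(n_total / (n_total - 1))

# Hypothetical data, made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma_population = statistics.pstdev(data)  # divides by N
sigma_sample = statistics.stdev(data)       # divides by N-1
print(sigma_population, sigma_sample)

# The ratio depends only on N and shrinks toward 1 as N grows.
for n_total in (2, 10, 100, 1000):
    print(n_total, round(divisor_ratio(n_total), 4))
```

For N = 10 the ratio is the square root of 10/9, about 1.054, so switching to N-1 raises σ by roughly 5%. For N = 1000 the change is about 0.05%.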
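For a sample of a population, or if the number of measurements, N, is relatively small, the sample variance about the mean is

$$\sigma^2 = \frac{1}{N-1} \sum_{n=1}^{N} (x_n - \bar{x})^2$$

When a dataset represents an entire population, or if N is relatively large, the population variance is

$$\sigma^2 = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})^2$$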
The sample and population standard deviations are just the square roots of those variances. They describe the spread of the measurements about the mean, and they can be read as "about 68% of measurements lie within ±σ of the mean" if we are confident that, given enough measurements, the data would be normally distributed.
Let's say we gave a test and the twenty resulting scores formed the set:
In order to analyze these, it's best to construct a table something like this:
Now the mean of the scores is
We then sum the squares of the distances of each score from the mean:
The variance will be this number divided by N-1, where N = 20. We'll use the sample standard deviation (divide by N-1) for this small sample.
The standard deviation is the square root of the variance:
And finally we can report our mean with its associated standard deviation: x̄ ± σ = 83.8 ± 8.4.
One trick here is knowing just what makes a "small sample." The easy answer is to remember that for a large sample, the difference between dividing by N and dividing by N-1 is small, particularly after we take a square root. The take-home message: just use the N-1 definition of the standard deviation.
The meaning of the standard deviation is that, statistically speaking, about 2/3 of all scores would be expected to fall between 83.8 - 8.4 = 75.4 and 83.8 + 8.4 = 92.2. More about that below.
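The same analysis can be sketched in Python. The twenty scores below are hypothetical stand-ins (not the scores of this example), so the numbers printed will differ from 83.8 ± 8.4. The sketch uses the N-1 definition via `statistics.stdev` and then counts how many scores actually land within one standard deviation of the mean.

```python
import statistics

# Hypothetical test scores, made up for illustration.
scores = [72, 85, 91, 78, 88, 95, 69, 83, 90, 76,
          87, 92, 80, 74, 98, 81, 86, 79, 93, 84]

mean = statistics.mean(scores)
sigma = statistics.stdev(scores)  # sample standard deviation (divide by N-1)

# Range covered by one standard deviation on either side of the mean.
low, high = mean - sigma, mean + sigma
inside = [s for s in scores if low <= s <= high]
fraction = len(inside) / len(scores)

print(f"mean +/- sigma: {mean:.1f} +/- {sigma:.1f}")
print(f"range: {low:.1f} to {high:.1f}")
print(f"fraction of scores inside: {fraction:.2f}")
```

With only twenty scores, the fraction inside the range will not be exactly 2/3. The 68% figure describes the underlying normal distribution, which a small sample only approximates.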
When the Gaussian or normal probability distribution is divided into standard deviations, we find that the total probability enclosed within ±σ of the mean is about 68%, as shown in the graph. That is, we would expect about 68% of all measurements to lie within one standard deviation of the mean.
Likewise, two standard deviations enclose about 95% of the total probability, and three enclose about 99.7%. In other words, it's very unlikely to find a measurement more than three standard deviations above the mean or more than 3σ below it.
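These results come from calculus: each probability is the area under the Gaussian curve between the two limits, found by integration. That integral has no simple closed form, so it is evaluated numerically or looked up (it defines what is called the error function).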
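Those percentages can be reproduced numerically. For a normal distribution, the probability of landing within k standard deviations of the mean is erf(k/√2), where erf is the error function, available in Python as `math.erf`. The helper name `probability_within` is made up for this sketch.

```python
import math

def probability_within(k):
    """Probability that a normally distributed value lies within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {probability_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```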
For the most part, standard deviations aren't calculated by hand as they've been in these examples. Calculators often have a statistics mode that can be used to calculate means, standard deviations and other properties of data sets.
Likewise, any spreadsheet program should have a suite of built-in functions for calculating statistical properties of all kinds for any data set.
You should become familiar with using both.
xaktly.com by Dr. Jeff Cruzan is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. © 2012, Jeff Cruzan. All text and images on this website not specifically attributed to another source were created by me and I reserve all rights as to their use. Any opinions expressed on this website are entirely mine, and do not necessarily reflect the views of any of my employers. Please feel free to send any questions or comments to firstname.lastname@example.org.