In statistics, variance is a fundamental measure of how far the values in a data set spread out from their average. It is one of the most widely used measures of the dispersion of data observations.
It underlies the standard deviation and is used in hypothesis testing, risk management, and quality control. In this post, we explain variance along with its types and solved examples.
What is the variance in statistics?
Variance is a statistical measure of how widely a set of observations is spread around its average value. A small variance means the observations cluster close to the mean, while a large variance means they are scattered far from it.
It also plays a role in machine learning, where variance describes how much a model's predictions change when the model is trained on different portions of the training data. A high-variance model is very sensitive to the particular observations it was trained on.
In that sense, the variance is the variability in the model's predictions. Understanding variance is essential for making sound decisions and drawing meaningful conclusions from a set of observations and their mean.
Mathematically, the variance is the sum of the squared deviations (the differences between each observation and the average) divided by the number of observations, or by one less than that number when the data is a sample. Which of the two formulas applies depends on the nature of the data set.
Sample variance
In statistics, the sample variance measures the variability of a sample of data observations. A sample is a subset of the full set of observations, used to estimate results when the population is too large or too difficult to handle in full.
For example, if we want to evaluate the variance of the weights of the boys in a university, we can take the weights of some of the boys and use them to estimate the result. The sample variance is represented by the symbol “s²”.
The formula for measuring the sample variance is:
Formula | Components |
s² = Σ(xi – x̄)² / (n – 1) | s² = the sample variance; Σ = the summation symbol; xi = each data point in the sample; x̄ = the sample mean; n = the total number of data points in the sample |
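The sample variance formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `sample_variance` and the three input values are assumptions chosen only to keep the arithmetic easy to follow.

```python
def sample_variance(values):
    """Return the sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    if n < 2:
        # With a single data point, n - 1 is zero and the formula is undefined.
        raise ValueError("sample variance needs at least two data points")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


# Illustrative values only: mean = 4, squared deviations = 4, 0, 4
print(sample_variance([2, 4, 6]))  # 8 / (3 - 1) = 4.0
```

The guard for fewer than two data points is a design choice of this sketch. It reflects the fact that dividing by n – 1 is impossible when n = 1.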
Population variance
In statistics, the population variance measures the variability of an entire population of data observations. A population is the complete set of observations, so the result is an exact value for that population rather than an estimate.
For example, if we want the variance of the weights of all the boys in a university, we use the weight of every boy to calculate the mean and the variance. The population variance is represented by the symbol “σ²”.
The formula for measuring the population variance is:
Formula | Components |
σ² = Σ(xi – μ)² / N | σ² = the population variance; Σ = the summation symbol; xi = each data point in the population; μ = the population mean; N = the total number of data points in the population |
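A matching sketch for the population formula is shown here. The function name `population_variance` and the input values are again illustrative assumptions. The only change from the sample formula is the divisor, which is N instead of n – 1.

```python
def population_variance(values):
    """Return the sum of squared deviations from the mean, divided by N."""
    n = len(values)
    if n == 0:
        raise ValueError("population variance needs at least one data point")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Illustrative values only: mean = 4, squared deviations = 9, 1, 1, 9
print(population_variance([1, 3, 5, 7]))  # 20 / 4 = 5.0
```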
Note
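The two types of variance differ in the data they describe and in the formula used. The sample variance measures deviations from the sample mean x̄ and divides by n – 1, while the population variance measures deviations from the population mean μ and divides by N. Dividing by n – 1 corrects the tendency of a sample to underestimate the spread of the population.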
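To see the effect of the two divisors side by side, the sketch below runs both calculations on the same values using Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by n – 1. The data values are illustrative and do not come from the examples in this post.

```python
import statistics

data = [1, 3, 5, 7]  # illustrative values only
n = len(data)

pop_var = statistics.pvariance(data)   # sum of squared deviations / N
samp_var = statistics.variance(data)   # sum of squared deviations / (n - 1)

# Both results share the same sum of squared deviations, so the sample
# variance equals the population variance rescaled by n / (n - 1).
rescaled = pop_var * n / (n - 1)

print(pop_var, samp_var, rescaled)
```

For the same data, the sample variance is always the larger of the two, and the gap shrinks as n grows.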
How to find the sample and population variances?
Here are a few examples of measuring the sample and population variances with step-by-step calculations. Alternatively, a variance calculator can be used to get the sample and population variance results in a single click.
Example I: for population data
Use the given data observations, which make up the entire population of an experiment, to measure their spread.
1, 4, 9, 11, 14, 17, 21, 24, 25, 31, 34, 37
Solution
Step 1: Find the sum of the data observations and divide it by the total number of observations to get the population mean.
Given Observations | xi = 1, 4, 9, 11, 14, 17, 21, 24, 25, 31, 34, 37 |
Sum of Observations | Σ xi = 1 + 4 + 9 + 11 + 14 + 17 + 21 + 24 + 25 + 31 + 34 + 37 = 228 |
Average of population | μ = (Σ xi) ÷ N = 228 ÷ 12 = 19 |
Step 2: Now take each observation and subtract the mean “19” from it and calculate the square of each difference.
Given Observations | xi – μ | (xi – μ)² |
1 | 1 – 19 = -18 | (-18)² = 324 |
4 | 4 – 19 = -15 | (-15)² = 225 |
9 | 9 – 19 = -10 | (-10)² = 100 |
11 | 11 – 19 = -8 | (-8)² = 64 |
14 | 14 – 19 = -5 | (-5)² = 25 |
17 | 17 – 19 = -2 | (-2)² = 4 |
21 | 21 – 19 = 2 | (2)² = 4 |
24 | 24 – 19 = 5 | (5)² = 25 |
25 | 25 – 19 = 6 | (6)² = 36 |
31 | 31 – 19 = 12 | (12)² = 144 |
34 | 34 – 19 = 15 | (15)² = 225 |
37 | 37 – 19 = 18 | (18)² = 324 |
Step 3: Now take the third column results and add them to calculate the sum of squared deviations.
∑ (xi – μ)² = 324 + 225 + 100 + 64 + 25 + 4 + 4 + 25 + 36 + 144 + 225 + 324
∑ (xi – μ)² = 1500
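Steps 1 to 3 can be reproduced with a short script, which is a handy way to check the hand arithmetic. The variable names below are assumptions made for this sketch, and the data is the population from Example I.

```python
observations = [1, 4, 9, 11, 14, 17, 21, 24, 25, 31, 34, 37]

# Step 1: sum and mean of the population
total = sum(observations)
mean = total / len(observations)

# Step 2: deviation from the mean and its square, one row per observation
rows = [(x, x - mean, (x - mean) ** 2) for x in observations]
for x, deviation, squared in rows:
    print(f"{x:>2} | {deviation:>6.1f} | {squared:>6.1f}")

# Step 3: sum of the squared deviations
sum_of_squares = sum(squared for _, _, squared in rows)
print(total, mean, sum_of_squares)  # 228 19.0 1500.0
```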
Step 4: Now divide the sum of squared deviations by the total number of observations in the population.
∑ (xi – μ)² ÷ N = 1500 ÷ 12
σ² = ∑ (xi – μ)² ÷ N = 125
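As a cross-check, Python's standard library can compute the same quantity directly. The sketch below assumes the data is treated as a full population, which is why it calls `statistics.pvariance` rather than `statistics.variance`.

```python
import statistics

population = [1, 4, 9, 11, 14, 17, 21, 24, 25, 31, 34, 37]

# pvariance divides the sum of squared deviations by N (here 1500 / 12)
population_var = statistics.pvariance(population)
print(population_var)  # 125
```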
Example II: for sample data
Use the given set of observations, a sample taken from the entire set of experiments, to find the spread of the sample.
2, 4, 7, 9, 13, 15, 17, 19, 21, 23, 24
Solution
Step 1: Find the total sum of data observations and divide the result by the total number of observations to get the result of the average of the sample.
Sample Observations | xi = 2, 4, 7, 9, 13, 15, 17, 19, 21, 23, 24 |
Sum of observations | Σxi = 2 + 4 + 7 + 9 + 13 + 15 + 17 + 19 + 21 + 23 + 24 = 154 |
Average of sample | x̄ = (Σxi) ÷ n = 154 ÷ 11 = 14 |
Step 2: Now take each observation and subtract the mean “14” from it and calculate the square of each difference.
Sample Observations | xi – x̄ | (xi – x̄)² |
2 | 2 – 14 = -12 | (-12)² = 144 |
4 | 4 – 14 = -10 | (-10)² = 100 |
7 | 7 – 14 = -7 | (-7)² = 49 |
9 | 9 – 14 = -5 | (-5)² = 25 |
13 | 13 – 14 = -1 | (-1)² = 1 |
15 | 15 – 14 = 1 | (1)² = 1 |
17 | 17 – 14 = 3 | (3)² = 9 |
19 | 19 – 14 = 5 | (5)² = 25 |
21 | 21 – 14 = 7 | (7)² = 49 |
23 | 23 – 14 = 9 | (9)² = 81 |
24 | 24 – 14 = 10 | (10)² = 100 |
Step 3: Now take the third column results and add them to calculate the sum of squared deviations.
∑ (xi – x̄)² = 144 + 100 + 49 + 25 + 1 + 1 + 9 + 25 + 49 + 81 + 100
∑ (xi – x̄)² = 584
Step 4: Now divide the sum of squared deviations by the total number of observations in the sample minus 1.
∑ (xi – x̄)² ÷ (n – 1) = 584 ÷ (11 – 1)
∑ (xi – x̄)² ÷ (n – 1) = 584 ÷ 10
s² = ∑ (xi – x̄)² ÷ (n – 1) = 58.4
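The whole sample calculation can be verified in the same way. This sketch computes the result by hand and compares it with `statistics.variance` from Python's standard library, which divides by n – 1. The variable names are assumptions made for the illustration.

```python
import statistics

sample = [2, 4, 7, 9, 13, 15, 17, 19, 21, 23, 24]
n = len(sample)

# Manual route: mean, sum of squared deviations, then divide by (n - 1)
mean = sum(sample) / n
sum_of_squares = sum((x - mean) ** 2 for x in sample)
manual_variance = sum_of_squares / (n - 1)

# Library route: statistics.variance applies the same n - 1 divisor
library_variance = statistics.variance(sample)

print(mean, sum_of_squares, manual_variance, library_variance)
```

Note the parentheses around `n - 1`. Without them, the division would happen before the subtraction and give a different result.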
You can also use a sample variance calculator to avoid such lengthy calculations when finding the variance.
Conclusion
You should now understand the basics of variance. We have covered the definition, types, formulas, and solved calculations of sample and population variance to make the concept clear.