variance

NOVEMBER 14, 2023

What is Variance in Math? Definition

Variance is a statistical measure that quantifies the spread or dispersion of a set of data points. It provides information about how far each data point in a dataset is from the mean (average) and thus indicates the variability or diversity within the dataset. In simpler terms, variance measures how much the data points deviate from the average.

History of Variance

The concept of variance was introduced by the English statistician and geneticist Ronald A. Fisher in the early 20th century. Fisher developed the concept as part of his work on the analysis of variance (ANOVA) and the design of experiments. His contributions to the field of statistics revolutionized the way researchers analyze and interpret data.

What Grade Level is Variance For?

Variance is typically introduced in middle or high school mathematics courses, depending on the curriculum. It is commonly taught in algebra or statistics classes, where students learn about data analysis and probability.

Knowledge Points and Detailed Explanation

To understand variance, it is essential to grasp the following concepts:

  1. Mean: The average of a set of numbers.
  2. Deviation: The difference between each data point and the mean.
  3. Squaring: Multiplying a number by itself.
  4. Summation: Adding up a series of numbers.

The steps to calculate variance are as follows:

  1. Calculate the mean of the dataset.
  2. Subtract the mean from each data point to find the deviation.
  3. Square each deviation.
  4. Find the sum of all squared deviations.
  5. Divide the sum of squared deviations by the total number of data points minus one (for a sample) or the total number of data points (for a population).

Types of Variance

There are two main types of variance:

  1. Sample Variance: This is used when the dataset represents a sample from a larger population. The formula for sample variance involves dividing the sum of squared deviations by the total number of data points minus one.

  2. Population Variance: This is used when the dataset represents the entire population. The formula for population variance involves dividing the sum of squared deviations by the total number of data points.

Properties of Variance

Variance possesses several important properties:

  1. Variance is always non-negative.
  2. If all data points are identical, the variance is zero.
  3. Variance is sensitive to outliers, meaning that extreme values can significantly affect its value.
  4. Variance is not in the same unit as the original data points. It is measured in squared units.

How to Find or Calculate Variance?

To calculate variance, follow these steps:

  1. Find the mean of the dataset.
  2. Subtract the mean from each data point to obtain the deviations.
  3. Square each deviation.
  4. Sum up all the squared deviations.
  5. Divide the sum of squared deviations by the total number of data points minus one (for a sample) or the total number of data points (for a population).

Formula or Equation for Variance

The formula for sample variance is:

Sample Variance Formula

The formula for population variance is:

Population Variance Formula

Where:

  • Sample Variance Formula represents each data point in the dataset.
  • Sample Variance Formula is the mean of the dataset.
  • Sample Variance Formula is the population mean.
  • Sample Variance Formula is the total number of data points.

How to Apply the Variance Formula?

To apply the variance formula, substitute the values of the dataset into the formula and perform the necessary calculations. The resulting value will represent the variance of the dataset.

Symbol or Abbreviation for Variance

The symbol commonly used to represent variance is Variance Symbol for population variance and Variance Symbol for sample variance.

Methods for Variance

There are various methods to calculate variance, including:

  1. Direct Calculation: This involves manually calculating the deviations, squaring them, and finding the average.
  2. Excel or Spreadsheet Software: Many spreadsheet programs have built-in functions to calculate variance.
  3. Statistical Software: Statistical software packages like R, Python's NumPy, or MATLAB provide functions to calculate variance.

Solved Examples on Variance

Example 1: Calculate the sample variance for the following dataset: 5, 7, 9, 11, 13.

Solution:

  1. Find the mean: (5 + 7 + 9 + 11 + 13) / 5 = 9.
  2. Calculate the deviations: (5 - 9), (7 - 9), (9 - 9), (11 - 9), (13 - 9) = -4, -2, 0, 2, 4.
  3. Square each deviation: (-4)^2, (-2)^2, 0^2, 2^2, 4^2 = 16, 4, 0, 4, 16.
  4. Sum up the squared deviations: 16 + 4 + 0 + 4 + 16 = 40.
  5. Divide the sum by the total number of data points minus one: 40 / (5 - 1) = 10.

Therefore, the sample variance is 10.

Example 2: Calculate the population variance for the following dataset: 2, 4, 6, 8, 10.

Solution:

  1. Find the mean: (2 + 4 + 6 + 8 + 10) / 5 = 6.
  2. Calculate the deviations: (2 - 6), (4 - 6), (6 - 6), (8 - 6), (10 - 6) = -4, -2, 0, 2, 4.
  3. Square each deviation: (-4)^2, (-2)^2, 0^2, 2^2, 4^2 = 16, 4, 0, 4, 16.
  4. Sum up the squared deviations: 16 + 4 + 0 + 4 + 16 = 40.
  5. Divide the sum by the total number of data points: 40 / 5 = 8.

Therefore, the population variance is 8.

Practice Problems on Variance

  1. Calculate the sample variance for the dataset: 3, 5, 7, 9, 11.
  2. Calculate the population variance for the dataset: 1, 3, 5, 7, 9.
  3. Find the sample variance for the dataset: 10, 10, 10, 10, 10.

FAQ on Variance

Q: What is the purpose of calculating variance? A: Variance helps to understand the spread or dispersion of data points in a dataset. It is useful in comparing different datasets, identifying outliers, and making statistical inferences.

Q: Can variance be negative? A: No, variance is always non-negative. It can be zero if all data points are identical.

Q: How does variance relate to standard deviation? A: Standard deviation is the square root of variance. It provides a measure of dispersion in the original units of the dataset, while variance is measured in squared units.

Q: Is variance affected by outliers? A: Yes, variance is sensitive to outliers. Outliers, being extreme values, can significantly impact the variance value.

Q: Can variance be used for categorical data? A: No, variance is primarily used for numerical data. For categorical data, other measures like mode or chi-square tests are more appropriate.