Measures of Central Tendency and Variability

Central tendency is a statistical measure that identifies a single score as representative of an entire distribution of scores. The goal is to find the single score that is most typical of the distribution. No single measure produces a central, representative value in every situation, so there is no one standard procedure for determining central tendency. The three main measures of central tendency are the arithmetic mean, the median and the mode.

The mean of a set of scores (abbreviated M) is the most common and useful measure of central tendency. It is the sum of the scores divided by the total number of scores, and is commonly known as the arithmetic average. The mean can only be used for variables at the interval or ratio levels of measurement. For example, the mean of [2 6 2 10] is (2 + 6 + 2 + 10)/4 = 20/4 = 5. One can think of the mean as the balance point of a distribution (the center of gravity): the distances of the observations below the mean exactly balance the distances of those above it.
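The calculation above can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for the example, and it relies on `mean` from the standard-library `statistics` module. It also checks the balance-point idea by confirming that the signed deviations from the mean sum to zero.

```python
from statistics import mean

scores = [2, 6, 2, 10]                 # example scores from the text

m = mean(scores)                       # sum of the scores divided by their count
deviations = [x - m for x in scores]   # signed distance of each score from the mean

print(m)                 # the mean is 5 for these scores
print(sum(deviations))   # deviations below and above the mean cancel to zero
```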

Another measure of central tendency is the median, which is defined as the middle value when the numbers are arranged in increasing or decreasing order. The median is the score that divides the distribution of scores exactly in half; it is also the 50th percentile. The median can be used for variables at the ordinal, interval or ratio levels of measurement. If, for example, daily expenses are $50, $100, $150, $350 and $350, the middle value is $150, and therefore $150 is the median. When the number of values is odd, the median is the middle value. When the number of values is even, the median is the average of the two middle values. For example, if we had four values ($50, $100, $150 and $350), the median would be the average of the two middle values, $100 and $150; thus, $125 is the median in that case. The median may sometimes be a better indicator of central tendency than the mean, especially when there are extreme values.
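As a rough sketch, the two expense examples can be reproduced with `median` from Python's standard-library `statistics` module. The list names are illustrative only. The mean of the five expenses is printed alongside to show how the larger values pull it above the median.

```python
from statistics import mean, median

odd_count = [50, 100, 150, 350, 350]   # the five daily expenses from the text
even_count = [50, 100, 150, 350]       # the four daily expenses from the text

print(median(odd_count))    # odd count: the middle value, 150
print(median(even_count))   # even count: average of 100 and 150, i.e. 125

# The mean of the five expenses is pulled upward by the two $350 values.
print(mean(odd_count))      # 1000 / 5 = 200, above the median of 150
```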

Another indicator of central tendency is the mode, or the value that occurs most often in a set of numbers. In other words, the mode is the score or category of scores in a frequency distribution that has the greatest frequency. In the set of expenses mentioned above, the mode would be $350 because it appears twice and the other values appear only once. The mode can be used for variables at any level of measurement (nominal, ordinal, interval or ratio). Sometimes a distribution has more than one mode. Such a distribution is called multimodal. A distribution with two modes is called bimodal. Note that the modes do not have to have the same frequencies. The tallest peak is called the major mode; other peaks are called minor modes. Some distributions do not have modes. A rectangular distribution has no mode. Some distributions have many peaks and valleys.
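One possible way to find modes in Python is `multimode` from the standard-library `statistics` module (available from Python 3.8). In the sketch below, only the expense list comes from the text; the bimodal, rectangular and nominal lists are hypothetical data made up for illustration. Note that `multimode` reports only the values tied for the highest frequency, so it does not identify minor modes.

```python
from statistics import multimode

expenses = [50, 100, 150, 350, 350]        # daily expenses from the text
bimodal = [1, 1, 2, 3, 3]                  # hypothetical scores with two modes
rectangular = [1, 2, 3, 4]                 # hypothetical: every value occurs once
colors = ["red", "blue", "red", "green"]   # hypothetical nominal data

print(multimode(expenses))   # [350]
print(multimode(bimodal))    # [1, 3]
print(multimode(colors))     # ['red']

# When every distinct value is tied, multimode returns all of them;
# this sketch treats that case as "no mode".
tied = multimode(rectangular)
has_mode = len(tied) < len(set(rectangular))
print(has_mode)              # False
```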

Variability provides a quantitative measure of the degree to which scores in a distribution are spread out. The greater the difference between scores, the more spread out the distribution is. The more tightly the scores group together, the less variability there is in the distribution. Variability is the essence of statistics. The most frequently used measures of variability are the range, the interquartile range, the variance and the standard deviation.

The range is simply the difference between the highest score and the lowest score in a distribution (some textbooks add one to this difference for whole-number scores, so that both endpoints are counted). This statistic can be calculated for measurements that are on an interval scale or above. In a dataset with 10 numbers {99, 45, 23, 67, 45, 91, 82, 78, 62, 51}, the highest number is 99 and the lowest number is 23, so 99 − 23 = 76; the range is 76.

The interquartile range (IQR) is a range that contains the middle 50% of the scores in a distribution. It is computed as follows: IQR = 75th percentile − 25th percentile. A related measure of variability is the semi-interquartile range, which is defined simply as the interquartile range divided by 2.

Variance can be defined as a measure of how close the scores in the distribution are to the middle of the distribution. Using the mean as the measure of the middle of the distribution, the variance is defined as the average squared difference of the scores from the mean. When the scores are spread out or heterogeneous, the measure of variability should be large. When the scores are homogeneous, the variability should be smaller.

Another measure of variability is the standard deviation, which is simply the square root of the variance. The standard deviation is an especially useful measure of variability when the distribution is normal or approximately normal, because the proportion of the distribution within a given number of standard deviations from the mean can be calculated. The standard deviation can therefore be read, loosely, as the typical distance of scores from the mean. So the mean is the representative value, and the standard deviation is the representative distance of any one point in the distribution from the mean.
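The measures described above can be sketched in Python for the ten-number dataset from the text. This is a minimal illustration using the standard-library `statistics` module, and the variable names are assumptions made for the example. Quartile values depend on the interpolation method used (`quantiles` defaults to its "exclusive" method), so the IQR from another tool may differ slightly. The variance is computed by hand as the average squared difference from the mean and printed next to the library's population functions.

```python
from statistics import mean, pstdev, pvariance, quantiles

data = [99, 45, 23, 67, 45, 91, 82, 78, 62, 51]   # dataset from the text

data_range = max(data) - min(data)   # highest minus lowest: 99 - 23 = 76

# Quartile cut points; exact values depend on the interpolation method.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1                        # spread of the middle 50% of the scores
semi_iqr = iqr / 2                   # semi-interquartile range

m = mean(data)
# Variance as the average squared difference from the mean (population form).
variance = sum((x - m) ** 2 for x in data) / len(data)
std_dev = variance ** 0.5            # square root of the variance

print(data_range, iqr, semi_iqr)
print(variance, pvariance(data))     # hand-computed value next to the library's
print(std_dev, pstdev(data))
```

The `variance` and `stdev` functions in the same module divide by n − 1 rather than n, which gives the sample versions of these statistics; `pvariance` and `pstdev` match the "average squared difference" definition used here.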

While the measures of central tendency convey information about the commonalities of measured properties, the measures of variability quantify the degree to which they differ. Unless all values in a data set are the same, variability exists. For that reason, the measures of central tendency should be complemented by measures of variability.
