For a population of values, three customary "measures of central tendency" are the arithmetic mean, the median, and the mode.

The Pythagorean means are the arithmetic mean, the geometric mean, and the harmonic mean.

These measures have the property that if all values in a population are the same, the measure is the common value.

The min and max of a set—and all quantiles—have this same property, so the property is necessary, but perhaps not sufficient to guarantee that a statistic is a "measure of central tendency".

What if a population had to be characterized by a single number? The size of the population seems important but tells us nothing about what the values might be.


Given n values x_i, we can compute an average value in several different ways.

The arithmetic mean is well defined for all real or complex x_i:

\begin{align} \;\;\;\;\mathrm{AM}(x_1, ..., x_n) \;=\; \frac{\sum_{i=1}^n \; x_i}{n} \end{align}

For non-negative values, the arithmetic mean has this relationship with the max:

\begin{align} \frac{\max(x_1, ..., x_n)}{n} \leq \mathrm{AM}(x_1, ..., x_n) \leq \max(x_1, ..., x_n) \end{align}


The geometric mean is only well-defined for non-negative real values x_i:

\begin{align} \;\;\;\;\mathrm{GM}(x_1, ..., x_n) \;=\; \sqrt[n]{\prod_{i=1}^n \; x_i} \end{align}

If there are negative or complex values in the data, one can always compute the geometric mean of their absolute values.

If you have a rectangle R with sides of length w and h, then a square S with area equal to R will have a side length s which is the geometric mean of w and h.

The geometric mean is related to the arithmetic mean by these identities (the second requires positive x_i, so that the logarithms exist):

\begin{align} \exp\big(\mathrm{AM}(x_1, ..., x_n)\big) = \mathrm{GM}\big(\exp(x_1), ..., \exp(x_n)\big) \end{align}
\begin{align} \log\big(\mathrm{GM}(x_1, ..., x_n)\big) = \mathrm{AM}\big(\log(x_1), ..., \log(x_n)\big) \end{align}
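
The second identity suggests one way to compute the geometric mean without forming a long product, which can overflow or underflow. The following is a minimal sketch, assuming all inputs are positive; the function name `geometric_mean_via_logs` is made up for illustration.

```python
import math
import statistics

def geometric_mean_via_logs(values):
    """Geometric mean computed as exp(AM(log(x_i))); assumes all values > 0."""
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))

values = [1.0, 4.0, 16.0]
direct = math.prod(values) ** (1.0 / len(values))   # n-th root of the product
via_logs = geometric_mean_via_logs(values)
library = statistics.geometric_mean(values)          # standard library, Python 3.8+
```

All three results should agree with the exact value 4 up to floating-point rounding.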


The harmonic mean is well-defined for positive real values x_i:

\begin{align} \;\;\;\;\mathrm{HM}(x_1, ..., x_n) \;=\; \frac{n}{\sum_{i=1}^n \; \frac{1}{x_i}} \end{align}

If a car is doing laps on a track, then the overall average speed is the harmonic mean of the average speed of each of the laps.

However, if a car drives at speed A for an hour, and then drives at speed B for an hour, the overall average speed is the arithmetic mean.
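
Both cases follow from the same rule: average speed is total distance divided by total time. The sketch below checks this with made-up lap speeds; the helper `average_speed` and the numbers are illustrative assumptions.

```python
import statistics

def average_speed(segments):
    """Overall average speed from (distance, speed) pairs: total distance / total time."""
    total_distance = sum(d for d, _ in segments)
    total_time = sum(d / s for d, s in segments)
    return total_distance / total_time

# Equal distances: two laps of 5 km, one at 60 km/h and one at 120 km/h.
lap_speeds = [60.0, 120.0]
by_distance = average_speed([(5.0, s) for s in lap_speeds])

# Equal times: one hour at each speed, so distance = speed * 1 hour.
by_time = average_speed([(s * 1.0, s) for s in lap_speeds])
```

With equal distances the result matches the harmonic mean of the speeds (80 km/h here); with equal times it matches the arithmetic mean (90 km/h).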

This relationship holds between the harmonic mean and the min:

\begin{align} \min(x_1, ..., x_n) \leq \mathrm{HM}(x_1, ..., x_n) \leq n \min(x_1, ..., x_n) \end{align}


If all the values in a population X are the same, say x, then the arithmetic mean, the geometric mean, and the harmonic mean will all be equal to that value.

All three means are greater than or equal to the minimum value in the population.

All three means are less than or equal to the maximum value in the population.

In fact, for positive values the following chain of inequalities always holds:

\begin{align} \min X \leq \mathrm{HM} \leq \mathrm{GM} \leq \mathrm{AM} \leq \max X \end{align}

Moreover, the inequalities are strict if and only if the elements of X are not all equal.
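
A numerical check is not a proof, but it makes the chain concrete. This sketch uses the Python standard library `statistics` module and assumes positive inputs; `pythagorean_chain` is a hypothetical helper name.

```python
import statistics

def pythagorean_chain(values):
    """Return (min, HM, GM, AM, max) for a list of positive numbers."""
    return (
        min(values),
        statistics.harmonic_mean(values),
        statistics.geometric_mean(values),
        statistics.mean(values),
        max(values),
    )

chain = pythagorean_chain([2.0, 4.0, 8.0])
is_sorted = all(a <= b for a, b in zip(chain, chain[1:]))
```

For inputs that are equal or nearly equal, floating-point rounding can make adjacent entries differ in the last digit, so a tolerance is advisable when testing the equality case.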

a proof?
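
For n = 2 and positive values a and b, a short argument is available. This is only a sketch of the two-value case; the general case needs more work, for example induction or Jensen's inequality applied to the logarithm. Expanding a square gives GM ≤ AM:

\begin{align} 0 \leq \left(\sqrt{a} - \sqrt{b}\right)^2 = a + b - 2\sqrt{ab} \;\;\Longrightarrow\;\; \sqrt{ab} \leq \frac{a + b}{2} \end{align}

Applying the same inequality to 1/a and 1/b and then taking reciprocals gives HM ≤ GM:

\begin{align} \sqrt{\frac{1}{ab}} \leq \frac{1}{2}\left(\frac{1}{a} + \frac{1}{b}\right) \;\;\Longrightarrow\;\; \frac{2}{\frac{1}{a} + \frac{1}{b}} \leq \sqrt{ab} \end{align}

In both steps equality holds exactly when a = b.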


The chain of inequalities holds for every n, not just for n = 2.


Suppose that we want to compare two companies by their revenue and their consumer satisfaction score. The revenue is expressed in dollars, whereas the satisfaction score is a number between zero and 100. For each company, we have a "population" of scores.

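If the arithmetic mean is used to compare the companies, then the company with the higher revenue will almost always get the higher score, because the revenue figures dwarf the satisfaction scores. The problem is that the scores aren't normalized. We could normalize the scores, but it might be better to use the geometric mean: rescaling one of the scores multiplies every company's geometric mean by the same factor, so the ranking does not depend on the units.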
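
The effect of units on the ranking can be illustrated with invented figures. Everything in this sketch (the two companies, their numbers, and the helper `winner`) is hypothetical.

```python
import statistics

# Hypothetical figures: (revenue in dollars, satisfaction score out of 100).
companies = {"A": (2_000_000.0, 60.0), "B": (1_500_000.0, 90.0)}

def winner(scores, mean):
    """Name of the company whose scores have the largest mean."""
    return max(scores, key=lambda name: mean(scores[name]))

# The same data with revenue expressed in millions of dollars.
in_millions = {name: (rev / 1e6, sat) for name, (rev, sat) in companies.items()}

am_dollars = winner(companies, statistics.mean)
am_millions = winner(in_millions, statistics.mean)
gm_dollars = winner(companies, statistics.geometric_mean)
gm_millions = winner(in_millions, statistics.geometric_mean)
```

With these numbers the arithmetic mean picks A when revenue is in dollars and B when it is in millions, while the geometric mean picks B either way.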

geometric mean and averaging interest rates


Suppose that class A has 20 students and gets an average score of 80 on a test. Class B has 30 students and gets an average score of 90 on the same test. Then the overall average score is the weighted arithmetic mean:

\begin{align} \frac{20 \cdot 80 + 30 \cdot 90}{20 + 30} = \frac{4300}{50} = 86 \end{align}

A slightly different way to calculate the weighted arithmetic mean uses the fact that class A is 0.4 of the population and class B is 0.6 of the population:

\begin{align} 0.4 \cdot 80 + 0.6 \cdot 90 = 32 + 54 = 86 \end{align}

The latter method of calculation shows that the weighted arithmetic mean lies in the convex hull of the arithmetic means for the groups, that is, between the smallest and the largest group mean.
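
Both calculations are the same function applied to different weights. A minimal sketch, where `weighted_mean` is an illustrative helper rather than a library function:

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean; the weights need not sum to one."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

class_means = [80, 90]
by_counts = weighted_mean(class_means, [20, 30])      # weights are class sizes
by_shares = weighted_mean(class_means, [0.4, 0.6])    # weights are population shares
```

Dividing by the sum of the weights is what makes the two weightings interchangeable.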


Why the harmonic mean is better than the arithmetic mean.

The family of F-measures. Which one to choose if we have a cost on false positives and false negatives.
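
As a concrete reference point for these notes, one common definition of the F-measure is a weighted harmonic mean of precision P and recall R, F_β = (1 + β²)·P·R / (β²·P + R). The sketch below assumes that definition; the names `f_beta`, `precision`, `recall`, and the example values are illustrative and do not come from the text above.

```python
def f_beta(precision, recall, beta=1.0):
    """F-measure: a weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

precision, recall = 0.9, 0.1
arithmetic = (precision + recall) / 2          # hides the poor recall
f1 = f_beta(precision, recall)                 # harmonic mean, pulled toward the lower value
f2 = f_beta(precision, recall, beta=2.0)       # weights recall more heavily
f_half = f_beta(precision, recall, beta=0.5)   # weights precision more heavily
```

Under this definition, a β above 1 suits settings where false negatives are the more costly error, and a β below 1 suits settings where false positives are.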


The median is a quantile. It is the same as the 2nd quartile, the 5th decile, and the 50th percentile.

Quantiles can be computed by sorting the population. They are well defined for any ordinal values.

If the number of values in the population is even, the median is usually defined, at least for interval data, as the arithmetic mean of the two middle values.

On interval data, the midrange is the arithmetic mean of the minimum and the maximum. The trimean is a weighted arithmetic mean of the 1st, 2nd, and 3rd quartiles in which the median counts twice: (Q1 + 2·Q2 + Q3)/4.
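
These quantile-based measures can be sketched with the standard library. The trimean here follows Tukey's definition, (Q1 + 2·Q2 + Q3)/4, and the quartiles use the `inclusive` method of `statistics.quantiles`; other quartile conventions give slightly different values. The helper names and the data are made up.

```python
import statistics

def midrange(values):
    """Arithmetic mean of the minimum and the maximum."""
    return (min(values) + max(values)) / 2

def trimean(values):
    """Tukey's trimean, (Q1 + 2*Q2 + Q3) / 4; quartile conventions vary."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4

data = [1, 2, 3, 4, 5, 6, 7, 8, 100]   # one large outlier
summary = (statistics.median(data), midrange(data), trimean(data))
```

On this data the median and the trimean are both 5, while the midrange is 50.5, which shows how strongly the midrange depends on the extreme values.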


The mode is the only measure of central tendency for nominal data.

Some data sets lack a unique mode.


The frequency of a value a in a population X is how often that value occurs in the population.

If X is a multiset, the multiplicity of a is how often it occurs in X. The notation for this is

\begin{equation} 1_X(a) \end{equation}


The arg max of a function f is the set of values where the maximum value of the function is attained. The set can be empty even if f is bounded from above; e.g.

\begin{align} f(x) = x\;\;\;\;x \in (-\infty, 0) \end{align}

The set can have more than one value, such as for a constant function.

The mode is the arg max of the frequency.
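
Because the arg max is a set, a mode function can return every value that attains the top frequency. A minimal sketch using `collections.Counter`; the function name `modes` is an illustrative choice.

```python
from collections import Counter

def modes(values):
    """All values whose frequency is maximal (the arg max of the frequency)."""
    counts = Counter(values)          # value -> multiplicity
    if not counts:
        return set()
    top = max(counts.values())
    return {value for value, count in counts.items() if count == top}

unique = modes(["red", "blue", "red", "green"])
tied = modes([1, 1, 2, 2, 3])
```

The first call returns a single mode and the second returns two, which is the case of a data set without a unique mode. The values only need to be hashable, so this works for nominal data.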

Given some observed data and a distribution with unknown parameters, the maximum likelihood estimate of the parameters is the arg max of the likelihood function: the probability density function evaluated at the observed data values and treated as a function of the parameters instead of the data values.
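
A small discrete example shows the idea, with a probability mass function in place of a density. The sketch assumes independent 0/1 observations with unknown success probability p and finds the arg max by a crude grid search; the data and the function name are invented for illustration.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-likelihood of i.i.d. 0/1 observations under success probability p."""
    return sum(math.log(p) if x == 1 else math.log(1.0 - p) for x in data)

data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]             # made-up observations: 7 successes in 10
grid = [i / 1000 for i in range(1, 1000)]          # candidate values of p in (0, 1)
p_hat = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))
```

The grid search lands on the observed success rate, 0.7, which is the known closed-form answer for this model. Maximizing the logarithm of the likelihood gives the same arg max as maximizing the likelihood itself, because the logarithm is increasing.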


quadratic mean
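
The quadratic mean, also called the root mean square, is not defined in these notes. The standard definition, written in the notation used for the other means, is:

\begin{align} \;\;\;\;\mathrm{QM}(x_1, ..., x_n) \;=\; \sqrt{\frac{\sum_{i=1}^n \; x_i^2}{n}} \end{align}

It is well defined for all real values, and for non-negative values it satisfies AM ≤ QM ≤ max.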


The arithmetic mean of vectors is called a centroid. It is also the vector formed from the arithmetic means of the components.


The medoid of a population of vectors is the vector for which the sum of the distances to all the other vectors is minimal. The medoid is always a member of the population; the centroid is often not.
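
The difference between the two can be seen on a few points in the plane. This is a brute-force sketch, quadratic in the number of points; the helper names and the data are illustrative, and Euclidean distance via `math.dist` is an assumption.

```python
import math

def centroid(points):
    """Component-wise arithmetic mean of equal-length vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def medoid(points, distance=math.dist):
    """Member of `points` minimising the total distance to all the points."""
    return min(points, key=lambda p: sum(distance(p, q) for q in points))

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
c = centroid(points)   # (2.75, 3.0), not one of the points
m = medoid(points)     # (1.0, 0.0), always one of the points
```

Since `medoid` only needs a distance function, it can be used on data where averaging makes no sense, provided a suitable `distance` is supplied.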

medoid defined when centroid isn't…


Measures of central tendency are canonical examples of statistics: functions that map a collection of values to a single value.


An estimator is a function defined on the set of samples of a population. An estimate is the value the estimator assigns to a specific sample.

However, must an estimator be associated with a population or distribution statistic?


Population as multiset instead of set…


Instead of a distribution, it


Can be drawn from a population or a distribution.


The only assumptions of the classical central limit theorem are that the random variables in the sample are independent and identically distributed and that their variance is finite.

If the mean and variance of the distribution are μ and σ², then the mean and variance of the sample mean are μ and σ²/n.
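
A simulation can make the σ²/n scaling visible. The sketch below draws repeated samples from Uniform(0, 1), which has mean 1/2 and variance 1/12; the sample size, the number of trials, and the seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)                 # fixed seed so the run is repeatable
n, trials = 25, 4000

# Mean of each of `trials` independent samples of size n.
sample_means = [
    statistics.fmean([rng.random() for _ in range(n)]) for _ in range(trials)
]
mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)
expected_var = (1 / 12) / n            # sigma^2 / n for Uniform(0, 1)
```

The simulated values should be close to 1/2 and to `expected_var`, but not equal to them, since they are themselves estimates.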


Distributions are usually defined by a small number of parameters.


The expected value of the estimator is the same as the true value of the statistic.
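
The standard illustration of this property is the variance: dividing the sum of squared deviations by n gives an estimator whose expected value is too small, while dividing by n - 1 does not. The simulation below is a sketch under arbitrary choices of distribution (Uniform(0, 1), variance 1/12), sample size, trial count, and seed.

```python
import random

def squared_deviations(sample):
    """Sum of squared deviations from the sample mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)

rng = random.Random(1)
n, trials = 5, 20000
true_variance = 1 / 12                      # variance of Uniform(0, 1)

total_n, total_n_minus_1 = 0.0, 0.0
for _ in range(trials):
    ss = squared_deviations([rng.random() for _ in range(n)])
    total_n += ss / n                       # divides by n: biased low
    total_n_minus_1 += ss / (n - 1)         # divides by n - 1: unbiased

avg_biased = total_n / trials
avg_unbiased = total_n_minus_1 / trials
```

Averaged over many samples, the n - 1 version should land near `true_variance` and the n version near (n - 1)/n times that value.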


Look at the change from making a large change to one value in the sample.
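
A tiny experiment along these lines compares how far the mean and the median move when a single value is replaced by a gross error. The numbers are made up.

```python
import statistics

sample = [10, 11, 12, 13, 14]
corrupted = [10, 11, 12, 13, 1400]     # one value replaced by a gross error

mean_shift = statistics.mean(corrupted) - statistics.mean(sample)
median_shift = statistics.median(corrupted) - statistics.median(sample)
```

Here the mean moves by about 277 while the median does not move at all.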

What about a sample drawn from two samples with different variance?


Does the estimate contain all information in the sample relevant to computing the statistic?


