Summarizing: [newman-2006-archiv:cond-mat]

Many measured quantities concentrate around a mean value.

- heights of human beings: adult human beings average about 180cm. The tallest and shortest adult men on record had heights of 272cm and 57cm (ratio 4.8).
- speeds in miles per hour of cars on British highways are strongly peaked around 75mph.

BUT not all measured quantities concentrate nicely around a mean.

Some vary over an enormous range (sometimes many orders of magnitude). E.g., the populations of 2700 US cities.

- largest population of any city in the US is 8.00 million for New York City (2000)
- America’s smallest town is Duffield, Virginia, with a population of 52
- the ratio of largest to smallest population is at least 150 000.

America’s total population of 300 million people could be allocated to 40 cities the size of New York. But America’s 2700 cities cannot have a mean population of more than 110,000.
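The two bounds above are simple divisions. A small sketch of the arithmetic, using the rounded figures quoted in these notes (the variable names are mine):

```python
total_population = 300e6   # approximate US population used in the text
nyc_population = 8.00e6    # New York City, 2000
n_cities = 2700

# How many New-York-sized cities would absorb the whole population?
cities_of_nyc_size = total_population / nyc_population
# Upper bound on the mean city size if everyone lived in one of the 2700 cities.
max_mean_city_size = total_population / n_cities

print(round(cities_of_nyc_size, 1))   # 37.5, which the text rounds to 40
print(round(max_mean_city_size))      # 111111, which the text rounds to 110,000
```

So the largest city is roughly 70 times bigger than the largest possible mean, which is why the mean says little about a typical city here.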

Source: Newman (2006)

A histogram of city sizes plotted with logarithmic horizontal and vertical axes follows quite closely a straight line.

Source: Newman (2006)

$$\begin{array}{c}\ln y=-a\ln x+c\end{array}$$

Exponentiating both sides gives a power law:

$$\begin{array}{c}f\left(x\right)=C{x}^{-a},\quad C={e}^{c}\end{array}$$

A power law implies that small values of $x$ are very common and large values are quite rare.
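To see the link between a power law and a straight line on log-log axes, the sketch below evaluates $y=C{x}^{-a}$ at a few points and fits a least-squares line to $(\ln x,\ln y)$. The values of `a` and `C` are hypothetical, chosen only for illustration, and `C` is my label for the constant prefactor.

```python
import math

def loglog_fit(xs, ys):
    """Least-squares slope and intercept of ln(y) regressed on ln(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    slope = sxy / sxx
    return slope, my - slope * mx

a, C = 2.3, 5.0                      # hypothetical exponent and prefactor
xs = [1, 2, 5, 10, 50, 100, 1000]
ys = [C * x ** (-a) for x in xs]

slope, intercept = loglog_fit(xs, ys)
print(slope)                 # close to -a
print(math.exp(intercept))   # close to C
```

With exact power-law data the fitted slope recovers $-a$ up to rounding. With real, noisy histograms a fit like this is only a rough diagnostic.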

(See Mathematica notebook.)

- power law probability distribution
- named after Vilfredo Pareto (1848-1923, Italian economist)
- also known as Bradford distribution

Note: these notes draw on http://en.wikipedia.org/wiki/Pareto_distribution and http://www.math.uah.edu/stat/special/Pareto.pdf

For $x\ge {x}_{m}$ we have

$$\begin{array}{c}Pr(X>x)={\left(\frac{{x}_{m}}{x}\right)}^{\alpha}\end{array}$$

- parameterized by ${x}_{m}$ and $\alpha $, where ${x}_{m}>0$ is the cutoff (minimum possible realization), and $\alpha >0$ is called the Pareto index or *shape* parameter.
- we can define the probability function below the cutoff: $Pr(X>x)=1$ for $x<{x}_{m}$
- the *basic* Pareto distribution sets ${x}_{m}=1$
- interpret $Pr(X>x)$ as the fraction of the population with wealth greater than $x$.

For $x\ge {x}_{m}$, the cumulative distribution function is

$$\begin{array}{c}{F}_{X}\left(x\right)=1-{\left(\frac{{x}_{m}}{x}\right)}^{\alpha}\end{array}$$

Differentiating w.r.t. $x$, the probability density function is

$$\begin{array}{c}{f}_{X}\left(x\right)=\alpha \frac{{x}_{m}^{\alpha}}{{x}^{\alpha +1}}\end{array}$$
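The survival function, CDF, and density above translate directly into code. A minimal Python sketch follows (the function names are mine, not from any library). It also checks numerically that the density is the derivative of the CDF at one point.

```python
def pareto_sf(x, xm, alpha):
    """Pr(X > x); equals 1 below the cutoff xm."""
    return 1.0 if x < xm else (xm / x) ** alpha

def pareto_cdf(x, xm, alpha):
    """F_X(x) = 1 - Pr(X > x)."""
    return 1.0 - pareto_sf(x, xm, alpha)

def pareto_pdf(x, xm, alpha):
    """Density; zero below the cutoff xm."""
    return 0.0 if x < xm else alpha * xm ** alpha / x ** (alpha + 1)

# Central-difference derivative of the CDF at x = 2 (illustrative parameters).
xm, alpha, x, h = 1.0, 1.5, 2.0, 1e-6
numeric = (pareto_cdf(x + h, xm, alpha) - pareto_cdf(x - h, xm, alpha)) / (2 * h)
print(numeric, pareto_pdf(x, xm, alpha))   # the two values should agree closely
```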

If $\alpha \le 1$, the expected value is infinite. If $\alpha >1$ then

$$\begin{array}{c}E\left(X\right)=\frac{\alpha {x}_{m}}{\alpha -1}\end{array}$$

If $\alpha \le 2$, the variance does not exist. If $\alpha >2$ then

$$\begin{array}{c}var\left(X\right)={\left(\frac{{x}_{m}}{\alpha -1}\right)}^{2}\frac{\alpha}{\alpha -2}.\end{array}$$

The raw moments are

$$\begin{array}{c}{\mu}_{n}^{\prime}=\frac{\alpha {x}_{m}^{n}}{\alpha -n}\end{array}$$

A finite $n$-th moment exists only for $n<\alpha $.
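The raw-moment formula can be checked by integrating ${x}^{n}$ against the density. A short derivation in the notation of these notes, valid for $n<\alpha $:

$$\begin{aligned}
{\mu}_{n}^{\prime}=E\left({X}^{n}\right) &= \int_{x_m}^{\infty} x^{n}\,\alpha \frac{x_m^{\alpha}}{x^{\alpha+1}}\,dx
= \alpha x_m^{\alpha}\int_{x_m}^{\infty} x^{n-\alpha-1}\,dx \\
&= \alpha x_m^{\alpha}\left[\frac{x^{n-\alpha}}{n-\alpha}\right]_{x_m}^{\infty}
= \frac{\alpha x_m^{n}}{\alpha-n}
\end{aligned}$$

Setting $n=1$ recovers $E\left(X\right)$, and ${\mu}_{2}^{\prime}-{\left({\mu}_{1}^{\prime}\right)}^{2}$ reproduces the variance formula when $\alpha >2$. For $n\ge \alpha $ the integral diverges.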

Here is a cool property of the Pareto variate. For ${x}_{1}\ge {x}_{m}$, the conditional probability

$$\begin{array}{c}Pr(X>x\mid X>{x}_{1})\end{array}$$

is also a Pareto distribution. It has the same Pareto index $\alpha $ but has cutoff ${x}_{1}$.
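Why the index is unchanged: a one-line derivation from the definition of conditional probability, for $x\ge {x}_{1}\ge {x}_{m}$:

$$\begin{aligned}
Pr(X>x\mid X>x_1)=\frac{Pr(X>x)}{Pr(X>x_1)}
=\frac{(x_m/x)^{\alpha}}{(x_m/x_1)^{\alpha}}
=\left(\frac{x_1}{x}\right)^{\alpha}
\end{aligned}$$

The original cutoff ${x}_{m}$ cancels, leaving a Pareto survival function with cutoff ${x}_{1}$ and the same $\alpha $.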

(selected from Wikipedia)

- sizes of human settlements (few cities, many hamlets/villages)
- file size distribution of internet traffic
- values of oil reserves in oil fields (a few large fields, many small fields)
- length distribution of jobs assigned to supercomputers (a few large ones, many small ones)
- areas burnt in forest fires

- Wealth and income distribution
- pdf shows that the "probability" or fraction of the population that owns a small amount of wealth per person is rather high, and then decreases steadily as wealth increases.
- Pareto used this to describe the distribution of wealth and of income among individuals
- most wealth of any society is owned by a small percentage of the people in that society
- Pareto principle: "80-20 rule" (20% of the population controls 80% of the wealth)

The 80-20 law holds when the Pareto index is $\alpha ={\log}_{4}5$.

- 80-20 law
- 20% of all people receive 80% of all income
- make a group of the top 20% → 20% of the group receive 80% of its income
- recursively

(Excludes $0<\alpha \le 1$, which implies infinite expected value and thus cannot model income distribution.)
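The 80-20 claim and its recursive version can be checked numerically. The sketch below assumes the Pareto Lorenz curve $L\left(F\right)=1-(1-F{)}^{1-1/\alpha}$ given later in these notes, from which the income share of the richest fraction $p$ is $1-L(1-p)={p}^{1-1/\alpha}$. The function name `top_share` is mine.

```python
import math

alpha = math.log(5) / math.log(4)   # log_4(5), roughly 1.161

def top_share(p, alpha):
    """Share of total income received by the richest fraction p (needs alpha > 1)."""
    return p ** (1.0 - 1.0 / alpha)

print(top_share(0.20, alpha))   # about 0.80: the richest 20% receive 80%
print(top_share(0.04, alpha))   # about 0.64: the richest 20% of the richest 20%
```

The second line is the recursion: the top 4% receive 64% of all income, which is 80% of the 80% received by the top 20%.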

**Zipf's law**

- the zeta distribution
- discrete analog of Pareto distribution
- the $n$-th largest value of $X$ is approximately $C{n}^{-k}$

The Pareto distribution is related to the exponential distribution as follows. If $X$ is Pareto-distributed with minimum ${x}_{m}$ and index $\alpha $, then

$$\begin{array}{c}Y=\ln \left(\frac{X}{{x}_{m}}\right)\end{array}$$

is exponentially distributed with intensity $\alpha $. Equivalently, if $Y$ is exponentially distributed with intensity $\alpha $, then

$$\begin{array}{c}{x}_{m}{e}^{Y}\end{array}$$

is Pareto-distributed with minimum ${x}_{m}$ and index $\alpha $.
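A quick simulation illustrates the exponential-to-Pareto direction. This is a sketch with illustrative parameter values and a fixed seed, using only the standard library. It draws exponential variates with rate $\alpha $, transforms them, and compares the empirical tail with ${\left({x}_{m}/x\right)}^{\alpha}$.

```python
import math
import random

rng = random.Random(0)             # fixed seed so the run is repeatable
xm, alpha, n = 2.0, 1.5, 200_000   # illustrative values

# Y ~ Exponential(rate alpha), then X = xm * exp(Y).
xs = [xm * math.exp(rng.expovariate(alpha)) for _ in range(n)]

for x in (3.0, 5.0, 10.0):
    empirical = sum(v > x for v in xs) / n
    theoretical = (xm / x) ** alpha
    print(x, round(empirical, 4), round(theoretical, 4))
```

The empirical and theoretical tail fractions should agree to within sampling error, which is about 0.001 at this sample size.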

Let $f$ be any pdf with support starting at ${x}_{m}$ and finite mean $\mu ={\int }_{{x}_{m}}^{\infty }xf\left(x\right)dx$. Then the associated Lorenz curve $L\left(F\right)$ can be written as

$$\begin{array}{c}L\left(F\right)=\frac{\underset{{x}_{m}}{\overset{x\left(F\right)}{\int}}xf\left(x\right)dx}{\mu}\end{array}$$

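Here $x\left(F\right)$ is the inverse of the CDF (the quantile function). Recall $F\left(x\right)$ is the fraction of the population with wealth no bigger than $x$, so $x\left(F\right)$ gives the maximum wealth found in the poorest fraction $F$ of the population. So the numerator is this subpopulation's total wealth, measured per member of the whole population, and the denominator is the average wealth of the population. Their ratio $L\left(F\right)$ is the share of total wealth held by the poorest fraction $F$.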

Correspondingly, for any cdf $F$ the Lorenz curve $L\left(F\right)$ can be written as

$$\begin{array}{c}L\left(F\right)=\frac{{\int }_{0}^{F}x({F}^{\prime })d{F}^{\prime }}{{\int }_{0}^{1}x({F}^{\prime })d{F}^{\prime }}\end{array}$$

Recall that the Gini coefficient measures the deviation of the Lorenz curve from perfect equality as twice the area between the Lorenz curve and the equidistribution line.

$$\begin{array}{c}G=1-2\underset{0}{\overset{1}{\int}}L\left(F\right)dF\end{array}$$

See Aaberge (2005) http://ideas.repec.org/p/ssb/dispap/491.html

For the Pareto distribution, assuming a finite mean (i.e., $\alpha >1$), the quantile function is

$$\begin{array}{c}x\left(F\right)=\frac{{x}_{m}}{(1-F{)}^{1/\alpha}}\end{array}$$

So the Lorenz curve is calculated to be

$$\begin{array}{c}L\left(F\right)=1-(1-F{)}^{1-1/\alpha}\end{array}$$
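The step from the quantile function to the Lorenz curve can be filled in as follows. This is a sketch using the CDF form of $L\left(F\right)$ and assuming $\alpha >1$:

$$\begin{aligned}
\int_{0}^{F} x(F')\,dF' &= x_m\int_{0}^{F}(1-F')^{-1/\alpha}\,dF'
= \frac{\alpha x_m}{\alpha-1}\left[1-(1-F)^{1-1/\alpha}\right] \\
L(F) &= \frac{\int_{0}^{F} x(F')\,dF'}{\int_{0}^{1} x(F')\,dF'}
= 1-(1-F)^{1-1/\alpha}
\end{aligned}$$

Setting $F=1$ in the first line gives the denominator $\alpha {x}_{m}/(\alpha -1)$, which is the mean $E\left(X\right)$, so the prefactor cancels in the ratio.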

Source: Wikipedia http://en.wikipedia.org/wiki/File:Pareto_distributionLorenz.png

Compare Mathematica figure.

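For the Pareto distribution with $\alpha >1$, the Gini coefficient is

$$\begin{array}{c}G=\frac{1}{2\alpha -1}\end{array}$$

Note: as $\alpha \to \infty $ the Gini coefficient tends to 0, and as $\alpha \to 1$ it tends to 1.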
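The closed form for $G$ can be checked by integrating the Pareto Lorenz curve numerically. The sketch below uses a plain midpoint rule. The function names and step count are my own choices.

```python
def lorenz_pareto(F, alpha):
    """Lorenz curve of the Pareto distribution (requires alpha > 1)."""
    return 1.0 - (1.0 - F) ** (1.0 - 1.0 / alpha)

def gini_numeric(alpha, steps=100_000):
    """G = 1 - 2 * integral of L(F) over [0, 1], by the midpoint rule."""
    h = 1.0 / steps
    area = sum(lorenz_pareto((i + 0.5) * h, alpha) for i in range(steps)) * h
    return 1.0 - 2.0 * area

for alpha in (1.5, 2.0, 5.0):
    # numeric value next to the closed form 1 / (2 * alpha - 1)
    print(alpha, gini_numeric(alpha), 1.0 / (2.0 * alpha - 1.0))
```

The two columns should agree to several decimal places. The agreement degrades slowly as $\alpha $ approaches 1, because the Lorenz curve becomes very steep near $F=1$.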

The likelihood function for the Pareto index and cutoff parameters, given a sample $x=({x}_{1},{x}_{2},\dots ,{x}_{n})$, is

$$\begin{array}{c}L(\alpha ,{x}_{m})=\prod _{i=1}^{n}\alpha \frac{{x}_{m}^{\alpha}}{{x}_{i}^{\alpha +1}}={\alpha}^{n}{x}_{m}^{n\alpha}\prod _{i=1}^{n}\frac{1}{{x}_{i}^{\alpha +1}}\end{array}$$

Therefore, the loglikelihood function is

$$\begin{array}{c}\ell (\alpha ,{x}_{m})=n\ln \alpha +n\alpha \ln {x}_{m}-(\alpha +1)\sum _{i=1}^{n}\ln {x}_{i}\end{array}$$

Estimating ${x}_{m}$: $\ell (\alpha ,{x}_{m})$ is strictly increasing in ${x}_{m}$. Since every observation must satisfy ${x}_{i}\ge {x}_{m}$, pick the largest admissible value:

$$\begin{array}{c}{\hat{x}}_{m}=\underset{i}{\min }\,{x}_{i}\end{array}$$

To find the estimator for $\alpha $, take the corresponding partial derivative and find $\alpha $ that makes it zero:

$$\begin{array}{c}\frac{\partial \ell}{\partial \alpha}=\frac{n}{\alpha}+n\ln {x}_{m}-\sum _{i=1}^{n}\ln {x}_{i}=0.\end{array}$$

Thus the maximum likelihood estimator for $\alpha $ is:

$$\begin{array}{c}\hat{\alpha}=\frac{n}{\sum _{i}\left(\ln {x}_{i}-\ln {\hat{x}}_{m}\right)}.\end{array}$$
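The two estimators are easy to try on simulated data. A minimal sketch, assuming an i.i.d. sample generated by the inverse transform $X={x}_{m}/{U}^{1/\alpha}$. The function name `pareto_mle`, the true parameter values, and the seed are illustrative choices of mine.

```python
import math
import random

def pareto_mle(sample):
    """Return (xm_hat, alpha_hat) for an i.i.d. Pareto sample."""
    xm_hat = min(sample)                     # MLE of the cutoff
    n = len(sample)
    alpha_hat = n / sum(math.log(x / xm_hat) for x in sample)
    return xm_hat, alpha_hat

rng = random.Random(1)                       # fixed seed for repeatability
xm_true, alpha_true = 1.0, 1.5               # illustrative "true" parameters
# 1 - rng.random() lies in (0, 1], so the division is always defined.
sample = [xm_true / (1.0 - rng.random()) ** (1.0 / alpha_true)
          for _ in range(100_000)]

xm_hat, alpha_hat = pareto_mle(sample)
print(xm_hat, alpha_hat)   # should land close to 1.0 and 1.5
```

Note that ${\hat{x}}_{m}$ can never fall below the true cutoff, so it is biased slightly upward. With a sample this large the effect is negligible.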

To generate a random sample, use inverse transform sampling:

draw $U$ from a uniform distribution on $(0,1]$ and produce the Pareto variate $X$ as

$$\begin{array}{c}X=\frac{{x}_{m}}{{U}^{1/\alpha}}\end{array}$$

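    import numpy as np

    # Inverse transform sampling with x_m = 1; the exponent 1/(2.5 - 1.0)
    # gives Pareto index alpha = 1.5 (density proportional to x**-2.5).
    nsamples = 10**6
    u = 1 - np.random.random(nsamples)  # U(0,1]
    x = u**(-1.0/(2.5 - 1.0))  # pareto distributed

    # Relative frequencies of the sample on [1, 10].
    lastbin01 = 10
    bins01 = np.linspace(1, lastbin01, 91)  # bin size = 0.1
    ind01 = np.digitize(x, bins01)
    # minlength keeps one count per digitize index, so the trim is always exact
    freq01 = np.bincount(ind01, minlength=len(bins01) + 1)[1:-1]  # trim the unbinned
    relfreq01 = freq01 / float(nsamples)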

Barry C. Arnold (1983). “Pareto Distributions”, International Co-operative Publishing House, Burtonsville, Maryland. ISBN 0-899974-012-1.

Christian Kleiber and Samuel Kotz (2003). Statistical Size Distributions in Economics and Actuarial Sciences, New York:Wiley. xi+332 pp. ISBN 0-471-15064-9.

Lorenz, M. O. (1905). "Methods of measuring the concentration of wealth". Publications of the American Statistical Association. 9: 209–219.

Pareto, Vilfredo, Cours d’Économie Politique: Nouvelle édition par G.-H. Bousquet et G. Busino, Librairie Droz, Geneva, 1964, pages 299–345.

Reed, William J. “The Pareto, Zipf and other power laws,” http://linkage.rockefeller.edu/wli/zipf/reed01_el.pdf

Aaberge, Rolf. “Gini's Nuclear Family”. In: International Conference to Honor Two Eminent Social Scientists, May 2005. http://www.unisi.it/eventi/GiniLorenz05/ http://www.unisi.it/eventi/GiniLorenz05/25%20may%20paper/PAPER_Aaberge.pdf

Michael Hardy (2010) "Pareto's Law", Mathematical Intelligencer, 32 (3), 38–43. doi: 10.1007/s00283-010-9159-2

M. E. J. Newman, 2005, “Power laws, Pareto distributions and Zipf's law” Contemporary Physics 46, pages 323–351. doi:10.1080/00107510500052444 http://arxiv.org/abs/cond-mat/0412004v3

[newman-2006-archiv:cond-mat] Newman, M E J. 2006. Power Laws, Pareto Distributions and Zipf’s Law. arXiv preprint cond-mat.stat-mech, 0412004v3.