Many measured quantities concentrate around a mean value.
But not all measured quantities concentrate nicely around a mean.
Some vary over an enormous range (sometimes many orders of magnitude). Example: the populations of 2700 US cities.
America’s total population of 300 million people could be allocated to 40 cities the size of New York. But America’s 2700 cities cannot have a mean population of more than 110,000.
Source: Newman (2006)
A histogram of city sizes plotted with logarithmic horizontal and vertical axes follows quite closely a straight line.
Source: Newman (2006)
A power law implies that small values are very common and large values are quite rare.
(See Mathematica notebook.)
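A power-law density $p(x) = C x^{-a}$ satisfies $\ln p = \ln C - a \ln x$, which is why it plots as a straight line on log-log axes. The following minimal sketch checks this numerically; the constant `C`, the exponent `a`, and the grid are arbitrary illustrative choices, not values taken from the city data.

```python
import numpy as np

# Illustrative (assumed) parameters; not fitted to any real data.
C, a = 2.0, 2.5
x = np.logspace(0, 4, 50)      # 1 ... 10^4, evenly spaced on a log scale
p = C * x**(-a)                # power-law density, up to normalization

# Fit a straight line to (log x, log p); the slope should recover -a.
slope, intercept = np.polyfit(np.log(x), np.log(p), 1)
print(slope, np.exp(intercept))
```

The fitted slope is the negative of the power-law exponent, and the exponentiated intercept is the prefactor.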
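For $x \ge x_m$, where $x_m > 0$ is the minimum (cutoff) and $\alpha > 0$ is the Pareto index, we have
$$F(x) = \Pr(X \le x) = 1 - \left(\frac{x_m}{x}\right)^{\alpha}$$
Differentiating w.r.t. $x$, the probability density function is
$$f(x) = \frac{\alpha\, x_m^{\alpha}}{x^{\alpha+1}}, \qquad x \ge x_m$$
If $\alpha \le 1$, the expected value is infinite. If $\alpha > 1$ then
$$E[X] = \frac{\alpha\, x_m}{\alpha - 1}$$
If $\alpha \le 2$, the variance does not exist. If $\alpha > 2$ then
$$\operatorname{Var}(X) = \left(\frac{x_m}{\alpha - 1}\right)^{2} \frac{\alpha}{\alpha - 2}$$
The raw moments are
$$\mu_n' = E[X^n] = \frac{\alpha\, x_m^{n}}{\alpha - n}$$
A finite $n$-th moment exists only for $n < \alpha$.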
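The Pareto formulas for the survival function $(x_m/x)^{\alpha}$, the density, and the moments can be collected into a few small helpers. This is a sketch only: the function names are hypothetical, and returning `inf` when a moment fails to exist is a convention chosen here for convenience.

```python
import numpy as np

def pareto_sf(x, xm, alpha):
    """Survival function Pr(X > x) = (xm/x)**alpha for x >= xm, else 1."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= xm, (xm / x)**alpha, 1.0)

def pareto_pdf(x, xm, alpha):
    """Density alpha * xm**alpha / x**(alpha + 1) for x >= xm, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= xm, alpha * xm**alpha / x**(alpha + 1), 0.0)

def pareto_raw_moment(n, xm, alpha):
    """E[X**n]; finite only when n < alpha (inf is returned otherwise)."""
    return alpha * xm**n / (alpha - n) if n < alpha else np.inf

def pareto_mean(xm, alpha):
    return pareto_raw_moment(1, xm, alpha)

def pareto_var(xm, alpha):
    """Variance; inf is returned for alpha <= 2 as a convention."""
    if alpha <= 2:
        return np.inf
    return (xm / (alpha - 1))**2 * alpha / (alpha - 2)

print(pareto_mean(1.0, 3.0), pareto_var(1.0, 3.0))
```

A quick consistency check is that the second raw moment minus the squared mean equals the variance.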
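Here is a cool property of the Pareto variate. For a threshold $x_1 > x_m$, the conditional probability
$$\Pr(X > x \mid X \ge x_1) = \left(\frac{x_1}{x}\right)^{\alpha}, \qquad x \ge x_1$$
is also a Pareto distribution. It has the same Pareto index $\alpha$ but has cutoff $x_1$.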
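This conditioning property can be checked by simulation: keep only the draws at or above a threshold and compare their empirical survival function with $(x_1/t)^{\alpha}$. The parameter values, seed, and test points in this sketch are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)            # seed chosen arbitrarily
xm, alpha, x1 = 1.0, 1.5, 3.0             # illustrative values (assumed)

# Inverse-transform sample of Pareto(xm, alpha); 1 - random() lies in (0, 1].
u = 1.0 - rng.random(10**6)
x = xm * u**(-1.0 / alpha)

tail = x[x >= x1]                         # condition on X >= x1

# Empirical vs. theoretical Pr(X > t | X >= x1) = (x1/t)**alpha
t = np.array([4.0, 6.0, 12.0])
empirical = np.array([(tail > ti).mean() for ti in t])
theoretical = (x1 / t)**alpha
print(empirical)
print(theoretical)
```

Up to sampling noise, the two printed rows should agree, which is what "same index, new cutoff" means.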
(selected from Wikipedia)
The 80-20 law holds when the Pareto index is $\alpha = \log_4 5 \approx 1.161$.
(Excludes $0 < \alpha \le 1$, which implies infinite expected value and thus cannot model income distribution.)
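As a quick check of the 80-20 value: for a Pareto distribution with $\alpha > 1$, the richest fraction $q$ of the population holds the share $q^{1 - 1/\alpha}$ of the total (this follows from the Pareto Lorenz curve $L(p) = 1 - (1-p)^{1-1/\alpha}$). The helper name `top_share` below is hypothetical.

```python
import math

alpha = math.log(5) / math.log(4)         # log_4(5)

def top_share(q, alpha):
    """Share of the total held by the richest fraction q (requires alpha > 1)."""
    return q**(1.0 - 1.0 / alpha)

print(alpha, top_share(0.2, alpha))
```

With this index, the top 20% hold 80% of the total, up to floating-point rounding.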
The Pareto distribution is related to the exponential distribution as follows. If $X$ is Pareto-distributed with minimum $x_m$ and index $\alpha$, then
$$Y = \ln\left(\frac{X}{x_m}\right)$$
is exponentially distributed with intensity $\alpha$. Equivalently, if $Y$ is exponentially distributed with intensity $\alpha$, then
$$X = x_m e^{Y}$$
is Pareto-distributed with minimum $x_m$ and index $\alpha$.
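The exponential relationship gives a second way to simulate Pareto variates. The sketch below draws exponential variates with intensity $\alpha$, maps them through $x_m e^{Y}$, and compares the result with the Pareto survival function $(x_m/t)^{\alpha}$. Parameter values and the seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)            # seed chosen arbitrarily
xm, alpha = 2.0, 1.5                      # illustrative values (assumed)

# NumPy parameterizes the exponential by its scale = 1 / intensity.
y = rng.exponential(scale=1.0 / alpha, size=10**6)
x = xm * np.exp(y)                        # should be Pareto(xm, alpha)

t = np.array([3.0, 5.0, 20.0])
empirical = np.array([(x > ti).mean() for ti in t])
theoretical = (xm / t)**alpha
print(empirical)
print(theoretical)
```

Note the parameterization: an intensity of $\alpha$ corresponds to `scale=1/alpha`, so the exponential variate has mean $1/\alpha$.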
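Let $f$ be any pdf with finite mean $\mu$. Then the associated Lorenz curve, expressed through the quantile function $F^{-1}$, can be written as
$$L(p) = \frac{\int_{-\infty}^{F^{-1}(p)} x\, f(x)\, dx}{\mu}$$
Here $F^{-1}$ is the inverse of the CDF $F$. Recall $F(x)$ is the fraction of the population with wealth no bigger than $x$, so $F^{-1}(p)$ gives the maximum wealth found in the poorest fraction $p$ of the population. So the numerator is the wealth held by this subpopulation, per member of the whole population. The denominator is the average wealth of the population.
Correspondingly, for any cdf $F$ the Lorenz curve can be written as
$$L(p) = \frac{1}{\mu} \int_0^{p} F^{-1}(t)\, dt$$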
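For a finite sample, the Lorenz curve has a simple discrete analogue: sort the values, accumulate them, and divide by the total. The helper below is a minimal sketch (the name `lorenz_curve` is hypothetical) and assumes nonnegative values with a positive total.

```python
import numpy as np

def lorenz_curve(wealth):
    """Return (p, L): population fractions and cumulative wealth shares."""
    w = np.sort(np.asarray(wealth, dtype=float))    # poorest first
    cum = np.cumsum(w)
    p = np.arange(0, w.size + 1) / w.size           # 0, 1/n, ..., 1
    L = np.concatenate(([0.0], cum / cum[-1]))      # starts at 0, ends at 1
    return p, L

p, L = lorenz_curve([4, 1, 2, 1])
print(p)
print(L)
```

When every member holds the same amount, the returned `L` coincides with `p`, which is the equidistribution line.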
Recall that the Gini coefficient measures the deviation of the Lorenz curve from perfect equality as twice the area between the Lorenz curve and the equidistribution line.
See Aaberge (2005) http://ideas.repec.org/p/ssb/dispap/491.html
For the Pareto distribution, assuming a finite mean (i.e., $\alpha > 1$), the quantile function is
$$F^{-1}(p) = \frac{x_m}{(1 - p)^{1/\alpha}}$$
So the Lorenz curve is calculated to be
$$L(p) = 1 - (1 - p)^{1 - 1/\alpha}$$
Compare Mathematica figure.
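Note: the Gini coefficient of the Pareto distribution is $G = 1/(2\alpha - 1)$, so $\alpha \to \infty$ gives Gini $\to 0$, and $\alpha \to 1$ gives Gini $\to 1$.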
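The closed forms for the Pareto case can be checked numerically: integrating the quantile function $x_m (1-p)^{-1/\alpha}$ and dividing by the mean should reproduce $L(p) = 1 - (1-p)^{1-1/\alpha}$, and twice the area between the equidistribution line and the Lorenz curve should give the Gini coefficient $1/(2\alpha - 1)$. The sketch uses a plain midpoint rule; function names, parameter values, and grid size are assumptions.

```python
import numpy as np

def pareto_quantile(p, xm, alpha):
    return xm * (1.0 - p)**(-1.0 / alpha)

def pareto_lorenz(p, alpha):
    return 1.0 - (1.0 - p)**(1.0 - 1.0 / alpha)

def pareto_gini(alpha):
    return 1.0 / (2.0 * alpha - 1.0)

xm, alpha = 1.0, 3.0                      # illustrative values (assumed)
mean = alpha * xm / (alpha - 1.0)

# Midpoints of n equal cells on (0, 1), used for midpoint-rule integration.
n = 200_000
mid = (np.arange(n) + 0.5) / n

# Lorenz curve at p0: integrate the quantile function over (0, p0), divide by the mean.
p0 = 0.8
lorenz_numeric = pareto_quantile(mid * p0, xm, alpha).sum() * (p0 / n) / mean

# Gini: twice the area between the line L = p and the Lorenz curve.
gini_numeric = 2.0 * (mid - pareto_lorenz(mid, alpha)).sum() / n

print(lorenz_numeric, pareto_lorenz(p0, alpha))
print(gini_numeric, pareto_gini(alpha))
```

Each printed pair should agree to several decimal places.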
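The likelihood function for the Pareto index $\alpha$ and cutoff $x_m$, given a sample $x = (x_1, \dots, x_n)$, is
$$\mathcal{L}(\alpha, x_m) = \prod_{i=1}^{n} \frac{\alpha\, x_m^{\alpha}}{x_i^{\alpha+1}} = \alpha^{n} x_m^{n\alpha} \prod_{i=1}^{n} \frac{1}{x_i^{\alpha+1}}$$
Therefore, the loglikelihood function is
$$\ell(\alpha, x_m) = n \ln \alpha + n \alpha \ln x_m - (\alpha + 1) \sum_{i=1}^{n} \ln x_i$$
Estimating $x_m$: $\ell(\alpha, x_m)$ is strictly increasing in $x_m$. Since $x_m \le x_i$ for every $i$, pick
$$\hat{x}_m = \min_i x_i$$
To find the estimator for $\alpha$, take the corresponding partial derivative and find the $\alpha$ that makes it zero:
$$\frac{\partial \ell}{\partial \alpha} = \frac{n}{\alpha} + n \ln x_m - \sum_{i=1}^{n} \ln x_i = 0$$
Thus the maximum likelihood estimator for $\alpha$ is:
$$\hat{\alpha} = \frac{n}{\sum_{i=1}^{n} \left( \ln x_i - \ln \hat{x}_m \right)}$$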
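The two estimators, the sample minimum for $x_m$ and $n / \sum_i (\ln x_i - \ln \hat{x}_m)$ for $\alpha$, translate directly into code. In this sketch the function name `pareto_mle`, the true parameter values, and the seed are assumptions, and no guard is included for a degenerate sample whose values are all equal.

```python
import numpy as np

def pareto_mle(sample):
    """Return (xm_hat, alpha_hat) for a 1-D sample of positive values."""
    x = np.asarray(sample, dtype=float)
    xm_hat = x.min()                                      # MLE of the cutoff
    alpha_hat = x.size / np.sum(np.log(x) - np.log(xm_hat))
    return xm_hat, alpha_hat

rng = np.random.default_rng(42)           # seed chosen arbitrarily
xm_true, alpha_true = 2.0, 1.5            # illustrative values (assumed)
u = 1.0 - rng.random(10**5)               # U(0, 1]
sample = xm_true * u**(-1.0 / alpha_true) # inverse-transform Pareto draws

xm_hat, alpha_hat = pareto_mle(sample)
print(xm_hat, alpha_hat)
```

Because the sample minimum can never fall below the true cutoff, `xm_hat` is biased slightly upward; the bias shrinks as the sample grows.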
Use inverse transform sampling:
draw $U$ from a uniform distribution (on $(0,1]$) and produce the Pareto variate as
$$X = \frac{x_m}{U^{1/\alpha}}$$
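    import numpy as np

    nsamples = 10**6
    u = 1 - np.random.random(nsamples)    # U(0,1]
    # 2.5 is the exponent of the power-law pdf, so the Pareto index is 2.5 - 1 = 1.5; x_m = 1
    x = u**(-1.0/(2.5 - 1.0))             # Pareto distributed

    lastbin01 = 10
    bins01 = np.linspace(1, lastbin01, 91)    # bin size = 0.1
    ind01 = np.digitize(x, bins01)
    # minlength keeps the overflow slot even if no sample exceeds lastbin01
    freq01 = np.bincount(ind01, minlength=len(bins01) + 1)[1:-1]    # trim the unbinned
    relfreq01 = freq01 / float(nsamples)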
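One way to sanity-check such binned relative frequencies is to compare them with the exact bin probabilities $F(b_{k+1}) - F(b_k)$, where $F(x) = 1 - x^{-\alpha}$ for $x_m = 1$. The sketch below regenerates a sample with a seeded generator and uses `np.histogram` in place of the `digitize`/`bincount` pair; that substitution and the seed are choices made here, not part of the original snippet.

```python
import numpy as np

rng = np.random.default_rng(0)            # seed chosen arbitrarily
nsamples = 10**6
alpha = 2.5 - 1.0                         # Pareto index implied by the pdf exponent 2.5
x = (1.0 - rng.random(nsamples))**(-1.0 / alpha)    # Pareto with x_m = 1

bins = np.linspace(1, 10, 91)             # same 0.1-wide bins on [1, 10]
freq, _ = np.histogram(x, bins=bins)      # samples above 10 are ignored
relfreq = freq / nsamples

# Exact probability of each bin from the CDF F(x) = 1 - x**(-alpha)
cdf = 1.0 - bins**(-alpha)
expected = np.diff(cdf)

print(np.abs(relfreq - expected).max())
```

The largest deviation should be on the order of the sampling noise, roughly $\sqrt{p(1-p)/n}$ for a bin with probability $p$.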
Barry C. Arnold (1983). “Pareto Distributions”, International Co-operative Publishing House, Burtonsville, Maryland. ISBN 0-899974-012-1.
Christian Kleiber and Samuel Kotz (2003). Statistical Size Distributions in Economics and Actuarial Sciences, New York:Wiley. xi+332 pp. ISBN 0-471-15064-9.
Lorenz, M. O. (1905). "Methods of measuring the concentration of wealth". Publications of the American Statistical Association. 9: 209–219.
Pareto, Vilfredo, Cours d’Économie Politique: Nouvelle édition par G.-H. Bousquet et G. Busino, Librairie Droz, Geneva, 1964, pages 299–345.
Reed, William J. “The Pareto, Zipf and other power laws,” http://linkage.rockefeller.edu/wli/zipf/reed01_el.pdf
Aaberge, Rolf. “Gini's Nuclear Family”. In: International Conference to Honor Two Eminent Social Scientists, May 2005. http://www.unisi.it/eventi/GiniLorenz05/ http://www.unisi.it/eventi/GiniLorenz05/25%20may%20paper/PAPER_Aaberge.pdf
Michael Hardy (2010) "Pareto's Law", Mathematical Intelligencer, 32 (3), 38–43. doi: 10.1007/s00283-010-9159-2
M. E. J. Newman, 2005, “Power laws, Pareto distributions and Zipf's law” Contemporary Physics 46, pages 323–351. doi:10.1080/00107510500052444 http://arxiv.org/abs/cond-mat/0412004v3
Newman, M. E. J. (2006). “Power Laws, Pareto Distributions and Zipf’s Law”. arXiv preprint cond-mat/0412004v3.