Generalized Pareto distribution

In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location $\mu$, scale $\sigma$, and shape $\xi$.[1][2] Sometimes it is specified by only scale and shape[3] and sometimes only by its shape parameter. Some references give the shape parameter as $\kappa = -\xi$.[4]

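[Figure: probability density function. GPD distribution functions for $\mu = 0$ and different values of $\sigma$ and $\xi$]
[Figure: cumulative distribution function]

Parameters: $\mu \in (-\infty, \infty)$ location (real); $\sigma \in (0, \infty)$ scale (real); $\xi \in (-\infty, \infty)$ shape (real)
Support: $x \geq \mu$ for $\xi \geq 0$; $\mu \leq x \leq \mu - \sigma/\xi$ for $\xi < 0$
PDF: $\frac{1}{\sigma}(1 + \xi z)^{-(1/\xi + 1)}$, where $z = \frac{x - \mu}{\sigma}$
CDF: $1 - (1 + \xi z)^{-1/\xi}$
Mean: $\mu + \frac{\sigma}{1 - \xi}$ for $\xi < 1$
Median: $\mu + \frac{\sigma (2^{\xi} - 1)}{\xi}$
Mode: $\mu$ for $\xi \geq -1$
Variance: $\frac{\sigma^2}{(1 - \xi)^2 (1 - 2\xi)}$ for $\xi < 1/2$
Skewness: $\frac{2 (1 + \xi) \sqrt{1 - 2\xi}}{1 - 3\xi}$ for $\xi < 1/3$
Ex. kurtosis: $\frac{3 (1 - 2\xi)(2\xi^2 + \xi + 3)}{(1 - 3\xi)(1 - 4\xi)} - 3$ for $\xi < 1/4$
Entropy: $\log(\sigma) + \xi + 1$
MGF: $e^{\theta\mu} \sum_{j=0}^{\infty} \left[ \frac{(\theta\sigma)^j}{\prod_{k=0}^{j} (1 - k\xi)} \right]$, for $k\xi < 1$
CF: $e^{it\mu} \sum_{j=0}^{\infty} \left[ \frac{(it\sigma)^j}{\prod_{k=0}^{j} (1 - k\xi)} \right]$, for $k\xi < 1$
Method of Moments: $\xi = \frac{1}{2} \left( 1 - \frac{(E[X] - \mu)^2}{V[X]} \right)$, $\sigma = (E[X] - \mu)(1 - \xi)$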

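Definition

The standard cumulative distribution function (cdf) of the GPD is defined by[5]

$$F_{\xi}(z) = \begin{cases} 1 - (1 + \xi z)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - e^{-z} & \text{for } \xi = 0, \end{cases}$$

where the support is $z \geq 0$ for $\xi \geq 0$ and $0 \leq z \leq -1/\xi$ for $\xi < 0$. The corresponding probability density function (pdf) is

$$f_{\xi}(z) = \begin{cases} (1 + \xi z)^{-(\xi + 1)/\xi} & \text{for } \xi \neq 0, \\ e^{-z} & \text{for } \xi = 0. \end{cases}$$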

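Characterization

The related location-scale family of distributions is obtained by replacing the argument $z$ by $\frac{x - \mu}{\sigma}$ and adjusting the support accordingly.

The cumulative distribution function of $X \sim \mathrm{GPD}(\mu, \sigma, \xi)$ ($\mu \in \mathbb{R}$, $\sigma > 0$, and $\xi \in \mathbb{R}$) is

$$F_{(\mu, \sigma, \xi)}(x) = \begin{cases} 1 - \left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - \exp\left(-\frac{x - \mu}{\sigma}\right) & \text{for } \xi = 0, \end{cases}$$

where the support of $X$ is $x \geq \mu$ when $\xi \geq 0$, and $\mu \leq x \leq \mu - \sigma/\xi$ when $\xi < 0$.

The probability density function (pdf) of $X \sim \mathrm{GPD}(\mu, \sigma, \xi)$ is

$$f_{(\mu, \sigma, \xi)}(x) = \frac{1}{\sigma} \left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-\left(\frac{1}{\xi} + 1\right)},$$

again, for $x \geq \mu$ when $\xi \geq 0$, and $\mu \leq x \leq \mu - \sigma/\xi$ when $\xi < 0$.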
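The cdf and pdf of the location-scale family can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `gpd_cdf` and `gpd_pdf` are hypothetical, the arguments `mu`, `sigma` and `xi` stand for the location, scale and shape parameters, and the value returned exactly at the upper endpoint for negative shape is a convention chosen here.

```python
import math


def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of GPD(mu, sigma, xi) evaluated at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0  # below the lower end of the support
    if xi == 0:
        return 1.0 - math.exp(-z)  # exponential limit
    if xi < 0 and z >= -1.0 / xi:
        return 1.0  # at or above the upper endpoint mu - sigma/xi
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)


def gpd_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """PDF of GPD(mu, sigma, xi) evaluated at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z < 0:
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    if xi < 0 and z >= -1.0 / xi:
        return 0.0  # convention used here at and beyond the upper endpoint
    return (1.0 + xi * z) ** (-1.0 / xi - 1.0) / sigma
```

For example, `gpd_cdf(2.0, mu=1.0, sigma=2.0, xi=0.5)` evaluates $1 - 1.25^{-2}$, which is approximately 0.36.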

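The pdf $f = f_{(\mu, \sigma, \xi)}$ is a solution of the following differential equation:[citation needed]

$$\left(\sigma + \xi (x - \mu)\right) f'(x) + (\xi + 1) f(x) = 0, \qquad f(\mu) = \frac{1}{\sigma}.$$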

Special cases

  • If the shape $\xi$ and location $\mu$ are both zero, the GPD is equivalent to the exponential distribution.
  • With shape $\xi > 0$ and location $\mu = \sigma/\xi$, the GPD is equivalent to the Pareto distribution with scale $x_m = \sigma/\xi$ and shape $\alpha = 1/\xi$.
  • If $X \sim \mathrm{GPD}(\mu = 0, \sigma, \xi)$, then $Y = \log(X) \sim \mathrm{exGPD}(\sigma, \xi)$ [1]. (exGPD stands for the exponentiated generalized Pareto distribution.)
  • GPD is similar to the Burr distribution.
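The Pareto special case can be checked directly from the survival function. The short derivation below is a sketch that writes $\mu$, $\sigma$, $\xi$ for the location, scale and shape of the GPD, takes $\xi > 0$ and $\mu = \sigma/\xi$, and uses the notation $x_m = \sigma/\xi$ and $\alpha = 1/\xi$ for the Pareto scale and shape.

```latex
\begin{aligned}
\Pr(X > x)
  &= \left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-1/\xi}
   = \left(1 + \frac{\xi x}{\sigma} - 1\right)^{-1/\xi} \\
  &= \left(\frac{x}{\sigma/\xi}\right)^{-1/\xi}
   = \left(\frac{x_m}{x}\right)^{\alpha},
  \qquad x \geq x_m .
\end{aligned}
```

The last expression is the survival function of a Pareto distribution with scale $x_m$ and shape $\alpha$.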

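Generating generalized Pareto random variables

Generating GPD random variables

If $U$ is uniformly distributed on (0, 1], then

$$X = \mu + \frac{\sigma (U^{-\xi} - 1)}{\xi} \sim \mathrm{GPD}(\mu, \sigma, \xi \neq 0)$$

and

$$X = \mu - \sigma \ln(U) \sim \mathrm{GPD}(\mu, \sigma, \xi = 0).$$

Both formulas are obtained by inversion of the cdf.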
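Inversion of the cdf translates directly into a sampler. The sketch below assumes the standard inverse-transform formulas $X = \mu + \sigma (U^{-\xi} - 1)/\xi$ for nonzero shape $\xi$ and $X = \mu - \sigma \ln U$ for $\xi = 0$, with location $\mu$ and scale $\sigma$. The function name `gpd_sample` is hypothetical, and only Python's standard `random` and `math` modules are used.

```python
import math
import random


def gpd_sample(mu, sigma, xi, rng=random):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1],
    # which keeps the logarithm and the negative power finite.
    u = 1.0 - rng.random()
    if xi == 0:
        return mu - sigma * math.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi


# Usage: a reproducible sample of size 5 with a seeded generator.
rng = random.Random(0)
sample = [gpd_sample(mu=1.0, sigma=2.0, xi=0.25, rng=rng) for _ in range(5)]
```

Passing a seeded `random.Random` instance keeps the draws reproducible without touching the global generator.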

In the MATLAB Statistics Toolbox, the "gprnd" command generates generalized Pareto random numbers.

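GPD as an Exponential-Gamma Mixture

A GPD random variable can also be expressed as an exponential random variable with a gamma-distributed rate parameter. If

$$X \mid \Lambda \sim \operatorname{Exp}(\Lambda)$$

and

$$\Lambda \sim \operatorname{Gamma}(\alpha, \beta),$$

where $\alpha$ is the shape and $\beta$ the rate of the gamma distribution, then

$$X \sim \operatorname{GPD}(\xi = 1/\alpha, \ \sigma = \beta/\alpha).$$

Notice, however, that since the parameters of the gamma distribution must be greater than zero, we obtain the additional restriction that $\xi$ must be positive.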
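The mixture representation can be simulated as a two-stage draw. The sketch below assumes the gamma distribution is parameterized by shape `alpha` and rate `beta`, so that the resulting GPD has location 0, shape 1/alpha and scale beta/alpha. The function name `gpd_via_gamma_mixture` is hypothetical. Python's `random.gammavariate` takes a shape and a scale, so the rate is inverted before the call.

```python
import random


def gpd_via_gamma_mixture(alpha, beta, rng=random):
    """Draw X | Lambda ~ Exp(Lambda) with Lambda ~ Gamma(shape=alpha, rate=beta)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    # gammavariate(shape, scale): a rate of beta corresponds to a scale of 1/beta.
    lam = rng.gammavariate(alpha, 1.0 / beta)
    # expovariate takes the rate of the exponential distribution.
    return rng.expovariate(lam)


# Usage: alpha = 4 and beta = 8 correspond to shape 0.25 and scale 2.
rng = random.Random(0)
draws = [gpd_via_gamma_mixture(4.0, 8.0, rng) for _ in range(5)]
```

Integrating out the rate gives the survival function $(1 + x/\beta)^{-\alpha}$, which matches the GPD survival function with shape $1/\alpha$ and scale $\beta/\alpha$.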

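Exponentiated generalized Pareto distribution

The exponentiated generalized Pareto distribution (exGPD)

[Figure: The pdf of the $\mathrm{exGPD}(\sigma, \xi)$ (exponentiated generalized Pareto distribution) for different values of $\sigma$ and $\xi$.]

If $X \sim \mathrm{GPD}(\mu = 0, \sigma, \xi)$, then $Y = \log(X)$ is distributed according to the exponentiated generalized Pareto distribution, denoted by $Y \sim \mathrm{exGPD}(\sigma, \xi)$.

The probability density function (pdf) of $Y \sim \mathrm{exGPD}(\sigma, \xi)$ ($\sigma > 0$) is

$$g_{(\sigma, \xi)}(y) = \begin{cases} \frac{e^y}{\sigma} \left(1 + \frac{\xi e^y}{\sigma}\right)^{-1/\xi - 1} & \text{for } \xi \neq 0, \\ \frac{1}{\sigma} e^{y - e^y/\sigma} & \text{for } \xi = 0, \end{cases}$$

where the support is $-\infty < y < \infty$ for $\xi \geq 0$, and $-\infty < y \leq \log(-\sigma/\xi)$ for $\xi < 0$.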
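The exGPD density follows from the GPD density by the change of variable $x = e^y$. The sketch below assumes the resulting form $g(y) = t (1 + \xi t)^{-1/\xi - 1}$ with $t = e^y/\sigma$ for nonzero shape $\xi$, and $g(y) = t e^{-t}$ for $\xi = 0$, where $\sigma$ is the scale. The function name `exgpd_pdf` is hypothetical.

```python
import math


def exgpd_pdf(y, sigma, xi):
    """PDF of Y = log(X), where X ~ GPD(mu=0, sigma, xi)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    t = math.exp(y) / sigma  # overflows for y above roughly 709
    if xi == 0:
        return t * math.exp(-t)
    if xi < 0 and t >= -1.0 / xi:
        return 0.0  # y is at or beyond the upper endpoint log(-sigma/xi)
    return t * (1.0 + xi * t) ** (-1.0 / xi - 1.0)
```

A quick sanity check is that the function integrates to approximately 1 over a wide interval of `y` for positive, zero and negative shape.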

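For all $\xi$, $\log \sigma$ becomes the location parameter. See the figure for the pdf when the shape $\xi$ is positive.

The exGPD has finite moments of all orders for all $\sigma > 0$ and $-\infty < \xi < \infty$.

[Figure: The variance of the $\mathrm{exGPD}(\sigma, \xi)$ as a function of $\xi$. The red dotted line corresponds to the value of variance ($\psi'(1) = \pi^2/6$) evaluated at $\xi = 0$.]

The moment-generating function of $Y \sim \mathrm{exGPD}(\sigma, \xi)$ is

$$M_Y(s) = E[e^{sY}] = \begin{cases} -\frac{1}{\xi} \left(-\frac{\sigma}{\xi}\right)^{s} B(s + 1, -1/\xi) & \text{for } s \in (-1, \infty), \ \xi < 0, \\ \frac{1}{\xi} \left(\frac{\sigma}{\xi}\right)^{s} B(s + 1, 1/\xi - s) & \text{for } s \in (-1, 1/\xi), \ \xi > 0, \\ \sigma^{s} \Gamma(1 + s) & \text{for } s \in (-1, \infty), \ \xi = 0, \end{cases}$$

where $B(a, b)$ and $\Gamma(a)$ denote the beta function and gamma function, respectively.

The variance of $Y \sim \mathrm{exGPD}(\sigma, \xi)$ depends on the shape parameter $\xi$ only through the polygamma function of order 1 (also called the trigamma function):

$$\operatorname{Var}[Y] = \begin{cases} \psi'(1) - \psi'(-1/\xi + 1) & \text{for } \xi < 0, \\ \psi'(1) + \psi'(1/\xi) & \text{for } \xi > 0, \\ \psi'(1) & \text{for } \xi = 0. \end{cases}$$

See the figure for the variance as a function of $\xi$. Note that $\psi'(1) = \pi^2/6 \approx 1.644934$.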
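The variance of the exGPD can be evaluated numerically once a trigamma function is available. Python's standard library does not provide one, so the sketch below implements a simple version for positive arguments using the recurrence $\psi'(x) = \psi'(x + 1) + 1/x^2$ followed by the usual asymptotic series. It assumes the variance is $\psi'(1) - \psi'(1 - 1/\xi)$ for shape $\xi < 0$, $\psi'(1) + \psi'(1/\xi)$ for $\xi > 0$, and $\psi'(1)$ for $\xi = 0$. The names `trigamma` and `exgpd_variance` are hypothetical.

```python
import math


def trigamma(x):
    """Trigamma function psi'(x) for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    # Shift the argument upward until the asymptotic series is accurate.
    while x < 6.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7) - 1/(30x^9)
    series = inv + 0.5 * inv2 + inv * inv2 * (
        1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 * (1.0 / 42.0 - inv2 / 30.0))
    )
    return total + series


def exgpd_variance(xi):
    """Variance of Y ~ exGPD(sigma, xi); it does not depend on sigma."""
    base = trigamma(1.0)  # equals pi^2 / 6
    if xi == 0:
        return base
    if xi > 0:
        return base + trigamma(1.0 / xi)
    return base - trigamma(1.0 - 1.0 / xi)
```

For shape $-1$ the GPD is uniform, and the formula gives $\psi'(1) - \psi'(2) = 1$, the variance of the logarithm of a uniform random variable.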

Note that the roles of the scale parameter $\sigma$ and the shape parameter $\xi$ under $Y \sim \mathrm{exGPD}(\sigma, \xi)$ are separately interpretable, which may lead to a more robust and efficient estimation of $\xi$ than working with $X \sim \mathrm{GPD}(\sigma, \xi)$ [2]. Under $X \sim \mathrm{GPD}(\mu = 0, \sigma, \xi)$ the roles of the two parameters are entangled with each other (at least up to the second central moment); see the formula for the variance $\operatorname{Var}(X)$, in which both parameters appear.

Hill's estimator

Assume that $X_{1:n} = (X_1, \cdots, X_n)$ are $n$ observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution $F$ whose tail distribution is regularly varying with tail index $1/\xi$ (hence, the corresponding shape parameter is $\xi$). Specifically, the tail distribution is described as

$$\bar{F}(x) = 1 - F(x) = L(x) \cdot x^{-1/\xi}, \quad \text{for some } \xi > 0, \text{ where } L \text{ is a slowly varying function.}$$

It is of particular interest in extreme value theory to estimate the shape parameter $\xi$, especially when $\xi$ is positive (the so-called heavy-tailed case).

Let $F_u$ be the conditional excess distribution function. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions $F$, and large $u$, $F_u$ is well approximated by the generalized Pareto distribution (GPD). This motivated peaks over threshold (POT) methods for estimating $\xi$: the GPD plays the key role in the POT approach.

A renowned estimator using the POT methodology is Hill's estimator. Its technical formulation is as follows. For $1 \leq i \leq n$, write $X_{(i)}$ for the $i$-th largest value of $X_1, \cdots, X_n$. Then, with this notation, Hill's estimator (see page 190 of Reference 5 by Embrechts et al [3]) based on the $k$ upper order statistics is defined as

$$\hat{\xi}_k^{\text{Hill}} = \hat{\xi}_k^{\text{Hill}}(X_{1:n}) = \frac{1}{k - 1} \sum_{j=1}^{k-1} \log\left(\frac{X_{(j)}}{X_{(k)}}\right), \quad \text{for } 2 \leq k \leq n.$$

In practice, the Hill estimator is used as follows. First, calculate the estimator $\hat{\xi}_k^{\text{Hill}}$ at each integer $k \in \{2, \cdots, n\}$, and then plot the ordered pairs $\{(k, \hat{\xi}_k^{\text{Hill}})\}_{k=2}^{n}$. Then, select from the set of Hill estimators $\{\hat{\xi}_k^{\text{Hill}}\}_{k=2}^{n}$ those which are roughly constant with respect to $k$: these stable values are regarded as reasonable estimates for the shape parameter $\xi$. If $X_1, \cdots, X_n$ are i.i.d., then Hill's estimator is a consistent estimator for the shape parameter $\xi$ [4].
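The procedure of computing the Hill estimator at every $k$ can be sketched as follows. The code assumes the convention in which the estimate at $k$ is the average of $\log(X_{(j)}/X_{(k)})$ over $j = 1, \dots, k-1$, where $X_{(j)}$ is the $j$-th largest observation, and it assumes strictly positive data. The function name `hill_estimates` is hypothetical.

```python
import math


def hill_estimates(data):
    """Return a dict mapping k (2 <= k <= n) to the Hill estimate based on
    the k upper order statistics of strictly positive observations."""
    xs = sorted(data, reverse=True)  # xs[j - 1] is the j-th largest value
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    if xs[-1] <= 0:
        raise ValueError("observations must be strictly positive")
    logs = [math.log(x) for x in xs]
    estimates = {}
    running = 0.0  # sum of the logs of the k - 1 largest observations
    for k in range(2, len(xs) + 1):
        running += logs[k - 2]
        # mean of log X_(j) for j < k, minus log X_(k)
        estimates[k] = running / (k - 1) - logs[k - 1]
    return estimates
```

Plotting the returned values against `k` gives the Hill plot described above; the running sum keeps the whole sweep over `k` linear in the sample size after the initial sort.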

Note that the Hill estimator $\hat{\xi}_k^{\text{Hill}}$ makes use of the log-transformation of the observations $X_{1:n} = (X_1, \cdots, X_n)$. (Pickands' estimator $\hat{\xi}_k^{\text{Pickands}}$ also employs the log-transformation, but in a slightly different way [5].)

References

  1. Coles, Stuart (2001-12-12). An Introduction to Statistical Modeling of Extreme Values. Springer. p. 75. ISBN 9781852334598.
  2. Dargahi-Noubary, G. R. (1989). "On tail estimation: An improved method". Mathematical Geology. 21 (8): 829–842. doi:10.1007/BF00894450. S2CID 122710961.
  3. Hosking, J. R. M.; Wallis, J. R. (1987). "Parameter and Quantile Estimation for the Generalized Pareto Distribution". Technometrics. 29 (3): 339–349. doi:10.2307/1269343. JSTOR 1269343.
  4. Davison, A. C. (1984-09-30). "Modelling Excesses over High Thresholds, with an Application". In de Oliveira, J. Tiago (ed.). Statistical Extremes and Applications. Kluwer. p. 462. ISBN 9789027718044.
  5. Embrechts, Paul; Klüppelberg, Claudia; Mikosch, Thomas (1997-01-01). Modelling extremal events for insurance and finance. p. 162. ISBN 9783540609315.
