User:Chefkokkie/sandbox/Total Percentile Error

This is not a Wikipedia article: It is an individual user's work-in-progress page, and may be incomplete and/or unreliable. For guidance on developing this draft, see Wikipedia:So you made a userspace draft.


In statistics, the Total Percentile Error (TPE) is a measure of the accuracy of forecasts. It addresses most of the shortcomings of other forecast error metrics. Its main distinction is that it measures not only the accuracy of the expected value of the forecast but also the accuracy of the entire probability distribution of forecasted values. This is especially important in scenarios where the tails (or extremes) of the probability distribution significantly affect the results of processes that consume the forecast. The downsides of the total percentile error compared to other accuracy metrics are that it requires more historical data and that its computation is more intensive.

Concept

The total percentile error expects a forecast to be provided as a probability distribution. For a statistical forecast this could be a mean together with a standard deviation describing the expected variability around that mean; for a stochastic forecast it could be a closed-form distribution with its parameters; more generally, for either type, it could be a set of cumulative distribution function (cdf) values. Since a closed form cannot be assumed to be generally available, the total output range of the cdf, [0, 1], is partitioned into percentile groups. In the simplest case these are groups of equal size, for example 4 quartiles. If the forecasted cdf perfectly matches the distribution of the actual data (i.e. no error), each percentile group of equal size contains an equal number of occurrences. Any difference between the number of actual samples in a percentile group and the forecasted number is a percentile error. The weighted average of the absolute values of these percentile errors across all percentile groups is the total percentile error. The total percentile error is flexible in that the weighting factor in this weighted average can be freely chosen, as can the number and sizes of the percentile groups.
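The counting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published definition: each forecast is assumed to be normal with a given mean and standard deviation, the groups are of equal size with equal weights, and the normalizing constant 2n(1 - 1/h) is an assumption chosen so that the worst case (all samples in one group) maps to 100%. The function names are hypothetical.

```python
import math

def normal_cdf(x, mean, std):
    """Cdf of a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def total_percentile_error(actuals, means, stds, h=4):
    """Equal-weight TPE sketch with h equal-sized percentile groups.

    Assumption: every forecast is normal(mean, std). The normalization
    2n(1 - 1/h) is chosen here so that the result lies in [0, 1].
    """
    n = len(actuals)
    counts = [0] * h
    for x, mu, sigma in zip(actuals, means, stds):
        p = normal_cdf(x, mu, sigma)       # forecasted percentile of the actual
        g = min(int(p * h), h - 1)         # index of the percentile group
        counts[g] += 1
    expected = n / h                       # forecasted count per group
    total_abs = sum(abs(c - expected) for c in counts)
    return total_abs / (2.0 * n * (1.0 - 1.0 / h))

# Four actuals that land in four different quartiles of their forecasts.
spread = total_percentile_error([-1.0, -0.3, 0.3, 1.0], [0.0] * 4, [1.0] * 4)
# Four actuals that all land in the same quartile.
clumped = total_percentile_error([0.0] * 4, [0.0] * 4, [1.0] * 4)
print(spread, clumped)
```

In this sketch the first call returns zero error because each quartile receives exactly one sample, while the second returns the maximum because all samples fall into a single quartile.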

Properties

The total percentile error has the following desirable properties:

Scale invariance: The total percentile error is independent of the scale of the data, so it can be used to compare forecasts across data sets with different scales.
Interpretability: It is easily interpreted, as values lie on a scale of 0% to 100%, where 0% indicates zero error and 100% indicates the maximum possible error. Unlike other known forecast accuracy metrics, 0% is realistically achievable.^[a] By comparison, with the mean absolute percentage error (MAPE) the realistically achievable accuracy is some arbitrary percentage, and 0% is not achievable except in trivial cases.
Robustness: It is a robust statistic across the entire spectrum of possible values. Other metrics, such as MAPE, mean absolute error (MAE), and root-mean-square error (RMSE), are not robust, being highly sensitive to distortion from outliers and from values at or near zero. Semi-robust metrics such as the median absolute deviation (MAD) and the interquartile range give undesirable results when more than half or more than a quarter of the measured values are zero, respectively, making them unusable for intermittent demand patterns.
Symmetry: It penalizes positive and negative forecast errors equally, and penalizes errors in large forecasts and small forecasts equally (unless parameters are chosen specifically to offset this symmetry). This symmetry differs from the traditional perspective in that it is symmetric with regard to the probability of a value occurring rather than with regard to the value itself. For example, a 1st percentile error has the same impact as a 99th percentile error. If the actual distribution of values is symmetric (such as a normal distribution), the two perspectives of symmetry coincide.
Completeness: It is unique in that it measures the error of the complete probability distribution of forecasted values, rather than just the error in the expected value as other published forecast accuracy metrics do. The latter is the equivalent of measuring only the accuracy of the mean of the probability distribution, whilst ignoring all other possible values.
Versatility: The total percentile error is the only known forecast accuracy metric that can be used on both traditional statistical forecasts and more sophisticated stochastic forecasts, allowing the accuracy of both types to be compared on an equal scale.

The total percentile error has two downsides compared to most other forecast accuracy metrics:

Historical data: It requires archiving of historical data of not just the expected values, but also the expected dispersion in such a way that it can be reconstituted for calculation. For statistical forecasts this may be the standard deviation or other single-value measure of dispersion. For closed-form stochastic forecasts this may be values of the distribution parameters. In general it may be one value for each percentile group of a prescribed partition of the total percentile error.
Computation: The total percentile error is more computationally intensive than other forecast accuracy metrics. It requires on the order of $h$ times as many basic computations as most common such metrics, where $h$ is the number of percentile groups into which a distribution is partitioned.

NOTE:

^[a] The lower end of the value range approaches 0% when the number of values is sufficiently large and the percentile group size is sufficiently small. For small sample sizes or large percentile groups the lower bound may be significantly larger than 0%.

Generalized total percentile error

The total percentile error is provided in two forms: a general form that may be customized to specific requirements and a proposed simplified standard form.

The total percentile error provides degrees of freedom through 4 parameters:

How to partition the percentile range. This is represented by parameter $h$ , which could be an integer indicating the number of percentile groups, all of uniform size. It could also be a specified set of sizes, or a shorthand for a commonly used partitioning. For the standard form, $h=6\sigma$ signifies a partition into 8 percentile groups: one group per standard deviation for 3 standard deviations both above and below the median, plus a group for each tail, using the cdf of the normal distribution.
A weighting function $w$ across the percentile groups, i.e. to weigh some groups more than others. The default is 1, indicating that all groups are weighted equally. When some percentile ranges are more important than others they can be given more weight through this parameter.
An indicator $t$ for the time granularity. Its values could be d, w, m, q or y for daily, weekly, monthly, quarterly or yearly granularity of the forecast. Naturally the more granular the forecast, the greater the error will be compared to measuring the same forecast at more aggregate levels. For example, error on a weekly forecast will be greater than error on the same forecast measured in monthly granularity.
An indicator $f$ for a weighting factor used to give more importance to forecasts of some samples versus others. For a simple count it could be n, for volume v, for units u, for cost c, for revenue r, etc.

With these notational parameters the generalized total percentile error is given by:^[1]

$$\mathrm{TPE}_{h,w,t,f} = \frac{\sum_{g=1}^{h} w_{g} \left| \sum_{i=1}^{n} \mu_{t,f,i} \left( l_{g} - \lambda_{g,i} \right) \right|}{2n \left( 1 - \frac{1}{h} \right) w \sum_{i=1}^{n} \mu_{t,f,i}}$$

where $w_{g}$ is the weight for percentile group $g$ , $w$ is the average of the weights, i.e. the sum of the $w_{g}$ divided by $h$ , with $h$ the number of percentile groups, and $l_{g}$ is the size of percentile group $g$ . If the groups are of equal size, all $l_{g}=1/h$ . Parameter $n$ is the number of samples of actual values measured, and $\mu _{t,f,i}$ is the mean of the forecasted distribution, expressed in the unit $f$ used to weigh each sample, for time period size $t$ . Finally,

$$\lambda_{g,i} = \frac{P(x_{i} \cap g)}{P(x_{i})}$$

is a factor determining a split where one sample $x_{i}$ could fall into multiple percentile groups. Its formula states that the value is equal to the probability that the sample falls in percentile group $g$ divided by the probability it could occur at all. The probabilities are determined by the forecasted distribution and thus the $\lambda _{g,i}$ can be calculated prior to any actual values being known.
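As an illustration of how one sample can be split across percentile groups, the following Python sketch computes the factors for a discrete forecast. It rests on assumptions that are not in the text above: the forecast is taken to be a Poisson distribution, and a discrete value $x$ is taken to occupy the cdf interval from $F(x-1)$ to $F(x)$, so that its share in a group is the overlap of that interval with the group divided by $P(x)$. The function names are hypothetical.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson distribution; zero for negative k."""
    if k < 0:
        return 0.0
    return sum(math.exp(-mean) * mean ** j / math.factorial(j)
               for j in range(k + 1))

def split_factors(x, mean, edges):
    """Share of sample value x that falls in each percentile group.

    edges are the cdf boundaries of the groups, e.g. [0, 0.25, 0.5, 0.75, 1].
    Assumption: value x covers the cdf interval (F(x-1), F(x)].
    """
    lo = poisson_cdf(x - 1, mean)
    hi = poisson_cdf(x, mean)
    mass = hi - lo                               # P(x) under the forecast
    factors = []
    for a, b in zip(edges[:-1], edges[1:]):
        overlap = max(0.0, min(hi, b) - max(lo, a))  # P(x and group g)
        factors.append(overlap / mass)
    return factors

# A zero observed against a Poisson forecast with mean 1, using quartiles:
# the value spans the first quartile and part of the second.
factors = split_factors(0, 1.0, [0.0, 0.25, 0.5, 0.75, 1.0])
print(factors)
```

Because the factors depend only on the forecasted distribution and the partition, they sum to 1 for every possible value and can be tabulated before any actual value is observed.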

Standard total percentile error

To allow the accuracy of stochastic forecasts to be evaluated against the accuracy of statistical forecasts, a naive probability distribution is assumed for the latter, since a statistical forecast does not explicitly state one. This distribution is a normal distribution characterized by its mean and standard deviation. These two parameters are typically known, and error residuals are commonly assumed to follow this distribution, making it a logical choice. Note that the mean in this context is not the mean across multiple time periods or across multiple time series, but rather the mean of the probability distribution for one specific time period of one specific time series. For a statistical forecast this equals the forecasted value for that period and series.

For the standard total percentile error the partition of percentiles is bounded by the values of the cdf $F_{N}(x)$ of the normal distribution ${\mathcal {N}}(\mu ,\,\sigma ^{2})$ for 3 standard deviations $\sigma$ below and above the mean $\mu$ plus the tails on both ends:

$n$	$F_{N}(\mu +n\sigma )$
$-\infty$	0
-3	0.00135
-2	0.02275
-1	0.15866
0	0.5
1	0.84134
2	0.97725
3	0.99865
$\infty$	1
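The boundary values in the table can be reproduced from the standard normal cdf, which Python's standard library exposes through the error function. The sketch below is illustrative only; the function names are hypothetical, and the group sizes it derives are simply the differences between adjacent boundaries.

```python
import math

def standard_normal_cdf(z):
    """Cdf of the standard normal distribution at z standard deviations."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def standard_partition():
    """Boundaries and sizes of the 8 groups of the 6-sigma partition."""
    # Boundaries at -3..3 standard deviations, plus 0 and 1 for the tails.
    boundaries = ([0.0]
                  + [standard_normal_cdf(z) for z in range(-3, 4)]
                  + [1.0])
    sizes = [b - a for a, b in zip(boundaries[:-1], boundaries[1:])]
    return boundaries, sizes

boundaries, sizes = standard_partition()
for b in boundaries:
    print(round(b, 5))
```

Under this partition the two tail groups are by far the smallest and the two groups adjacent to the mean are the largest, so the group sizes $l_{g}$ are not equal to $1/h$ in the standard form.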

To Do ^[2]

Applications

To do

This scale-free error metric can be used to compare forecast methods on a single series and also to compare forecast accuracy between series. It is well suited to intermittent-demand series because it never gives infinite or undefined values except in the irrelevant case where all historical data are equal.

When comparing forecasting methods, the method with the lowest TPE is the preferred method.

References

^[1] De Kok, Stefan B. (2015). "Measure Total Percentile Error to See the Big Picture of Forecast Accuracy". LinkedIn Pulse.
^[2] De Kok, Stefan B. (2015). "Stochastic Value Add: How Much Value Are You Really Adding to the Forecast?". LinkedIn Pulse.

