# Root-mean-square deviation


The root-mean-square deviation (RMSD) or root-mean-square error (RMSE) (or sometimes root-mean-squared error) is a frequently used measure of the differences between values (sample and population values) predicted by a model or an estimator and the values actually observed. The RMSD represents the sample standard deviation of the differences between predicted values and observed values. These individual differences are called residuals when the calculations are performed over the data sample that was used for estimation, and are called prediction errors when computed out-of-sample. The RMSD serves to aggregate the magnitudes of the errors in predictions for various times into a single measure of predictive power. RMSD is a measure of accuracy, used to compare the forecasting errors of different models for a particular dataset; because it is scale-dependent, it is not suitable for comparisons between datasets.[1]

RMSD is the square root of the average of squared errors. The effect of each error on RMSD is proportional to the size of the squared error; thus larger errors have a disproportionately large effect on RMSD. Consequently, RMSD is sensitive to outliers.[2][3]

## Formula

The RMSD of an estimator ${\displaystyle {\hat {\theta }}}$  with respect to an estimated parameter ${\displaystyle \theta }$  is defined as the square root of the mean square error:

${\displaystyle \operatorname {RMSD} ({\hat {\theta }})={\sqrt {\operatorname {MSE} ({\hat {\theta }})}}={\sqrt {\operatorname {E} (({\hat {\theta }}-\theta )^{2})}}.}$

For an unbiased estimator, the RMSD is the square root of the variance, known as the standard deviation.

The RMSD of predicted values ${\displaystyle {\hat {y}}_{t}}$  for times t of a regression's dependent variable ${\displaystyle y_{t}}$  is computed for n different predictions as the square root of the mean of the squares of the deviations:

${\displaystyle \operatorname {RMSD} ={\sqrt {\frac {\sum _{t=1}^{n}({\hat {y}}_{t}-y_{t})^{2}}{n}}}.}$

Similarly, the RMSD of predicted values ${\displaystyle {\hat {y}}_{i}}$  for observations i of a regression's dependent variable ${\displaystyle y_{i}}$  is computed for n different predictions as the square root of the mean of the squares of the deviations:

${\displaystyle \operatorname {RMSD} ={\sqrt {\frac {\sum _{i=1}^{n}({\hat {y}}_{i}-y_{i})^{2}}{n}}}.}$
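The formula above translates directly into a few lines of code. The following sketch (the function name and argument order are illustrative, not from any standard library) computes the RMSD of a list of predictions against a list of observations:

```python
import math

def rmsd(predicted, observed):
    """Root-mean-square deviation: the square root of the mean of
    the squared differences between predictions and observations."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Deviations are 1, 0, and -2, so RMSD = sqrt((1 + 0 + 4) / 3)
print(rmsd([2.0, 3.0, 4.0], [1.0, 3.0, 6.0]))
```

Because the formula is symmetric in its two arguments, the same function also computes the RMSD between two time series, as in the next section.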

In some disciplines, the RMSD is used to compare differences between two things that may vary, neither of which is accepted as the "standard". For example, when measuring the average difference between two time series ${\displaystyle x_{1,t}}$  and ${\displaystyle x_{2,t}}$ , the formula becomes

${\displaystyle \operatorname {RMSD} ={\sqrt {\frac {\sum _{t=1}^{n}(x_{1,t}-x_{2,t})^{2}}{n}}}.}$

## Normalized root-mean-square deviation

Normalizing the RMSD facilitates the comparison between datasets or models with different scales. Though there is no consistent means of normalization in the literature, common choices are the mean or the range (defined as the maximum value minus the minimum value) of the measured data:[4]

${\displaystyle \mathrm {NRMSD} ={\frac {\mathrm {RMSD} }{y_{\max }-y_{\min }}}}$  or ${\displaystyle \mathrm {NRMSD} ={\frac {\mathrm {RMSD} }{\bar {y}}}}$ .

This value is commonly referred to as the normalized root-mean-square deviation or error (NRMSD or NRMSE), and often expressed as a percentage, where lower values indicate less residual variance. In many cases, especially for smaller samples, the sample range is likely to be affected by the size of sample which would hamper comparisons.
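The two normalizations can be sketched as follows (helper names are illustrative; `rmsd_value` is assumed to have been computed as in the previous section):

```python
def nrmsd_range(rmsd_value, observed):
    """Normalize RMSD by the range of the observed data."""
    return rmsd_value / (max(observed) - min(observed))

def nrmsd_mean(rmsd_value, observed):
    """Normalize RMSD by the mean of the observed data."""
    return rmsd_value / (sum(observed) / len(observed))

observed = [1.0, 2.0, 3.0, 4.0, 5.0]
print(nrmsd_range(1.0, observed))  # divides by the range, 4.0
print(nrmsd_mean(1.0, observed))   # divides by the mean, 3.0
```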

Another way to make the RMSD a more useful comparison measure is to divide it by the interquartile range (IQR). Normalizing by the IQR makes the resulting value less sensitive to extreme values in the target variable.

${\displaystyle \mathrm {RMSDIQR} ={\frac {\mathrm {RMSD} }{IQR}}}$  where ${\displaystyle IQR=Q_{3}-Q_{1}}$

with ${\displaystyle Q_{1}={\text{CDF}}^{-1}(0.25)}$  and ${\displaystyle Q_{3}={\text{CDF}}^{-1}(0.75),}$  where CDF−1 is the quantile function.
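A minimal sketch of the IQR normalization, using the standard library's `statistics.quantiles` to estimate the sample quartiles (the function name `rmsd_iqr` is illustrative):

```python
import statistics

def rmsd_iqr(rmsd_value, observed):
    """Normalize RMSD by the interquartile range of the observed data."""
    # quantiles(..., n=4) returns the three quartile cut points [Q1, Q2, Q3].
    q1, _, q3 = statistics.quantiles(observed, n=4)
    return rmsd_value / (q3 - q1)

observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(rmsd_iqr(1.0, observed))
```

Note that different quantile estimation methods (here, the default "exclusive" method) give slightly different Q1 and Q3 on small samples.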

When normalizing by the mean value of the measurements, the term coefficient of variation of the RMSD, CV(RMSD), may be used to avoid ambiguity.[5] This is analogous to the coefficient of variation, with the RMSD taking the place of the standard deviation.

${\displaystyle \mathrm {CV(RMSD)} ={\frac {\mathrm {RMSD} }{\bar {y}}}}$

## Related measures

Some researchers have recommended the use of the mean absolute error (MAE) instead of the root-mean-square deviation. The MAE, the average of the absolute differences between two variables X and Y, has advantages in interpretability: it is fundamentally easier to understand than the square root of the average of squared errors. Furthermore, each error influences the MAE in direct proportion to its absolute value, which is not the case for the RMSD.[2]
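The difference in outlier sensitivity can be illustrated with a small sketch (function names are illustrative). Two sets of predictions with the same total absolute error yield the same MAE, but the set that concentrates the error in one outlier yields a larger RMSD:

```python
import math

def mae(predicted, observed):
    """Mean absolute error: the average of the absolute differences."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

def rmsd(predicted, observed):
    """Root-mean-square deviation: the square root of the mean squared error."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

observed = [0.0, 0.0, 0.0, 0.0]
pred_even = [2.0, 2.0, 2.0, 2.0]     # four errors of 2
pred_outlier = [0.0, 0.0, 0.0, 8.0]  # one error of 8, same total absolute error

print(mae(observed, pred_even), rmsd(observed, pred_even))        # 2.0 2.0
print(mae(observed, pred_outlier), rmsd(observed, pred_outlier))  # 2.0 4.0
```

The MAE is 2.0 in both cases, while the RMSD doubles for the outlier case because squaring weights the large error disproportionately.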