A number characterising the quality of a model prediction (goodness of fit):
R^2 = 1 - \frac{MSE(x, \hat x)}{MSE(x, \bar x)}
where
Symbol | Meaning |
---|---|
x = \{ x_1, \, x_2, \, x_3, \dots, x_N \} | observed variable, represented by a discrete data set of N numerical samples |
\hat x = \{ \hat x_1, \, \hat x_2, \, \hat x_3, \dots, \hat x_N \} | predictor of the variable x, represented by another discrete data set with the same number of samples N, predicted at the same conditions as the original samples \{ x_1, \, x_2, \, x_3, \dots, x_N \} |
\bar x | mean value of the variable x, which can be regarded as a trivial constant predictor with zero variability |
MSE(x, \hat x) | mean square error between the variable x and its predictor \hat x |
MSE(x, \bar x) | mean square error between the variable x and its mean value \bar x |
It is similar to the Mean Square Error (MSE) but quantifies the model prediction efficiency in a normalized way, which is sometimes more suitable for computations.
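The normalization can be made explicit by expanding both error terms. The expansion below is a sketch that assumes the standard definition of mean square error, which is not spelled out in the table above; under that assumption the 1/N factors cancel in the ratio.

```latex
MSE(x, \hat x) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat x_i)^2

MSE(x, \bar x) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar x)^2,
\qquad \bar x = \frac{1}{N} \sum_{i=1}^{N} x_i

R^2 = 1 - \frac{\sum_{i=1}^{N} (x_i - \hat x_i)^2}{\sum_{i=1}^{N} (x_i - \bar x)^2}
```

Written this way, MSE(x, \bar x) is the variance of the observed samples around their mean, so R^2 measures the prediction error relative to that variance.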
The coefficient of determination R^2 normally ranges between:
- 0, indicating that the prediction error equals the variance of the observed variable around its mean value, so the model performs no better than the mean predictor \bar x
and
- 1, indicating a perfect fit that fully reproduces the variability of x (values close to 1 indicate a fine fit)
Negative R^2 values, which fall outside the above range, indicate a substantial mismatch between the variable x and the model prediction \hat x: the mean square gap between predicted and actual values is larger than the variance of the actual data. Values above 1 cannot occur, because MSE(x, \hat x) is never negative.
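The three regimes described above can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard definition of mean square error; the function names `mse` and `r_squared` and the sample data are illustrative and do not come from any particular library.

```python
def mse(x, y):
    """Mean square error between two equally long sample sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def r_squared(x, x_hat):
    """Coefficient of determination: 1 - MSE(x, x_hat) / MSE(x, x_bar)."""
    x_bar = sum(x) / len(x)
    # MSE against the constant mean predictor, i.e. the variance of x
    baseline = mse(x, [x_bar] * len(x))
    if baseline == 0:
        raise ValueError("R^2 is undefined when all observed samples are equal")
    return 1 - mse(x, x_hat) / baseline


# Hypothetical observed samples
observed = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = r_squared(observed, observed)                      # exact prediction
mean_only = r_squared(observed, [3.0] * 5)                   # mean used as predictor
poor = r_squared(observed, [5.0, 4.0, 3.0, 2.0, 1.0])        # reversed trend

print(perfect, mean_only, poor)  # 1.0 0.0 -3.0
```

The exact prediction gives R^2 = 1, the mean predictor gives R^2 = 0, and the reversed trend gives a negative value because its mean square error (8) is four times the variance of the observed data (2).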