A number characterising the quality of a model prediction (goodness of fit):
R^2 = 1 - \frac{RMSD(x, \hat x)^2}{MSE(x, \bar x)}, \quad R^2 \leq 1 |
where
x = \{ x_1, \, x_2, \, x_3, \, \dots, \, x_N \} | a variable represented by a discrete data set of numerical samples |
---|---|
\hat x = \{ \hat x_1, \, \hat x_2, \, \hat x_3, \, \dots, \, \hat x_N \} | predictor of variable x, represented by another discrete data set of numerical samples, with the same number of samples N, predicted under the same conditions as the original samples \{ x_1, \, x_2, \, x_3, \, \dots, \, x_N \} |
\bar x | mean value of the variable x, which can be regarded as a limiting-case predictor with zero variability |
RMSD(x, \hat x) | Root-Mean-Square Deviation between a variable x and its predictor \hat x |
MSE(x, \bar x) = RMSD(x, \bar x)^2 | mean square error between a variable x and its mean value \bar x |
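
Written out sample by sample, the quantities in the table take the following form. This is a sketch that assumes the standard definitions of root-mean-square deviation and mean square error (with a 1/N normalisation), which the text does not spell out explicitly:

```latex
\bar x = \frac{1}{N} \sum_{i=1}^{N} x_i

RMSD(x, \hat x) = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat x_i)^2 }

MSE(x, \bar x) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar x)^2

R^2 = 1 - \frac{ \sum_{i=1}^{N} (x_i - \hat x_i)^2 }{ \sum_{i=1}^{N} (x_i - \bar x)^2 }
```

The 1/N factors cancel in the ratio, so R^2 compares the squared prediction error of the model with the squared deviation of the samples from their mean.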
The coefficient of determination R^2 normally ranges between:
- 0, indicating a coarse fit, with the prediction tending to the mean value \bar x
and
- 1, indicating a fine fit that closely reproduces the variability of x
Negative R^2 values indicate a substantial mismatch between variable x and model prediction \hat x: the model fits the samples worse than the mean value \bar x does.
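
The three regimes described above (a fine fit, a fit no better than the mean, and a negative value) can be illustrated with a minimal Python sketch. The function name `r_squared` and the sample values are hypothetical, chosen only for illustration, and the code assumes the standard 1/N definition of mean square error:

```python
def r_squared(x, x_hat):
    """Coefficient of determination of predictions x_hat against samples x."""
    if len(x) != len(x_hat):
        raise ValueError("x and x_hat must have the same number of samples")
    if len(x) == 0:
        raise ValueError("at least one sample is required")

    n = len(x)
    x_bar = sum(x) / n  # mean value of the samples

    # Mean square error of the model prediction (RMSD squared).
    mse_model = sum((xi - xhi) ** 2 for xi, xhi in zip(x, x_hat)) / n
    # Mean square error of the mean-value predictor.
    mse_mean = sum((xi - x_bar) ** 2 for xi in x) / n

    if mse_mean == 0:
        raise ValueError("R^2 is undefined when all samples of x are equal")
    return 1.0 - mse_model / mse_mean


samples = [1.0, 2.0, 3.0, 4.0, 5.0]

print(r_squared(samples, samples))                    # exact prediction: 1.0
print(r_squared(samples, [3.0] * 5))                  # mean-value predictor: 0.0
print(r_squared(samples, [5.0, 4.0, 3.0, 2.0, 1.0]))  # reversed trend: -3.0
```

The sketch raises an error for constant samples, since the denominator is then zero and R^2 is undefined.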