A real number characterising the prediction quality (goodness of fit) of a real-valued model:
R^2 = 1 - \frac{MSD(x, \hat x)}{MSD(x, \bar x)} = 1 - \frac{\sum_i (x_i - \hat x_i)^2}{\sum_i (x_i - \bar x)^2}
where
| x | observed variable, represented by a discrete data set of numerical samples x_i |
| \hat x | predictor of the variable x, represented by another discrete data set of numerical samples \hat x_i, with the same number of samples, predicted at the same conditions as the original samples |
| \bar x | mean value of the variable x, which can be considered an extreme predictor with zero variability |
| MSD(x, \hat x) | mean square deviation between the variable x and its predictor \hat x |
| MSD(x, \bar x) | mean square deviation between the variable x and its mean value \bar x |
It is similar to the Mean Square Deviation (MSD) but quantifies the model prediction quality in a normalized, dimensionless form, which is sometimes more suitable for computations.
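The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `mean_square_deviation` and `r_squared` and the sample values are assumptions made for this example, and the case of a constant observed variable (zero denominator) is simply rejected.

```python
def mean_square_deviation(x, y):
    """Mean square deviation between two equally sized samples."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def r_squared(x, x_hat):
    """Coefficient of determination: 1 - MSD(x, x_hat) / MSD(x, x_bar)."""
    x_bar = sum(x) / len(x)  # mean value of the observed variable
    msd_model = mean_square_deviation(x, x_hat)
    msd_mean = mean_square_deviation(x, [x_bar] * len(x))
    if msd_mean == 0:
        # All observed values are equal, so the ratio is undefined.
        raise ValueError("R^2 is undefined for a constant observed variable")
    return 1.0 - msd_model / msd_mean


# Illustrative (made-up) samples.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print(r_squared(observed, predicted))  # approximately 0.9
```

Here MSD(x, \hat x) = 0.125 and MSD(x, \bar x) = 1.25, so R^2 = 1 - 0.1 = 0.9. The factor 1/N cancels in the ratio, which is why the formula can equally be written with plain sums.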
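The coefficient of determination normally ranges between R^2 = 0 (the model predicts no better than the mean value \bar x) and R^2 = 1 (the model reproduces the observed samples exactly).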
Values falling outside this range (R^2 < 0) indicate a substantial mismatch between the variable and the model prediction: the mean square deviation between predicted and actual values is larger than the variance of the actual data.
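The limiting cases can be checked numerically. The sketch below uses a compact, hypothetical `r_squared` helper and made-up samples: a perfect predictor, the mean-value predictor, and a predictor with the trend reversed.

```python
def r_squared(x, x_hat):
    """R^2 from sums of squared deviations (the 1/N factors cancel)."""
    x_bar = sum(x) / len(x)
    ss_model = sum((xi - pi) ** 2 for xi, pi in zip(x, x_hat))
    ss_mean = sum((xi - x_bar) ** 2 for xi in x)
    return 1.0 - ss_model / ss_mean


observed = [1.0, 2.0, 3.0, 4.0]

perfect = list(observed)                # reproduces every sample
mean_only = [2.5, 2.5, 2.5, 2.5]        # predicts the mean value everywhere
reversed_trend = [4.0, 3.0, 2.0, 1.0]   # worse than predicting the mean

print(r_squared(observed, perfect))         # 1.0
print(r_squared(observed, mean_only))       # 0.0
print(r_squared(observed, reversed_trend))  # -3.0
```

In the last case the sum of squared prediction errors (20) is four times the sum of squared deviations from the mean (5), which gives the negative value.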