
@wikipedia


The coefficient of determination is a real number characterising the prediction quality (goodness of fit) of a real-valued model:

R^2 = 1 - \frac{MSD(x, \hat x)}{MSD(x, \bar x)} = 1 - \frac{\sum_i (x_i -\hat x_i)^2}{\sum_i (x_i -\bar x)^2}

where 

x = \{ x_1, \, x_2, \, x_3 , ... x_N \}

observed  variable represented by a discrete dataset of numerical samples

\hat x = \{ \hat x_1, \, \hat x_2, \, \hat x_3 , ... \hat x_N \}

predictor of variable  x, represented by another discrete dataset  of numerical samples,

with the same number of samples  N, predicted under the same conditions as the original samples  \{ x_1, \, x_2, \, x_3 , ... x_N \}

\bar x = \frac{1}{N} \sum_i x_i

mean value of the variable  x, which can be regarded as a baseline predictor with zero variability

MSD(x, \hat x)

mean square deviation between a variable  x and its predictor  \hat x

MSD(x, \bar x)

mean square deviation between a variable  x and its mean value  \bar x
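The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names msd and r_squared and the sample values are assumptions made for this example, and MSD is taken to be the sum of squared differences divided by N.

```python
def msd(a, b):
    """Mean square deviation between two equal-length sequences (assumed: sum of squares / N)."""
    if len(a) != len(b):
        raise ValueError("both datasets must have the same number of samples N")
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def r_squared(x, x_hat):
    """R^2 = 1 - MSD(x, x_hat) / MSD(x, x_bar)."""
    x_bar = sum(x) / len(x)              # mean value of the observed variable
    baseline = msd(x, [x_bar] * len(x))  # MSD(x, x_bar): deviation from the mean
    if baseline == 0:
        raise ValueError("R^2 is undefined when all observed samples are equal")
    return 1.0 - msd(x, x_hat) / baseline


# Hypothetical observed samples and predictions, made up for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
x_hat = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(x, x_hat)
print(f"R^2 = {r2:.4f}")  # prints "R^2 = 0.9900"
```

Here MSD(x, \hat x) = 0.02 and MSD(x, \bar x) = 2, so R^2 = 1 - 0.02/2 = 0.99.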


R^2 is similar to Mean Square Deviation (MSD) but quantifies the model prediction efficiency in a normalized way, which is usually more suitable for assessing goodness of fit.
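One way to see what normalization means here is to rescale the data. The constants a \neq 0 and b below are hypothetical and introduced only for this illustration, and MSD is assumed to be the sum of squared differences divided by N. Applying the same linear transformation to the variable and to its predictor changes MSD by a factor of a^2 but leaves R^2 unchanged:

```latex
x'_i = a x_i + b, \quad \hat x'_i = a \hat x_i + b, \quad \bar x' = a \bar x + b

MSD(x', \hat x') = \frac{1}{N} \sum_i (a x_i - a \hat x_i)^2 = a^2 \, MSD(x, \hat x)

MSD(x', \bar x') = \frac{1}{N} \sum_i (a x_i - a \bar x)^2 = a^2 \, MSD(x, \bar x)

R'^2 = 1 - \frac{a^2 \, MSD(x, \hat x)}{a^2 \, MSD(x, \bar x)} = R^2
```

For example, converting x from metres to millimetres (a = 1000, b = 0) multiplies MSD by 10^6, while R^2 stays the same.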


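The coefficient of determination   R^2 normally ranges between:

  • 0, indicating that the prediction error equals the variance of the observed variable around its mean value, so the model predicts no better than the mean value  \bar x

and

  • 1, indicating a perfect fit that reproduces every sample of  x exactly; values close to 1 indicate a fine fit that fairly reproduces the variability of  x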


R^2 values below 0 indicate a substantial mismatch between the variable  x and the model prediction  \hat x: the mean square deviation between predicted and actual values exceeds the variance of the actual data, so the model predicts worse than the mean value  \bar x. Values above 1 cannot occur, because  MSD(x, \hat x) is never negative.
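The three characteristic cases can be checked numerically. The sketch below uses the sum-of-squares form of the definition; the helper name r_squared and the sample data are assumptions chosen for this illustration only.

```python
def r_squared(x, x_hat):
    """R^2 = 1 - sum_i (x_i - x_hat_i)^2 / sum_i (x_i - x_bar)^2."""
    x_bar = sum(x) / len(x)
    ss_res = sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat))  # prediction error
    ss_tot = sum((xi - x_bar) ** 2 for xi in x)               # spread around the mean
    return 1.0 - ss_res / ss_tot


# Hypothetical observed samples, made up for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = r_squared(x, x)                            # predictor equal to the data
mean_only = r_squared(x, [3.0] * 5)                  # predictor equal to the mean value
reversed_fit = r_squared(x, [5.0, 4.0, 3.0, 2.0, 1.0])  # predictor with the opposite trend

print(perfect)       # prints 1.0
print(mean_only)     # prints 0.0
print(reversed_fit)  # prints -3.0
```

The last predictor has a prediction error four times larger than the spread of the data around its mean, which gives R^2 = 1 - 4 = -3.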

See also


Formal science / Mathematics / Statistics / Statistical Metric 



