For sample sets of two statistical variables x and y:
\rho_P(x,y) = \frac{{\rm cov}(x,y)}{\sigma(x) \, \sigma(y)} = \frac{\sum\limits_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum\limits_{i=1}^n (x_i - \bar x)^2} \cdot \sqrt{\sum\limits_{i=1}^n (y_i - \bar y)^2}}
where
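\bar x = \frac{1}{n} \sum\limits_{i=1}^n x_i | sample mean of the x-variable |
\bar y = \frac{1}{n} \sum\limits_{i=1}^n y_i | sample mean of the y-variable |
x = \{ x_i \} and y = \{ y_i \}, \ i = 1..n | finite arrays of x-variable and y-variable values |
{\rm cov}(x,y) | covariance between the x-variable and the y-variable |
\sigma(x), \sigma(y) | standard deviations of the x-variable and the y-variable |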
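The definition above can be evaluated directly from the two arrays. The following is a minimal Python sketch, not a reference implementation: the function name `pearson` and the choice to raise an error for constant input (where the denominator is zero and the coefficient is undefined) are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper written for illustration; it follows the
    sum-of-deviations form of the formula, so the 1/n factors cancel.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two samples are required")

    # Sample means of the x-variable and the y-variable.
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Deviations of each sample from its mean.
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]

    # Numerator: sum of products of deviations (proportional to cov(x, y)).
    numerator = sum(a * b for a, b in zip(dx, dy))

    # Denominator: product of the root sums of squared deviations.
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    if denominator == 0:
        # Assumption: treat a constant variable as an error rather than return NaN.
        raise ValueError("correlation is undefined when a variable is constant")

    return numerator / denominator
```

For example, `pearson([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])` has a numerator of 8 and a denominator of 10, giving 0.8.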
The Pearson correlation coefficient ranges between -1 and 1 and indicates how accurately the relationship between the two variables can be approximated by a linear dependence:
y_i = a \, x_i + b, \quad \forall \, i = 1..n
for a suitable choice of the coefficients a and b.
Fig. 1. Highly correlated variables | Fig. 2. Poorly correlated variables | Fig. 3. Highly anti-correlated variables |
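The link between the coefficient and the quality of a linear approximation can be checked numerically. The sketch below assumes that a and b are chosen by ordinary least squares, which the text does not specify; the function name `linear_fit_and_correlation` is likewise hypothetical. Under that assumption, the fraction of the variance of y left unexplained by the fitted line equals 1 - \rho_P^2, so it is zero for a coefficient of exactly 1 or -1.

```python
import math


def linear_fit_and_correlation(x, y):
    """Return (a, b, rho, unexplained) for the line y = a*x + b.

    Assumption: a and b are the ordinary least-squares coefficients.
    `unexplained` is the residual sum of squares divided by the total
    sum of squares of y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squared deviations and of cross products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Least-squares slope and intercept.
    a = sxy / sxx
    b = mean_y - a * mean_x

    # Pearson correlation coefficient from the same sums.
    rho = sxy / math.sqrt(sxx * syy)

    # Share of the variance of y that the line does not explain.
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    unexplained = ss_res / syy

    return a, b, rho, unexplained


# Exactly linear, decreasing data: coefficient of -1 (up to rounding), no residual.
print(linear_fit_and_correlation([0, 1, 2, 3], [7, 5, 3, 1]))

# Scattered data: coefficient of 0.8, so 1 - 0.8**2 = 0.36 of the variance is unexplained.
print(linear_fit_and_correlation([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

The sketch requires both variables to be non-constant; otherwise `sxx` or `syy` is zero and the divisions fail.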
[ Statistical correlation metrics @ review ] [ Spearman correlation ] [ Kendall correlation ] [ Fechner correlation ]