In some sense, this whole course/quarter will be about the notion of correlation, which is why I want to spend some time reminding us all of what that is/means.
Definition: Two variables A and B are correlated if knowing the value/range of A makes it easier to ‘predict’ (better) the value/range of B than if one doesn’t know the value/range of A; and, likewise, knowing the value/range of B makes it easier to ‘predict’ (better) the value/range of A than if one doesn’t know the value/range of B.

Here is an example where knowing A (or B) does not help ‘predicting’ B (or A):
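(The simulation code behind this output isn’t shown here; a minimal sketch that would produce data of this kind, assuming A and B are two independent uniform samples of size 100 (the seed is hypothetical), could look like this:)

# Sketch, not the original code: two independent uniform samples,
# so A carries no information about B (and vice versa)
set.seed(1)        # hypothetical seed, just for reproducibility
A <- runif(100)    # 100 draws from Uniform(0, 1)
B <- runif(100)    # 100 more draws, generated independently of A
cor.test(A, B)     # Pearson correlation test (output below)
summary(lm(B ~ A)) # linear regression of B on A (output below)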
##
## Pearson's product-moment correlation
##
## data: A and B
## t = 0.16863, df = 98, p-value = 0.8664
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1799881 0.2127386
## sample estimates:
## cor
## 0.01703215
##
## Call:
## lm(formula = B ~ A)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.50405 -0.23895 0.00904 0.21547 0.46846
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.50851 0.05974 8.512 2.03e-13 ***
## A 0.01730 0.10260 0.169 0.866
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2732 on 98 degrees of freedom
## Multiple R-squared: 0.0002901, Adjusted R-squared: -0.009911
## F-statistic: 0.02844 on 1 and 98 DF, p-value: 0.8664
Why does knowing A (or B) not help ‘predicting’ B (or A)? Because, for instance, no matter which value range of A you pick, you can’t predict B very well (and vice versa):
And we can exemplify that easily with a regression line as well:
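(The scatterplots themselves aren’t reproduced here, but a sketch of how one might draw such a plot with its regression line in base R:)

# Sketch: scatterplot of the (A, B) pairs with the least-squares line
plot(A, B, pch = 16)            # one point per (A, B) pair
abline(lm(B ~ A), col = "red")  # overlay the fitted regression line

For the uncorrelated data, that fitted line is essentially flat: the slope estimate in the output above is only 0.0173.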
By contrast, here is an example where knowing A (or B) does help ‘predicting’ B (or A):
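(Again, the simulation code is not shown; a sketch that yields output like the one below, assuming B is just A plus a little uniform noise, might be:)

# Sketch, not the original code (assumption: B = A + small noise)
A <- runif(100)                          # as before
B <- A + runif(100, min = 0, max = 0.2)  # B tracks A closely
cor.test(A, B)                           # correlation near 1 (output below)
summary(lm(B ~ A))                       # slope near 1, tiny residuals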
##
## Pearson's product-moment correlation
##
## data: A and B
## t = 48.903, df = 98, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.9705438 0.9866034
## sample estimates:
## cor
## 0.9801194
##
## Call:
## lm(formula = B ~ A)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.100809 -0.047789 0.001807 0.043093 0.093692
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.10170 0.01195 8.512 2.03e-13 ***
## A 1.00346 0.02052 48.903 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05463 on 98 degrees of freedom
## Multiple R-squared: 0.9606, Adjusted R-squared: 0.9602
## F-statistic: 2391 on 1 and 98 DF, p-value: < 2.2e-16
Why does knowing A (or B) help ‘predicting’ B (or A)? Because, for instance, knowing the value range of A makes you predict B better (and vice versa):
And we can exemplify that easily with a regression line as well:
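(The same two-line sketch from above, plot(A, B) followed by abline(lm(B ~ A)), applies here; the difference is that now the points cluster tightly around the fitted line.)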
Now, here’s an example of a correlation with the same slope, but more noise:
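(No data are shown for this one either; here is a sketch under the assumption that the underlying slope stays at 1 while the noise band gets much wider:)

# Sketch (assumption): same underlying slope of 1, wider noise band,
# so the correlation drops although the fitted slope stays near 1
B_noisy <- A + runif(100, min = 0, max = 0.8)
cor.test(A, B_noisy)        # weaker (but still clear) correlation
summary(lm(B_noisy ~ A))    # slope estimate still close to 1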