Correlation#
Correlation measures the strength of the relationship between two observable variables.
Introduction#
Preliminaries#
Before we begin our study of correlation, let's make some preliminary definitions that will help us keep everything clear and precise.
Univariate Statistics#
To distinguish between the statistics relating to the $x$ and $y$ variables, we introduce some notation.
$\bar{x}$ and $\bar{y}$ are defined as the univariate sample means of the $x$ and $y$ variables. In other words, $\bar{x}$ is the sample mean of the $x$ variable, as if we were observing the $x$ variable in isolation. Similarly for $\bar{y}$.
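Written out, and assuming the sample consists of $n$ paired observations $(x_i, y_i)$ (the symbols $n$, $x_i$, and $y_i$ are introduced here for illustration), the two univariate sample means are the usual arithmetic averages:

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i,
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```

Each mean uses only the values of its own variable; the pairing of the observations plays no role at this stage.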
$s_x$ and $s_y$ are defined as the univariate standard deviations of the $x$ and $y$ variables. In other words, $s_x$ is the standard deviation of the $x$ variable, as if we were observing the $x$ variable in isolation. Similarly for $s_y$.
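A minimal Python sketch of these univariate statistics, using the standard library's `statistics` module. The data values and the names `x_bar`, `y_bar`, `s_x`, and `s_y` are illustrative assumptions, and the sketch assumes the sample standard deviation (denominator n - 1) is the one intended here:

```python
import statistics

# Hypothetical paired observations (x_i, y_i); the values are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Univariate sample means: each uses only its own variable.
x_bar = statistics.mean(x)
y_bar = statistics.mean(y)

# Univariate sample standard deviations (assumed n - 1 denominator).
s_x = statistics.stdev(x)
s_y = statistics.stdev(y)

print(f"x_bar = {x_bar}, s_x = {s_x:.4f}")
print(f"y_bar = {y_bar}, s_y = {s_y:.4f}")
```

If the population standard deviation (denominator n) is intended instead, `statistics.pstdev` would be the corresponding call.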
Assessing Correlation#
TODO
Figure: scatterplot illustrating positive correlation.
TODO
Figure: scatterplot illustrating negative correlation.
TODO
Figure: scatterplot illustrating no correlation.
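The three patterns in the scatterplots can be imitated with synthetic data. The sketch below is only an illustration: the data-generating recipe, the seed, and the helper `pearson_r` are assumptions, and the helper computes the standard Pearson sample correlation coefficient, which is not necessarily one of the versions defined later in this document.

```python
import random
import statistics

def pearson_r(xs, ys):
    """Standard Pearson sample correlation (assumed here for illustration)."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance with the same n - 1 denominator as stdev.
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys)) / (n - 1)
    return cov / (s_x * s_y)

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical x values spread evenly over [0, 10).
x = [i / 20 for i in range(200)]

# y rises with x, falls with x, or ignores x entirely.
y_pos = [xi + random.gauss(0, 1) for xi in x]
y_neg = [-xi + random.gauss(0, 1) for xi in x]
y_none = [random.gauss(0, 1) for _ in x]

r_pos = pearson_r(x, y_pos)
r_neg = pearson_r(x, y_neg)
r_none = pearson_r(x, y_none)

print(f"positive: {r_pos:.2f}, negative: {r_neg:.2f}, none: {r_none:.2f}")
```

Under these assumptions the first value should be close to 1, the second close to -1, and the third close to 0, matching the upward, downward, and shapeless clouds of points in the figures.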
TODO
Definition#
Version 1#
TODO: justification. Make some plots.
Version 2#
TODO: shortcut for version 2
Version 3#
TODO: justification, again.