Unizor - Creative Minds through Art of Mathematics - Math4Teens
Notes to a video lecture on http://www.unizor.com
In this lecture we will talk about independent and dependent random variables and will introduce a numerical measure of dependency between random variables.
Assume a random variable ξ takes values
x1, x2,..., xM
with probabilities
p1, p2,..., pM.
Further, assume a random variable η takes values
y1, y2,..., yN
with probabilities
q1, q2,..., qN.
The known property of mathematical expectations of independent random variables, E(ξ·η) = E(ξ)·E(η), is the basis of measuring the degree of dependency between any pair of random variables.
First of all, we introduce the concept of covariance of any two random variables:
Cov(ξ,η) = E[(ξ−E(ξ))·(η−E(η))]
A simple transformation, expanding the parentheses and using the linearity of expectation, converts it into an equivalent definition:
Cov(ξ,η) = E(ξ·η)−E(ξ)E(η)
Now we see that for independent random variables the covariance equals zero, since for them E(ξ·η) = E(ξ)·E(η) (see property (c) above).
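The two forms of covariance can be compared on a small example. The following is only a sketch: the two distributions are made up for illustration, and independence is assumed by building the joint probabilities as products of the individual ones. Exact fractions are used so that the zero result is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distributions, value -> probability (illustration only).
xi = {1: Fraction(1, 2), 2: Fraction(1, 3), 4: Fraction(1, 6)}
eta = {0: Fraction(1, 4), 5: Fraction(3, 4)}

# Independence assumption: P(xi = x, eta = y) = P(xi = x) * P(eta = y).
joint = {(x, y): p * q for (x, p), (y, q) in product(xi.items(), eta.items())}

def expectation(dist, f):
    """E[f(x, y)] for a joint distribution {(x, y): probability}."""
    return sum(p * f(x, y) for (x, y), p in dist.items())

def covariance(dist):
    """Cov as E[(xi - E(xi)) * (eta - E(eta))]."""
    ex = expectation(dist, lambda x, y: x)
    ey = expectation(dist, lambda x, y: y)
    return expectation(dist, lambda x, y: (x - ex) * (y - ey))

def covariance_short(dist):
    """Cov as E(xi * eta) - E(xi) * E(eta)."""
    ex = expectation(dist, lambda x, y: x)
    ey = expectation(dist, lambda x, y: y)
    return expectation(dist, lambda x, y: x * y) - ex * ey

cov_def = covariance(joint)
cov_short = covariance_short(joint)
print(cov_def, cov_short)  # both forms agree and are zero for this independent pair
```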
Incidentally, the covariance of a random variable with itself (a kind of ultimate dependency) equals its variance:
Cov(ξ,ξ) = E(ξ·ξ)−E(ξ)E(ξ) =
= E(ξ²)−[E(ξ)]² = E[(ξ−E(ξ))²] = Var(ξ)
Also notice that another example of a very strong dependency, η = A·ξ, where A is a constant, leads to the following value of covariance:
Cov(ξ,Aξ) = E(ξ·Aξ)−E(ξ)E(Aξ) =
= A·[E(ξ²)−E(ξ)E(ξ)] = A·E[(ξ−E(ξ))²] = A·Var(ξ)
This shows that, when the coefficient A is positive (that is, an increase of ξ causes an increase of η=A·ξ), the covariance between them is positive as well and proportional to A. If A is negative (that is, an increase of ξ causes a decrease of η=A·ξ), the covariance between them is negative and still proportional to A.
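The relation Cov(ξ,Aξ) = A·Var(ξ) can be checked on numbers. This is a minimal sketch with a made-up distribution for ξ and two arbitrarily chosen values of A. Nothing here is specific to these particular values.

```python
from fractions import Fraction

# Hypothetical distribution of xi, value -> probability (illustration only).
xi = {1: Fraction(1, 2), 2: Fraction(1, 3), 4: Fraction(1, 6)}

def mean(dist, f=lambda x: x):
    """E[f(xi)] for a distribution {value: probability}."""
    return sum(p * f(x) for x, p in dist.items())

def cov_with_multiple(dist, a):
    """Cov(xi, a*xi) computed as E(xi * a*xi) - E(xi) * E(a*xi)."""
    return mean(dist, lambda x: x * a * x) - mean(dist) * mean(dist, lambda x: a * x)

m = mean(xi)
var_xi = mean(xi, lambda x: (x - m) ** 2)  # Var(xi) = E[(xi - E(xi))^2]

for a in (3, -2):  # one positive and one negative coefficient
    # The last two printed values coincide, and their sign follows the sign of a.
    print(a, cov_with_multiple(xi, a), a * var_xi)
```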
One more example.
Consider "half-dependency" between ξ and η, defined as follows.
Let ξ' be a random variable independent of ξ and identically distributed with it.
Let η = (ξ + ξ')/2.
So, η "borrows" its randomness from two independent identically distributed random variables ξ and ξ'.
Then covariance between ξ and η is:
Cov(ξ,η) = Cov(ξ,(ξ+ξ')/2) =
= E[ξ·(ξ+ξ')/2] − E(ξ)·E[(ξ+ξ')/2] =
= E(ξ²)/2+E(ξ·ξ')/2 − E(ξ)·E(ξ)/2 − E(ξ)·E(ξ')/2
Since ξ and ξ' are independent, the expectation of their product equals the product of their expectations.
So, our expression can be transformed further:
= E(ξ²)/2+E(ξ)·E(ξ')/2 − E(ξ)·E(ξ)/2 − E(ξ)·E(ξ')/2 =
= [E(ξ²) − E(ξ)·E(ξ)]/2 = Var(ξ)/2
As we see, the covariance between "half-dependent" random variables ξ and η=(ξ+ξ')/2, where ξ and ξ' are independent identically distributed random variables, equals half of the variance of ξ.
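As an illustration of the "half-dependent" case, the covariance can be computed exactly for one concrete choice of ξ. The sketch below assumes, purely for the sake of example, that ξ and ξ' are two independent rolls of a fair die.

```python
from fractions import Fraction
from itertools import product

# Assumption for illustration: xi and xi' are independent rolls of a fair die,
# so each pair (x, x2) has probability 1/36.
p = Fraction(1, 36)
outcomes = list(product(range(1, 7), repeat=2))

def expect(f):
    """E[f(xi, xi')] over all 36 equally likely pairs."""
    return sum(p * f(x, x2) for x, x2 in outcomes)

e_xi = expect(lambda x, x2: x)
e_eta = expect(lambda x, x2: Fraction(x + x2, 2))  # eta = (xi + xi')/2

cov = expect(lambda x, x2: x * Fraction(x + x2, 2)) - e_xi * e_eta
var_xi = expect(lambda x, x2: (x - e_xi) ** 2)

print(cov, var_xi / 2)  # prints: 35/24 35/24
```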
All the above manipulations with covariance led us to formulas where the variance plays a significant role. If we want a measure of dependency between random variables that does not depend on the magnitude of their variances and is always scaled to the interval [−1, 1], we have to divide the covariance by a factor that depends on the variances (assumed to be non-zero), thus forming the coefficient of correlation:
R(ξ,η) = Cov(ξ,η) / √[Var(ξ)·Var(η)]
Let's examine this coefficient of correlation in cases we considered above as examples.
For independent random variables ξ and η the correlation is zero because their covariance is zero.
Correlation between a random variable and itself equals 1:
R(ξ,ξ) = Cov(ξ,ξ) / √[Var(ξ)·Var(ξ)] = Var(ξ)/Var(ξ) = 1
Correlation between random variables ξ and Aξ equals 1 (for a positive constant A) or −1 (for a negative A):
R(ξ,Aξ) = Cov(ξ,Aξ) / √[Var(ξ)·Var(Aξ)] =
= A·Var(ξ) / √[Var(ξ)·A²·Var(ξ)] = A/|A|
which equals 1 or −1, depending on the sign of A.
This corresponds to our intuitive understanding of the rigid relationship between ξ and Aξ.
Correlation between "half-dependent" random variables, as introduced above, is found using Var((ξ+ξ')/2) = [Var(ξ)+Var(ξ')]/4 = Var(ξ)/2:
R(ξ,(ξ+ξ')/2) = Cov(ξ,(ξ+ξ')/2) / √[Var(ξ)·Var((ξ+ξ')/2)] =
= [Var(ξ)/2] / √[Var(ξ)·Var(ξ)/2] = √2/2.
As we see, in all these examples the correlation is a number from the interval [−1,1]: it equals zero for independent random variables, equals 1 or −1 for rigidly dependent random variables and lies strictly inside this interval for partially dependent (as in our example of "half-dependent") random variables.
For those interested, it can be proved that the correlation of any pair of random variables with finite non-zero variances lies in the interval [−1,1].
So, the coefficient of correlation is a good tool to measure the degree of dependency between two random variables.
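The correlation values obtained in the examples above can be reproduced numerically. The sketch below again assumes, only for illustration, that ξ and ξ' are two independent rolls of a fair die. The helper name correlation and the coefficient −2 are arbitrary choices, not part of the lecture.

```python
import math
from fractions import Fraction
from itertools import product

# Assumption for illustration: xi and xi' are independent rolls of a fair die.
p = Fraction(1, 36)
outcomes = list(product(range(1, 7), repeat=2))

def expect(f):
    """E[f(xi, xi')] over all 36 equally likely pairs."""
    return sum(p * f(x, x2) for x, x2 in outcomes)

def correlation(f, g):
    """R = Cov / sqrt(Var * Var) for two functions of the pair (xi, xi')."""
    ef, eg = expect(f), expect(g)
    cov = expect(lambda x, x2: f(x, x2) * g(x, x2)) - ef * eg
    var_f = expect(lambda x, x2: f(x, x2) ** 2) - ef ** 2
    var_g = expect(lambda x, x2: g(x, x2) ** 2) - eg ** 2
    return float(cov) / math.sqrt(float(var_f * var_g))

def xi(x, x2):
    return x

r_self = correlation(xi, xi)                                  # xi with itself
r_neg = correlation(xi, lambda x, x2: -2 * x)                 # xi with A*xi, A = -2
r_half = correlation(xi, lambda x, x2: Fraction(x + x2, 2))   # "half-dependent"
r_indep = correlation(xi, lambda x, x2: x2)                   # xi with independent xi'

print(r_self, r_neg, r_half, r_indep)
```

The four values should come out close to 1, −1, √2/2 and 0 respectively, up to floating-point rounding in the square root.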