Correlation between three variables question

Here's an answer to the general question, which I wrote up a while ago. It's a common interview question.

The question goes like this: "Say you have three random variables X, Y, Z such that the correlation of X and Y is some value and the correlation of Y and Z is another; what are the possible correlations of X and Z in terms of the other two correlations?"

We'll give a complete answer to this question, using the Cauchy-Schwarz inequality and the fact that $\mathcal{L}^2$ is a Hilbert space.

The Cauchy-Schwarz inequality says that if x,y are two vectors in an inner product space, then

$$\lvert\langle x,y\rangle\rvert \leq \sqrt{\langle x,x\rangle\langle y,y\rangle}$$
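As a quick sanity check (just a sketch, assuming NumPy is available; the vectors are arbitrary), here is the inequality verified numerically:

    import numpy as np

    rng = np.random.default_rng(0)
    x, y = rng.normal(size=5), rng.normal(size=5)

    lhs = abs(np.dot(x, y))                      # |<x, y>|
    rhs = np.sqrt(np.dot(x, x) * np.dot(y, y))   # ||x|| ||y||
    assert lhs <= rhs  # Cauchy-Schwarz holds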

This is used to justify the notion of an ''angle'' in abstract vector spaces, since it gives the constraint

$$-1 \leq \frac{\langle x,y\rangle}{\sqrt{\langle x,x\rangle\langle y,y\rangle}} \leq 1$$

which means we can interpret this ratio as the cosine of the angle between the vectors x and y.

A Hilbert space is a vector space with an inner product that is complete with respect to the norm the inner product induces; it may be finite or infinite dimensional. The important thing for this post is that in a Hilbert space the inner product lets us do geometry with the vectors, which in this case are random variables. We'll take for granted that the space of mean 0, finite variance random variables is a Hilbert space with inner product $\langle X,Y\rangle = \mathbb{E}[XY]$. Note that, in particular, for variance 1 random variables

$$\frac{\langle X,Y\rangle}{\sqrt{\langle X,X\rangle\langle Y,Y\rangle}} = \text{Cor}(X,Y)$$
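To make this concrete, here is a small simulated sketch (assuming NumPy; the pair of correlated variables is arbitrary, and the sample average only approximates the expectation):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=100_000)
    y = 0.5 * x + rng.normal(size=100_000)  # an arbitrary correlated pair

    # Standardize to mean 0 and variance 1, so E[XY] is the correlation.
    xs = (x - x.mean()) / x.std()
    ys = (y - y.mean()) / y.std()

    print(np.mean(xs * ys))         # inner product <X, Y> = E[XY]
    print(np.corrcoef(x, y)[0, 1])  # matches the usual sample correlation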

This often leads people to say that ''correlations are cosines''. That's intuitively right, even though these certainly aren't cosines of angles we can draw (the space is infinite dimensional): all of the usual laws (the Pythagorean theorem, the law of cosines) hold if we define the cosine of the angle between two random variables to be their correlation, with the length of a random variable given by its standard deviation.

Because this space is a Hilbert space, we can do all of the geometry we did in high school: projecting vectors onto one another, orthogonal decomposition, and so on. To solve this question we use orthogonal decomposition, sometimes called the ''uncorrelation trick'' in statistics: write one random variable as a scalar multiple of a second random variable plus a remainder that is uncorrelated with that second variable. This is especially useful for multivariate normal random variables, where two components being uncorrelated implies they are independent.
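Here is the trick in simulation (again a sketch, assuming NumPy; sample moments stand in for expectations, and the particular X and Y are arbitrary):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100_000
    y = rng.normal(size=n)
    x = 0.7 * y + rng.normal(size=n)    # some X correlated with Y

    # Standardize both so that <X, Y> = E[XY] = Cor(X, Y).
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()

    p_xy = np.mean(x * y)               # projection coefficient <X, Y>
    o = x - p_xy * y                    # the residual O^X_Y
    print(np.mean(o * y))               # ~0: residual is uncorrelated with Y
    print(np.mean(o * o), 1 - p_xy**2)  # residual variance is 1 - p_xy^2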

Okay, let's suppose that we know that the correlation of X and Y is $p_{xy}$, the correlation of Y and Z is $p_{yz}$, and we want to know the correlation of X and Z, which we'll call $p_{xz}$. Note that we lose no generality by assuming mean 0 and variance 1, since scaling and translating random variables doesn't affect their correlations. We can then write:

$$X = \langle X,Y\rangle Y + O^X_Y$$

$$Z = \langle Z,Y\rangle Y + O^Z_Y$$

where $\langle \cdot,\cdot\rangle$ stands for the inner product on the space and the $O$ are uncorrelated with Y. Then, we take the inner product of $X,Z$ which is the correlation we're looking for, since everything has variance 1. We have that

$$\langle X,Z\rangle = p_{xz} = \langle p_{xy}Y+O^X_Y,\,p_{yz}Y+O^Z_Y\rangle = p_{xy}p_{yz}+\langle O^X_Y,O^Z_Y\rangle$$

since the variance of Y is 1 and the cross terms of this bilinear expansion pair Y with a residual orthogonal to Y, hence contribute covariance 0. We can now apply the Cauchy-Schwarz inequality to the last term above to get that

$$p_{xz} \leq p_{xy}p_{yz} + \sqrt{(1-p_{xy}^2)(1-p_{yz}^2)}$$

$$p_{xz} \geq p_{xy}p_{yz} - \sqrt{(1-p_{xy}^2)(1-p_{yz}^2)}$$
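These bounds are exactly the feasibility condition for a $3\times 3$ correlation matrix. A short check (a sketch, assuming NumPy) that randomly generated valid correlation matrices always respect them:

    import numpy as np

    def corr_bounds(p_xy, p_yz):
        """Feasible range of Cor(X, Z) given Cor(X, Y) and Cor(Y, Z)."""
        slack = np.sqrt((1 - p_xy**2) * (1 - p_yz**2))
        return p_xy * p_yz - slack, p_xy * p_yz + slack

    rng = np.random.default_rng(3)
    for _ in range(1000):
        a = rng.normal(size=(3, 3))
        c = a @ a.T                # random positive semidefinite covariance
        d = np.sqrt(np.diag(c))
        r = c / np.outer(d, d)     # normalize to a correlation matrix
        lo, hi = corr_bounds(r[0, 1], r[1, 2])
        assert lo - 1e-12 <= r[0, 2] <= hi + 1e-12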

where the fact that

$$\langle O^X_Y,O^X_Y\rangle = 1-p_{xy}^2$$

comes from expanding the condition that the variance of X equals 1:

$$1 = \langle X,X\rangle = \langle p_{xy}Y + O^X_Y,p_{xy}Y+O^X_Y\rangle = p_{xy}^2 + \langle O^X_Y,O^X_Y\rangle$$

and the exact same thing can be done for $O^Z_Y$.

So we have our answer. Sorry this was so long.


Here is a quick impossibility argument for a concrete instance of the question: suppose someone claims that $\mathrm{corr}(A,B)=0.9$, $\mathrm{corr}(B,C)=0.8$ and $\mathrm{corr}(A,C)=0.1$. Assume without loss of generality that the random variables $A$, $B$, $C$ are standard, that is, with mean zero and unit variance. Then, for any $(A,B,C)$ with the prescribed covariances, $$\mathrm{var}(A-B+C)=\mathrm{var}(A)+\mathrm{var}(B)+\mathrm{var}(C)-2\,\mathrm{cov}(A,B)-2\,\mathrm{cov}(B,C)+2\,\mathrm{cov}(A,C),$$ that is, $$\mathrm{var}(A-B+C)=3-2\cdot0.9-2\cdot0.8+2\cdot0.1=-0.2<0,$$ which is absurd.
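The same contradiction in matrix form: the claimed correlation matrix fails to be positive semidefinite. A quick check (assuming NumPy):

    import numpy as np

    r = np.array([[1.0, 0.9, 0.1],
                  [0.9, 1.0, 0.8],
                  [0.1, 0.8, 1.0]])  # claimed pairwise correlations

    w = np.array([1.0, -1.0, 1.0])   # weights giving A - B + C
    print(w @ r @ w)                 # -0.2: a "variance" that is negative
    print(np.linalg.eigvalsh(r))     # the smallest eigenvalue is negative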

Edit: Since correlations are cosines, for all random variables such that $\mathrm{corr}(A,B)=b$, $\mathrm{corr}(A,C)=c$ and $\mathrm{corr}(B,C)=a$, one must have $$a\geqslant bc-\sqrt{1-b^2}\sqrt{1-c^2}.$$ For $b=0.9$ and $c=0.8$, this yields $a\geqslant 0.458$, so the claimed value $a=0.1$ is impossible.
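Plugging in the numbers (a one-liner, assuming NumPy):

    import numpy as np

    b, c = 0.9, 0.8
    print(b * c - np.sqrt(1 - b**2) * np.sqrt(1 - c**2))  # ~0.4585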


You can use the fact that correlations can be understood as cosines between vectors sharing a common origin. Apply the arccosine to each correlation and check whether every pairwise sum of the resulting angles is at least as large as the third angle, i.e., whether the three angles can sit together at the apex of a tetrahedron. I get

    [acos(0.9),acos(0.8),acos(0.1)]
    %1695 = [0.451026811796, 0.643501108793, 1.47062890563]

The sum of the first and the second is smaller than the third, so this combination cannot stem from a valid trivariate correlation matrix.
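The same check in Python (a sketch, assuming NumPy), reproducing the numbers above:

    import numpy as np

    a, b, c = np.arccos([0.9, 0.8, 0.1])  # ~0.4510, 0.6435, 1.4706
    # Triangle inequality on the angles: each angle must not exceed
    # the sum of the other two.
    print(a + b >= c, a + c >= b, b + c >= a)  # False True True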