Help with understanding a point from Kahneman's book “Thinking, Fast and Slow”

If you have two random variables $X,Y$, the correlation coefficient is

$$\rho=\frac{\mu_{XY}-\mu_X \mu_Y}{\sigma_X \sigma_Y}$$

where $\mu$ denotes the mean of the variable and $\sigma$ denotes the standard deviation of the variable. Suppose now you have two Bernoulli random variables $X,Y$ (so they take on only the values $0$ and $1$), each with probability $0.5$ of being $1$. Then $\mu_X=\mu_Y=\frac{1}{2}$, $\sigma_X=\sigma_Y=\frac{1}{2}$, and $\mu_{XY}=P(X=1,Y=1)$, because $XY=1$ exactly when both variables are $1$. In this case the correlation coefficient $\rho$ is

$$\frac{P(X=1,Y=1)-\frac{1}{4}}{\frac{1}{4}}=4P(X=1,Y=1)-1.$$

So $P(X=1,Y=1)=\frac{\rho+1}{4}$. The more interesting quantity is

$$P(X=1 \mid Y=1)=\frac{P(X=1,Y=1)}{P(Y=1)}=\frac{\rho+1}{2}.$$

Thus, if the correlation coefficient between CEO quality and firm success is $\rho$ and a randomly chosen firm has the better CEO, then the probability that the firm is successful is $\frac{1}{2}+\frac{\rho}{2}$. That is, it is increased by $\frac{\rho}{2}$ relative to a firm which is equally likely to have a good CEO vs. a bad CEO. So in this version of the model, with $\rho=0.3$, the number should actually have been $65\%$.
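As a sanity check on the algebra above, here is a minimal Python sketch. It builds the joint distribution implied by $P(X=1,Y=1)=\frac{\rho+1}{4}$ with both marginals equal to $\frac{1}{2}$, then recomputes $\rho$ and $P(X=1 \mid Y=1)$ from it. The function names (`joint_pmf`, `correlation`, `p_x1_given_y1`) are made up for this illustration, and exact fractions are used only to avoid rounding noise.

```python
import math
from fractions import Fraction

def joint_pmf(rho):
    """Joint pmf of two Bernoulli(1/2) variables with correlation rho (illustrative helper)."""
    p11 = (Fraction(rho) + 1) / 4   # P(X=1, Y=1) = (rho + 1) / 4
    p10 = Fraction(1, 2) - p11      # P(X=1, Y=0), because P(X=1) = 1/2
    p01 = Fraction(1, 2) - p11      # P(X=0, Y=1), because P(Y=1) = 1/2
    p00 = 1 - p11 - p10 - p01       # the remaining probability mass
    return {(1, 1): p11, (1, 0): p10, (0, 1): p01, (0, 0): p00}

def correlation(pmf):
    """Correlation coefficient computed directly from its definition."""
    mu_x = sum(p * x for (x, y), p in pmf.items())
    mu_y = sum(p * y for (x, y), p in pmf.items())
    mu_xy = sum(p * x * y for (x, y), p in pmf.items())
    var_x = sum(p * x * x for (x, y), p in pmf.items()) - mu_x ** 2
    var_y = sum(p * y * y for (x, y), p in pmf.items()) - mu_y ** 2
    return float(mu_xy - mu_x * mu_y) / math.sqrt(float(var_x * var_y))

def p_x1_given_y1(pmf):
    """P(X=1 | Y=1) = P(X=1, Y=1) / P(Y=1)."""
    return pmf[(1, 1)] / (pmf[(1, 1)] + pmf[(0, 1)])

pmf = joint_pmf(Fraction(3, 10))
print(p_x1_given_y1(pmf))            # 13/20, i.e. 65%
print(round(correlation(pmf), 10))   # 0.3
```

The sketch only makes sense for $-1 \le \rho \le 1$; outside that range some of the four cell probabilities would become negative.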

For (mostly my own) future reference, here's my intuitive explanation of the answer.

If CEO strength made no difference, then out of 100 firms, the stronger CEO would end up with the 'more successful' firm 50 times and the 'less successful' firm 50 times.

If a stronger CEO made a difference (.30 correlation), then out of 100 firms, 30 would be 'more successful' for the stronger CEO because of the .30 correlation. For the remaining 70, the 'more successful' firms would be distributed equally between the stronger and weaker CEOs (35 + 35).

So, the stronger CEO gets 30 + 35 = 65 'more successful' firms. Hence 65%.
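The counting argument can be written out as a small Python sketch. The function name `stronger_ceo_share` and the default of 100 firms are just illustrative choices, and the split into "decided by the CEO" and "decided by chance" is the assumption of this intuitive explanation, not something derived independently.

```python
from fractions import Fraction

def stronger_ceo_share(rho, firms=100):
    """Share of 'more successful' firms that go to the stronger CEO (illustrative helper)."""
    rho = Fraction(rho)
    decided_by_ceo = rho * firms                 # e.g. 30 of 100 firms for rho = 0.30
    decided_by_chance = firms - decided_by_ceo   # the remaining 70 firms
    wins = decided_by_ceo + decided_by_chance / 2  # 30 + 35 = 65
    return wins / firms

print(stronger_ceo_share(Fraction(3, 10)))  # 13/20, i.e. 65%
```

Half of this share, $\frac{1}{2}\cdot\frac{1+\rho}{2}=\frac{1+\rho}{4}$, is the probability of "stronger CEO and more successful firm", which matches the joint probability in the Bernoulli calculation in the first answer.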

Here is another intuitive explanation, slightly different from the other answers.

We agree that a correlation of 0 means pure randomness, i.e. a 50% probability, and a correlation of 1 means pure determinism, i.e. a 100% probability.

We now linearly interpolate and get the following linear relationship between probability and correlation:

probability = 50% + correlation * 50%

Observe that it satisfies (0, 50%) and (1, 100%). For a correlation of 0.3 the formula gives us a probability of 65%.

The answer of 60% given in the book appears to be wrong, or else more details on its assumptions need to be provided. Note also that the value of 65% assumes a linear relationship between correlation and probability. I honestly do not know whether this assumption can always be made.
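To illustrate that the linear relationship is a modelling choice, here is a short Python sketch comparing it with one alternative. The alternative is an assumption introduced here, not something stated in the question: quality and success are taken to be jointly normal, and "successful" means above average. For that model the standard orthant-probability result gives $P(X>0 \mid Y>0)=\frac{1}{2}+\frac{\arcsin\rho}{\pi}$. The function names are hypothetical.

```python
import math

def linear_rule(rho):
    """probability = 50% + correlation * 50% (exact for the Bernoulli(1/2) model)."""
    return 0.5 + 0.5 * rho

def gaussian_rule(rho):
    """Assumed bivariate normal model: P(X > 0 | Y > 0) = 1/2 + arcsin(rho) / pi."""
    return 0.5 + math.asin(rho) / math.pi

for rho in (0.0, 0.3, 1.0):
    print(rho, round(linear_rule(rho), 3), round(gaussian_rule(rho), 3))
```

Both rules pass through (0, 50%) and (1, 100%), but at a correlation of 0.3 the linear rule gives 65% while the Gaussian one gives roughly 59.7%. So under this assumed model a figure near 60% would be consistent, and the answer does depend on which model is meant.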

Tags: Probability, Correlation
