Condition ( | ) in R formula

The general form is dependent ~ independent | grouping. You can read more here: http://talklab.psy.gla.ac.uk/KeepItMaximalR2.pdf


The symbol | means different things depending on the context:

The general case

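In general, | means OR. General modeling functions will see any | as a logical operator and carry it out. This is equivalent to including any other expression that gets evaluated, e.g. a squared term (wrapped in I(), because a bare ^ has its own special meaning inside a formula), as in:

lm(y ~ x + I(x^2))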

The operator is carried out first, and this new variable is then used to construct the model matrix and do the fitting.
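
As a minimal sketch of that evaluation step, model.matrix() can be used to look at the columns that would be fitted. The data here are made up for illustration. Note that arithmetic such as squaring has to be protected with I() to be evaluated as arithmetic:

```r
# Hypothetical toy data, not taken from the question
x <- c(1, 2, 3, 4)

# I() makes R evaluate x^2 arithmetically before the model matrix is built
mm <- model.matrix(~ x + I(x^2))
colnames(mm)   # "(Intercept)" "x" "I(x^2)"

# Without I(), ^ is treated as a formula operator and x^2 reduces to x
mm_bare <- model.matrix(~ x + x^2)
colnames(mm_bare)   # "(Intercept)" "x"
```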

In your code, | also means OR. Keep in mind that R also interprets numeric values as logical when you use any logical operator: 0 is seen as FALSE, anything else as TRUE.
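
A quick way to see this coercion at the console, using small made-up vectors (the values are illustrative only):

```r
# Numeric vectors are coerced to logical before | is applied:
# 0 becomes FALSE, every other number becomes TRUE
x <- c(0, 1, -2.5)
z <- c(0, 0, 0)

x_as_logical <- as.logical(x)   # FALSE TRUE TRUE
x_or_z <- x | z                 # FALSE TRUE TRUE
```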

So your call to lm constructs a model of y as a function of x OR z. This doesn't make any sense. Given the values of x, this will just be y ~ TRUE. This is also the reason your model doesn't fit. Your model matrix has 2 columns of 1's, one for the intercept and one for the only value in x|z, being TRUE. Hence the coefficient for x|z can't even be calculated, as shown in the output:

> lm(y ~ x|z)

Call:
lm(formula = y ~ x | z)

Coefficients:
(Intercept)    x | zTRUE  
   -0.01925           NA  
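
To check the two columns of 1's yourself, you can inspect the model matrix directly. This is a sketch with made-up data (the original x, y and z are not shown here), assuming that, as in your case, no value of x is exactly 0:

```r
# Hypothetical data standing in for the original x, y and z
x <- c(1.2, -0.5, 2.0, 0.7, -1.1)   # no zeros, so x | z is TRUE everywhere
z <- c(0, 1, 0, 1, 0)
y <- c(0.3, -0.2, 0.1, 0.4, -0.6)

# The matrix lm() would fit: an intercept column plus one column for x | z
mm <- model.matrix(y ~ x | z)
mm

# Both columns are constant, so they are perfectly collinear
all_ones <- all(mm == 1)
```

Because the second column is identical to the intercept column, lm has nothing left to estimate for it and reports NA.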

Inside formulas for mixed models

In mixed models (e.g. the lme4 package), | is used to indicate a random effect. A term like + (1 | X) means: "fit a random intercept for every category in X". The parentheses are needed because | binds more loosely than +. You can translate the | as "given". So you can see the term as "fit an intercept, given X". If you keep this in mind, the use of | in specifications of correlation structures in e.g. the nlme or mgcv packages will make more sense to you.
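
As a small sketch of the "given" reading, here is a random intercept fitted with nlme, chosen only because it ships with standard R installations. The Orthodont data set and the model are illustrative choices, not something from your question:

```r
library(nlme)  # recommended package bundled with standard R installations

# Orthodont ships with nlme: repeated distance measurements per Subject.
# random = ~ 1 | Subject reads as "an intercept, given Subject".
fit <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

# One estimated random intercept per level of Subject
re <- ranef(fit)
n_intercepts <- nrow(re)
n_subjects <- nlevels(Orthodont$Subject)

# In lme4 the same idea is written inside the main formula:
#   lmer(distance ~ age + (1 | Subject), data = Orthodont)
```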

You still have to be careful, as the exact way | is interpreted depends largely on the package you use. The only way to really know what it means in the context of the modeling function you use is to check the manual of that package.
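
Part of the reason packages can differ is that a formula stores | unevaluated, so each modeling function decides for itself what to do with it. The base-R sketch below looks at the parsed formula; y, x and g are placeholder names that are never evaluated. It also shows how parentheses change what ends up on either side of the |:

```r
f1 <- y ~ x + (1 | g)
f2 <- y ~ x + 1 | g

rhs1 <- f1[[3]]   # right-hand side of f1
rhs2 <- f2[[3]]   # right-hand side of f2

# With parentheses, the top-level operator on the right-hand side is +
top1 <- as.character(rhs1[[1]])   # "+"

# Without them, | has lower precedence than +, so it is parsed as (x + 1) | g
top2 <- as.character(rhs2[[1]])   # "|"
```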

Other uses

There are some other functions and packages that use the | symbol in a formula interface. Here too, it pretty much boils down to indicating some kind of group. One example is the use of | in the lattice graphics system. There it is used for faceting, as shown by the following code:

library(lattice)
densityplot(~Sepal.Width|Species,
            data = iris,
            main="Density Plot by Species",
            xlab="Sepal width")
