Where does $\in$ come from and where is it defined?

Propositional logic contains only propositional connectives $\land, \lor,\ldots$ and propositional variables. You are correct that it does not have the symbol $\in$.

First-order logic (sometimes called predicate logic) has a number of basic symbols: variables, propositional connectives, quantifiers, and (usually) $=$ for equality. However, one of the most important things about first-order logic is that you can add optional symbols to represent functions, predicates/relations, and constants.

For instance, the axioms of group theory can be stated within the first order language containing the extra symbols $\{\times, {}^{-1}, e\}$ (indeed, you actually only need $\times$).

When we are dealing with logic we need to be careful to not confuse the syntax of our system with its semantics. Syntactically, $\in$ is just a symbol that we can take or leave in any particular language. This symbol does not need to refer to set membership; it doesn't even need to be a binary relation! I could interpret $\in$ as anything that the logic lets me: a constant, a relation of any arity, or a function of any arity.

The "meaning" of the symbol $\in$ isn't really a question for logic. We have, as a mathematical community, decided that $\in$ denotes set membership. It is very convenient in mathematics to be able to talk about collections of objects, so the symbol is used a lot. However, the question of what $\in$ "really means" is similar to what $0$ "really means": these are questions that are more of metaphysics. There may be some people who claim that they have no notion of "set" or "collection". To these skeptics, talking about ZFC might not be too helpful. To people who do have a notion of "set" we can ask the question of whether particular claims are try about their idea of set. Most of the mathematical community have agreed that sets (whatever they are) satisfy ZFC.


It was originally the Greek lower case epsilon ($\epsilon$). If you read really old math papers/books you will still see a lower case epsilon used instead of the stylized $\in$ symbol.

According to this page, the notation was invented by Peano from the first letter of the Greek word "είναι" meaning "is."

As for the definition of $\in$, it is defined by the axioms of ZFC. The axioms themselves tell you what it means and how to use it properly.


Based on the comments, I think user638203 is looking for a lot of information/context, and a fully complete answer would probably require information from many courses/books on logic and related theoretical "computer science". I'm just going to provide a little information and clarification so that they (and anyone else interested) can do some further reading/research.


How is the syntax for a binary relation(-like) symbol defined?

This is actually more complicated than one might hope, and sources differ.

Before I mention any technical details/terminology, I just want to present one option as a basic intuitive idea. If we want $\star$ to act syntactically like a binary relation between things of a certain type (I mean this very informally), then we might make a rule like this: if "$a$" and "$b$" are placeholders for valid objects of the relevant type (whatever that means in context), then "$(a\star b)$" is a thing you can write that has a truth value. For example, if we're already comfortable with standard set theory symbols, and we want $\star$ to act syntactically like a relation between sets, then if $x$ stands for a set, maybe we can write something like "$\left(((x\cup x)\star x)\land (x=x)\right)$", which might be true/provable or not depending on $x$ and any rules about $\star$.

If you want to make this more formal, this sort of thing is described (semi-)formally in most treatments of logic, as described in the syntax section of the Wikipedia page for first-order logic. Unfortunately, most treatments don't talk about infix notation for relation symbols, so we'd be forced to write things like "$\star(a,b)$" (and "$\in(a,b)$"?) if we followed a random book on logic strictly.

If you want to handle infix notations and make the syntax very formal, arguably to a degree not explicitly spelled out even in many texts on mathematical logic, you would use some approach to specifying formal grammars, like BNF. You would have to very carefully specify things like "these strings are atomic for specifying these types of objects" (whether they be strings that stand for sets, or for propositions that could have a truth value, etc.), and "these are the valid ways to build bigger strings of various types from smaller ones".

But in practice, in mathematics, we often set up precedence rules (rarely considered in discussions of formal grammars), drop parentheses, and allow ambiguous and/or invalid expressions with meaning determined by context and unwritten convention. For example, maybe a math book would use "$a\in b\in c$" in some context to mean something like "$\left((a\in b)\land (b\in c)\right)$", but maybe no book on logic/set theory would ever explicitly write something like that as a rule.


Where is it said that $\in$ is a binary relation(-like) symbol?

Often, when discuss formal mathematics we say what it means for something to be a "(well-formed) formula" (abbreviated "wff"). In a treatment of set theory, (syntactically) valid usage of $\in$ will be mentioned very early on. For example,

  • Introduction to Axiomatic Set Theory by Takeuti and Zaring says on page 4: "The language of our theory consists of...a predicate constant: $\in$..." and "If $x$ and $y$ are individual variables then $x\in y$ is a wff."
  • Metamath Proof Explorer's treatment of set theory has (thanks to Norm Megill) a rule that basically says "$A\in B$ is a wff if $A$ and $B$ are classes", and then a verification that in the special case where $A$ and $B$ are sets $x$ and $y$, $x \in y$ is still a wff.
  • Set Theory: An Introduction to Independence Proofs by Kunen says on pages 2-3: "The basic symbols of our formal language are...$\in$, $=$, and $v_j$ for each natural number $j$." and "we define a formula...$v_i\in v_j$, $v_i=v_j$ are formulas for any $i,j$...".

What about the use of $\in$ with quantifiers?

In an expression like "$\exists x\in S$" or "$\forall x\in S$", the subexpression $x\in S$ does not really have its other conventional meaning (syntactically or semantically, across all treatments of set theory). Generally, one or both of these is taken as an abbreviation. Formulas like $\exists x\in S, \varphi(x)$ abbreviate something like $\exists x,\left(x\in S\land\varphi(x)\right)$. And formulas like $\forall x\in S, \varphi(x)$ abbreviate something like $\forall x,\left(x\in S\rightarrow\varphi(x)\right)$. However, I'm having some trouble finding a source that is very explicit about these abbreviations or similar. Two sources that get close are Hammack's Book of Proof and ProofWiki.

In Book of Proof, it says "Likewise, a statement such as 'There exists a subset $X$ of $\mathbb N$ for which $|X|=5$.' can be translated as '$\exists X,(X\subseteq\mathbb N)\land(|X|=5)$' or '$\exists X\subseteq\mathbb N,|X|=5$' or '$\exists X\in\wp(\mathbb N), |X|=5$'." and "The following statements mean the same thing: '$\forall x\in S,Q(x)$' '$(x\in S)\Rightarrow Q(x)$'."

In ProofWiki, they explain a different abbreviation. It says "In the language of set theory, this can be formally defined: $\forall x \in S: P(x) := \left\{x \in S: P(x)\right\} = S$" on the universal quantifier page. And it says "In the language of set theory, this can be formally defined: $\exists x \in S: P(x) := \left\{x \in S: P(x)\right\} \ne \varnothing$" on the existential quantifier page.


Where is the (proof-based) meaning of $\in$ written?

One definition of the "meaning" of $\in$ would be something like the collection of statements involving $\in$ that are provable. This boils down to the question of which proofs using $\in$ are valid.

Informally, you can pick your favorite account of axioms for set theory (e.g. the ones on Wikipedia) and then read/write proofs building up from there, in your favorite informal written style. You'll get different provable statements depending on which axioms you choose (e.g. do you want the continuum hypothesis or the axiom of choice to be provable?).

But if you want to be extra careful and rigorous, then you'll need to pick a particular formal way of writing and checking proofs (a proof system). A detailed account of many options is probably too much for any single whole book, so I'll just give an idea of what's out there. For example, you might choose fitch-style proofs for (some particular author's account of) natural deduction. Or a Hilbert-style system using starting with modus ponens as the sole inference rule, and Meredith's axiom to handle propositional logic and your favorite axiom (schemes) for quantifiers, such as Q5-7 on Wikipedia.

how we're supposed to use it. Where would I "point you" to show these things?

To emphasize: In any case, the axioms of set theory say which statements involving $\in$ can be used as a foundation for proofs about sets. Those axioms are what you would point me to.


What about the formal semantics of $\in$?

The informal semantics are discussed all over the place. You said, "We all know it means "element of"".

One formal way to discuss the "semantics" (in the philosophy sense) of something is to talk about "syntactic consequence". This basically means "which proofs are valid?" (we push around logical symbols on a "syntactic" level), and I discussed this above. But there's also "semantic consequence", which basically talks not about proofs, but about objects/"possible worlds" (sometimes called models) that satisfy the axioms with and without certain properties.

My favorite example to explain things like this is via Group Theory, which you said you're unfamiliar with. We can instead talk about propositional logic. We can explain how $Q$ follows from $P$ and $P\to Q$ in two very different ways. One way is to say "We have a proof-writing rule of inference called modus ponens which says $Q$ is provable if we have $P$ and $P\to Q$." That's what it means for $Q$ to be a syntactic consequence of $P$ and $P\to Q$.

Another way to argue is to say something like "Consider all possible worlds: The worlds where $P$ is false are not under discussion since we're supposed to have $P$. The worlds where $P$ is true but $Q$ is false are not under discussion since we're supposed to have $P\to Q$. The only worlds left are those where $P$ is true and $Q$ is also true. In particular $Q$ is true in all the worlds under discussion." This is what it means for $Q$ to be a semantic consequence of $P$ and $P\to Q$. If this seems weird, it's exactly the sort of thing we do when we use truth tables to argue something. We basically make a table of all the possible worlds.

The limitation of standard truth tables is that they can't handle quantifiers. To talk about models (possible worlds) with quantifiers, we need something like set theory. For example, a linguist might translate "every dog barks" to something like "every thing (in the possible world under discussion) that's in the set of dogs, is a thing that barks". This works very naturally for many things in math (e.g. the set of integers and the set of rational numbers are both models for some rules of arithmetic), and is discussed in "model theory".

But for $\in$, it's a little confusing. The axioms that say what we do with $\in$ are axioms for a system of set theory. So to have sets to serve as models for that, we'd need to already have some foundational/meta set theory we're working with. This is exactly what's done and it's important mathematics. I don't have an easier source that spells this out than the beginning of Timothy Chow's A beginner's guide to forcing.