Why, logically, is proof by contradiction valid?

A proof by contradiction involves two rules of inference.

$$\begin{split}\text{Negation introduction:}\quad&\quad (r\implies q) \text{ and } (r\implies \neg q) \text{ infer } \neg r\\\text{Double negation elimination:}\quad&\quad \neg\neg p\text{ infers } p\end{split}$$

(1) the "Negation introduction" rule of inference argues that if something implies a contradiction then it must be false, since we usually assert that contradictions are not true and so cannot be infered by true things.

This is accepted in both intuitionistic and classical logic, although there are other systems (such as minimal logic) which do not accept it.

($\def\false{\mathsf F}\def\true{\mathsf T}$Semantically, this is because $\false \to \false$ is true while $\true\to\false$ is false. This leads some systems to define negation as $\neg \phi ~\equiv~ \phi\to\false$.)
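
If it helps to see this rule checked mechanically, here is a small Lean 4 sketch (the theorem and variable names are mine, not anything standard). In Lean, $\neg r$ unfolds by definition to $r \to \mathsf F$, matching the convention just mentioned, and no classical axiom is needed:

```lean
-- Negation introduction: from r → q and r → ¬q, conclude ¬r.
-- Here ¬r is definitionally r → False, so the proof is just function application.
theorem neg_intro {r q : Prop} (h1 : r → q) (h2 : r → ¬q) : ¬r :=
  fun hr => (h2 hr) (h1 hr)
```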

(2) the "Double negation elimination" rule is that if the negation of a premise is false, then the premise must be true.   This is not accepted in intuitionistic logic, but it is in classical logic.

(3) Combining these rules gives the schema for a proof by contradiction: assume the negation of a statement, show that this assumption leads to a contradiction, and thereby deduce that the statement is true.

$$\begin{split}\text{Proof by contradiction:}\quad&\quad (\neg p \implies q) \text{ and } (\neg p\implies \neg q) \text{ infer } p\end{split}$$
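
A Lean 4 rendering of the combined schema, again only a sketch with names of my choosing, shows the two ingredients explicitly: negation introduction produces the contradiction from $\neg p$, and the classical axiom discharges the resulting $\neg\neg p$.

```lean
-- Proof by contradiction: from ¬p → q and ¬p → ¬q, conclude p.
-- The lambda is negation introduction (it has type ¬p → False, i.e. ¬¬p);
-- Classical.byContradiction is the double negation elimination step.
theorem proof_by_contradiction {p q : Prop} (h1 : ¬p → q) (h2 : ¬p → ¬q) : p :=
  Classical.byContradiction (fun hnp => (h2 hnp) (h1 hnp))
```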


Many of the issues I described here are on display in this Q&A.

First, let's be clear about what we're talking about. There are two rules that are often called "proof by contradiction". The first, negation introduction, can be written as $\cfrac{\varphi\vdash\bot}{\vdash\neg\varphi}$ which can be read as "if we can derive that $\varphi$ entails falsity, then we can derive $\neg\varphi$". We could also write this as an axiom: $(\varphi\Rightarrow\bot)\Rightarrow\neg\varphi$. For some reason, this is how Bram28 has taken your statement, but I don't think you have an issue with this. You'd say, "well clearly if assuming $\varphi$ leads to a contradiction then $\varphi$ must have been false and thus $\neg\varphi$ is true". There's another rule, more appropriately called "proof by contradiction", that can be written $\cfrac{\neg\varphi\vdash\bot}{\vdash\varphi}$ or as an axiom $(\neg\varphi\Rightarrow\bot)\Rightarrow\varphi$. This seems to be what you are taking issue with. Seeing as this latter rule has been rejected by many mathematicians (constructivists of various sorts), you wouldn't be completely crazy to question it. (In weak defense of Bram28, you'd probably accept "by substituting $\neg\psi$ into the above, by the same argument we can show that $\neg\psi$ is false so $\psi$ is true", but the rule only shows that $\neg\neg\psi$ is true. The rule allowing you to go from $\neg\neg\psi$ to $\psi$ is, in fact, equivalent to proof by contradiction.)
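
To make that closing parenthetical concrete, here is a Lean 4 sketch (identifiers are mine) of what the substitution actually buys you: negation introduction applied to $\neg\psi$ yields only $\neg\neg\psi$, and the missing step to $\psi$ is precisely double negation elimination.

```lean
-- Substituting ¬ψ into negation introduction gives ¬¬ψ, not ψ.
-- No classical axiom is used here; the step from ¬¬ψ to ψ is where it enters.
theorem substituted_neg_intro {ψ q : Prop} (h1 : ¬ψ → q) (h2 : ¬ψ → ¬q) : ¬¬ψ :=
  fun hnψ => (h2 hnψ) (h1 hnψ)
```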

To be even more clear about what we're talking about we need to distinguish syntax from semantics. If we're talking about "rules of inference" or "proofs", we are usually thinking syntactically. That is, we're thinking about symbols on a page and rules for manipulating those collections of symbols into other collections of symbols or rules about what constitutes "correct" arrangements of the symbols, i.e. a proof. (More informal renditions will be sentences in a natural language that follow "rules of reason", but the idea is still that the form of the argument is what makes it valid.) Semantics, on the other hand, interprets those symbols as mathematical objects and then we say a formula (i.e. arrangement of symbols) is "true" if it is interpreted into a mathematical object satisfying some given property. For example, we say a formula of classical propositional logic is "true" if its interpretation as a Boolean function is the constantly $1$ function.

So, we have two possible readings of your question: 1) Why is the rule $\cfrac{\neg\varphi\vdash\bot}{\vdash\varphi}$ derivable? 2) Why is the rule $\cfrac{\neg\varphi\vdash\bot}{\vdash\varphi}$ "true"?

For (1), one very unsatisfying answer is that it is often taken as given, i.e. it is derivable by definition of the logic. A slightly more satisfying answer is the following. Given a constructive logic where that rule isn't derivable but most other "usual" rules are, we can show that if for all formulas $\varphi$, $\vdash\varphi\lor\neg\varphi$ is derivable, then we can derive the rule $\cfrac{\neg\varphi\vdash\bot}{\vdash\varphi}$ (and vice versa). Another way of saying this is that $\varphi\lor\neg\varphi$ is provably equivalent to $(\neg\varphi\Rightarrow\bot)\Rightarrow\varphi$. It is also provably equivalent to $\neg\neg\varphi\Rightarrow\varphi$. The axiom $\varphi\lor\neg\varphi$ is often described as "everything is either true or false". This isn't quite what it means, but this idea of everything being "either true or false" is often considered intuitively obvious. However, there is no question of whether $\varphi$ is "true" or "false" in the above. We have rules for building proofs from other proofs, and that's all there is to this perspective.
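
For one direction of that provable equivalence, here is a Lean 4 sketch (the theorem name is mine): if excluded middle holds for $\varphi$, then double negation elimination for $\varphi$ follows without any classical axiom.

```lean
-- From φ ∨ ¬φ and ¬¬φ, conclude φ, using only intuitionistic reasoning:
-- in the first case φ is immediate; in the second, ¬φ contradicts ¬¬φ.
theorem em_to_dne {φ : Prop} (em : φ ∨ ¬φ) (hnn : ¬¬φ) : φ :=
  em.elim (fun hφ => hφ) (fun hnφ => absurd hnφ hnn)
```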

For (2), if you use the "truth table" semantics of classical propositional logic, then you simply calculate. You need to show that $(\neg\varphi\Rightarrow\bot)\Rightarrow\varphi$, when interpreted, is the constantly $1$ function when both $0$ and $1$ are substituted in the interpretation of the formula. This is easy to check. In these semantics, "proof by contradiction" is simply "true". To question this requires questioning the semantics. One option is to question whether there are only two truth values, $0$ and $1$. Why not three, or an infinite number of them? This leads to multi-valued logics. Alternatively, we could keep the truth values the same, but interpret formulas as something other than Boolean functions. For example, we could say they are Boolean functions but we only allow monotonic ones, or we could say that they are total Boolean relations. Making these changes requires adapting the notion of "true". For the latter example, we may say a formula is "true" if it is interpreted as a relation which relates all Boolean inputs to $1$. Being a relation and not just a function, though, this does not stop it from also relating some or all of the inputs to $0$, i.e. something can be both "true" and "false".
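
As a concrete version of "you simply calculate", here is a small Lean 4 sketch over `Bool` (the helper names `impB` and `pbcRow` are mine, not library functions): interpret $\Rightarrow$ as Boolean implication, $\bot$ as `false`, $\neg$ as `!`, and check both rows.

```lean
-- Boolean implication: a ⇒ b is !a || b.
def impB (a b : Bool) : Bool := !a || b

-- (¬φ ⇒ ⊥) ⇒ φ as a Boolean function of φ.
def pbcRow (phi : Bool) : Bool := impB (impB (!phi) false) phi

#eval pbcRow true    -- true
#eval pbcRow false   -- true

-- Both rows of the truth table reduce to true by computation alone.
example : pbcRow true  = true := rfl
example : pbcRow false = true := rfl
```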

Changing the semantics affects which rules and axioms are sound. A rule or axiom is sound with respect to a given semantics if its interpretation is "true" in that semantics. $(\neg\varphi\Rightarrow\bot)\Rightarrow\varphi$ is sound with respect to "truth tables" but not with respect to many other possible semantics.

To summarize, if you're working with respect to "truth table" semantics, then "proof by contradiction" is simply "true", that is, it is interpreted as a constantly "true" Boolean function, and this can easily be calculated. In this case, all of your "logical assumptions" are built into the notion of "truth table" semantics. With respect to semantics, "proof" is irrelevant. Proof is a syntactic concept. Your discussion about "assuming the premise is false" is (slightly mangled) proof-theoretic talk. With a semantic approach, there is no "assuming the premise is true/false": either the formula interprets as "true" (i.e. a constantly $1$ function) or it doesn't. (You can have meta-logical assumptions that some formula is "true", but this is happening outside of the logic. Ultimately the coin of the mathematical realm is the more syntactic notion of proof, and semantics just pushes proof to the meta-logic.)


It works as follows:

Say we have some set of statements $\Gamma$ and we want to infer $\neg \phi$, and we do this by a proof by contradiction.

Thus, we assume $\phi$ and show that this leads to a contradiction.

This means that $\Gamma$, together with $\phi$, logically implies a contradiction, i.e.

$$\Gamma \cup \{\phi\} \vDash \bot$$

and that means that it is impossible to make all of the statements in $\Gamma \cup \{\phi\}$ true. But that then also means that if all statements in $\Gamma$ are true, $\phi$ will have to be false, i.e. $\neg \phi$ will have to be true. And thus we have

$$\Gamma \vDash \neg \phi$$

Thus, in effect, we have proven $\neg \phi$ from $\Gamma$.
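
If you want the corresponding proof-theoretic step spelled out, here is a Lean 4 sketch, with the simplification (mine, not part of the argument above) of collapsing $\Gamma$ into a single hypothesis `G`:

```lean
-- If G together with φ yields a contradiction, then G yields ¬φ.
-- Since ¬φ is φ → False, the conclusion is definitionally the hypothesis.
theorem neg_intro_from_context {G φ : Prop} (h : G → φ → False) : G → ¬φ :=
  fun g hφ => h g hφ
```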