What's up with the Sheffer stroke axiom?

It turns out this is just one of several Sheffer stroke axioms. There's an old paper by Lukasiewicz, entitled "Generalizing Deduction" if I recall correctly, in which he shows that Nicod's axiom, which can be written in Polish notation as DDpDqrDDtDttDDsqDDpsDps, has a substitution instance that is also a single axiom: DDpDqrDDsDssDDsqDDpsDps. Wajsberg found another single axiom in 1931, DDpDqrDDDsrDDpqDDpsDpDpq, and Lukasiewicz found one in 1931 as well, DDpDqrDDpDrpDDsqDDpsDps. Ernst, Fitelson, and Harris found over 60 more 23-letter single axioms in the Sheffer stroke. There also exist single axioms for calculi with more than one connective, and a theorem that guarantees the existence of single axioms for non-classical propositional calculi in which certain theorems hold.

How do you actually work with the thing above?

The key observation is that Nicod's axiom fits both premises of the rule "from DpDqr and p, infer r", where the second 'p' is exactly the same formula as the 'p' in 'DpDqr'. Or, translated to infix: Nicod's axiom has the form '(U|(V|W))', and it simultaneously has the form 'U'.

More generally, every single axiom that operates under a single rule of inference works similarly. If it didn't, the rule could never be used to infer anything, since every formula available before the first use of the rule has to be a substitution instance of the axiom.

So, the first step of a formal proof from Nicod's axiom is to make two substitution instances of that axiom, one of the form 'DpDqr' and the other of the form 'p', where 'p' is the same formula as in 'DpDqr', and then to infer 'r'. The easiest way to do this might be condensed (Sheffer stroke) detachment, where the axiom sort of suggests a substitution all by itself. Informally, it's a way of doing "as little" substitution as possible in the formulas in order to infer something, so what gets inferred is at least as general as anything else derivable in one detachment. To do condensed detachment, first make sure that the formulas have no variables in common. For Nicod's axiom:

  1. DDpDqrDDtDttDDsqDDpsDps

  2. DDaDbcDDdDddDDebDDaeDae

will do. The following diagram, which recasts 1. and 2. with some spacing, I think helps:

D D p     D q     r           DDtDttDDsqDDpsDps
    |       |     |
    -----   ----- -----------
  D DaDbc D DdDdd DDebDDaeDae

Now, and this case is fairly simple for condensed detachment, substitute p with DaDbc, q with DdDdd, and r with DDebDDaeDae and we get:

  3. DDDaDbcDDdDddDDebDDaeDaeDDtDttDDsDdDddDDDaDbcsDDaDbcs

So, we have the required forms to use the rule of inference, since 3. has the form 'DxDyz', with 2. as the 'x', having the same form as in 3.

Or in other words, since the first part of 3. matches that of 2. we can now infer:

  4. DDsDdDddDDDaDbcsDDaDbcs
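
The substitution and detachment just carried out can be checked mechanically. The following is only a minimal Python sketch: the tuple encoding and the helper names `parse`, `show`, `substitute`, and `detach` are illustrative assumptions, not part of any standard tool. It builds the substituted instance of Nicod's axiom and then detaches the renamed copy of the axiom from it.

```python
# Formulas are nested tuples ("D", left, right); a variable is a single
# lowercase letter. This encoding is an assumption made for the sketch.

def parse(s):
    """Parse a Polish-notation string such as 'DpDqr' into a nested tuple."""
    def read(i):
        if s[i] == "D":
            left, i = read(i + 1)
            right, i = read(i)
            return ("D", left, right), i
        return s[i], i + 1
    tree, end = read(0)
    if end != len(s):
        raise ValueError("trailing symbols in " + s)
    return tree

def show(t):
    """Write a tree back out in Polish notation."""
    return t if isinstance(t, str) else "D" + show(t[1]) + show(t[2])

def substitute(t, sigma):
    """Uniformly (and simultaneously) replace variables according to sigma."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return ("D", substitute(t[1], sigma), substitute(t[2], sigma))

def detach(major, minor):
    """From DxDyz and x infer z; fail if the premises do not have that shape."""
    if isinstance(major, str) or isinstance(major[2], str) or major[1] != minor:
        raise ValueError("rule of inference does not apply")
    return major[2][2]

axiom = parse("DDpDqrDDtDttDDsqDDpsDps")      # Nicod's axiom
renamed = parse("DDaDbcDDdDddDDebDDaeDae")    # same axiom, fresh variables
sigma = {"p": parse("DaDbc"), "q": parse("DdDdd"), "r": parse("DDebDDaeDae")}

instance = substitute(axiom, sigma)           # the substituted formula
inferred = detach(instance, renamed)          # the formula inferred from it
print(show(instance))
print(show(inferred))
```

Note that `detach` only checks that the premises already match syntactically; finding the substitution is still done by hand here, as in the diagram.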

I'm not clear about your second question.

If the only axiom is the disaster above, how do you ever conclude anything that makes sense?

If "makes sense" is taken to mean something easier to read in natural language, note that simply applying a definition turns DpDpp into Cpp ("(p$\rightarrow$p)"). Formulas like that can be proved by uniform substitution into the axiom and/or into already proven (object-language) theorems, together with the rule of inference.
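
As a semantic sanity check (not a proof inside the calculus), a truth table confirms that these formulas behave as claimed. The sketch below assumes the usual reading of D as NAND and of C as material implication, and assumes the definition in question is Cpq := DpDqq; the function names are illustrative only.

```python
from itertools import product

def value(formula, env):
    """Evaluate a Polish-notation string; D is NAND, C is implication."""
    def read(i):
        c = formula[i]
        if c in ("D", "C"):
            a, i = read(i + 1)
            b, i = read(i)
            return ((not (a and b)) if c == "D" else ((not a) or b)), i
        return env[c], i + 1
    result, end = read(0)
    if end != len(formula):
        raise ValueError("trailing symbols in " + formula)
    return result

def variables(*formulas):
    """Sorted list of the lowercase variable letters used in the formulas."""
    return sorted({c for f in formulas for c in f if c.islower()})

def is_tautology(formula):
    """True if the formula is true under every assignment of its variables."""
    vs = variables(formula)
    return all(value(formula, dict(zip(vs, row)))
               for row in product([False, True], repeat=len(vs)))

def equivalent(f, g):
    """True if f and g agree under every assignment of their variables."""
    vs = variables(f, g)
    return all(value(f, dict(zip(vs, row))) == value(g, dict(zip(vs, row)))
               for row in product([False, True], repeat=len(vs)))

print(is_tautology("DpDpp"))                    # the target theorem
print(is_tautology("DDpDqrDDtDttDDsqDDpsDps"))  # Nicod's axiom
print(equivalent("DpDqq", "Cpq"))               # assumed definition of C
```

Soundness of the rule "from DpDqr and p, infer r" then means everything derivable is a tautology; the converse (completeness) is what the derivations are for.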

The following is a first-order proof via William McCune's Prover9 which has clues as to how to construct a propositional calculus proof of DxDxx:

1 P(D(x,D(x,x))) # label(non_clause) # label(goal).  [goal].
2 -P(D(x,D(y,z))) | -P(x) | P(z).  [assumption].
3 P(D(D(x,D(y,z)),D(D(u,D(u,u)),D(D(w,y),D(D(x,w),D(x,w)))))).  [assumption].
4 -P(D(c1,D(c1,c1))).  [deny(1)].
5 P(D(D(x,D(y,D(y,y))),D(D(D(z,D(u,w)),x),D(D(z,D(u,w)),x)))).  [hyper(2,a,3,a,b,3,a)].
6 P(D(D(x,D(D(y,D(z,u)),w)),D(D(D(w,D(v5,D(v5,v5))),x),D(D(w,D(v5,D(v5,v5))),x)))).  [hyper(2,a,3,a,b,5,a)].
10 P(D(D(D(D(x,y),D(D(z,x),D(z,x))),D(u,D(u,u))),D(z,D(y,w)))).  [hyper(2,a,6,a,b,3,a)].
12 P(D(D(x,D(y,z)),D(D(D(u,w),D(D(w,u),D(w,u))),D(v5,D(v5,v5))))).  [hyper(2,a,5,a,b,10,a)].
14 P(D(x,D(x,x))).  [hyper(2,a,12,a,b,12,a)].
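
Each `hyper(2,a,M,a,b,N,a)` step above amounts to one condensed detachment with clause M as the major premise and clause N as the minor premise. The following Python sketch replays the five steps using plain first-order unification; it is an independent reimplementation under my own encoding (nested tuples, helper names such as `condensed_detach`), not Prover9's code, and it normalizes variable names so results can be compared up to renaming.

```python
from string import ascii_lowercase

def parse(s):
    """Polish notation -> nested tuples ("D", left, right); letters are variables."""
    def read(i):
        if s[i] == "D":
            left, i = read(i + 1)
            right, i = read(i)
            return ("D", left, right), i
        return s[i], i + 1
    return read(0)[0]

def show(t):
    return t if isinstance(t, str) else "D" + show(t[1]) + show(t[2])

def walk(t, s):
    """Follow variable bindings in the substitution s."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return occurs(v, t[1], s) or occurs(v, t[2], s)

def unify(a, b, s):
    """Return a most general unifier extending s, or None if there is none."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    s = unify(a[1], b[1], s)
    return None if s is None else unify(a[2], b[2], s)

def resolve(t, s):
    """Apply the substitution s to t all the way down."""
    t = walk(t, s)
    return t if isinstance(t, str) else ("D", resolve(t[1], s), resolve(t[2], s))

def rename(t, tag):
    return t + tag if isinstance(t, str) else ("D", rename(t[1], tag), rename(t[2], tag))

def canonical(t, names=None):
    """Rename variables to a, b, c, ... in order of first occurrence."""
    names = {} if names is None else names
    if isinstance(t, str):
        if t not in names:
            names[t] = ascii_lowercase[len(names)]
        return names[t]
    return ("D", canonical(t[1], names), canonical(t[2], names))

def condensed_detach(major, minor):
    """Most general z obtainable from major = DxDyz and minor = x."""
    major, minor = rename(major, "_1"), rename(minor, "_2")
    s = unify(("D", "X", ("D", "Y", "Z")), major, {})
    if s is not None:
        s = unify("X", minor, s)
    return None if s is None else canonical(resolve("Z", s))

c3 = parse("DDpDqrDDtDttDDsqDDpsDps")   # clause 3, Nicod's axiom
c5 = condensed_detach(c3, c3)           # hyper(2,a,3,a,b,3,a)
c6 = condensed_detach(c3, c5)           # hyper(2,a,3,a,b,5,a)
c10 = condensed_detach(c6, c3)          # hyper(2,a,6,a,b,3,a)
c12 = condensed_detach(c5, c10)         # hyper(2,a,5,a,b,10,a)
c14 = condensed_detach(c12, c12)        # hyper(2,a,12,a,b,12,a)
print(show(c14))
```

Printing `show(c5)`, `show(c6)`, and so on gives the intermediate lines of a propositional-calculus proof in Polish notation, with variables renamed.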

Doug Spoonwood's answer was very helpful, as was Jean Nicod's paper introducing the Sheffer stroke axiom, but I'm going to accept the former because Nicod doesn't appear interested in StackExchange karma.

I'm going to post my own answer describing what I learned about working with the Sheffer stroke and with this axiom in particular. My personal goal was to prove $Y \mid (Y\mid Y)$, since it seemed like a very simple fact that should be true; below, I prove that, and a few other things, in potentially human-readable form.

How I learned to stop worrying and love the Sheffer stroke

Some nicer notation, partly borrowed from Nicod:

  • $x \mid y/z$ as shorthand for $x \mid (y \mid z)$; in general, $/$ is a version of $\mid$ that is higher in order of operations, to avoid parentheses.
  • $\overline{x}$ as shorthand for $x \mid x$; this can be read as (and is equivalent to, outside the system) "not $x$". Note that from $x$ and $x \mid \overline y$, we can infer $y$.
  • $\pi_x$ as shorthand for $x \mid \overline x = x \mid x/x$. This should be true for all $x$.
  • $[x,y]$ as shorthand for $x/y \mid \overline{y/x}$. This should be true for all $x,y$; from $[x,y]$ and $x\mid y$, we can "swap $x$ and $y$" and infer $y \mid x$.
  • $[x,y,z]$ as shorthand for $(x\mid y/z) \mid \overline{z/y \mid x}$. This should be true for all $x,y,z$; from $[x,y,z]$ and $x \mid y/z$, we can "reflect $x,y,z$" and infer $z/y \mid x$.

(Nicod uses $\pi$ for a specific $\pi_t$; $[x,y]$ and $[x,y,z]$ are all mine, inspired by commutator brackets in a completely different area of math.)
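
The claims that these shorthands "should be true" can be sanity-checked semantically before proving them in the system. The following is a small Python sketch that reads $\mid$ as NAND on truth values; the function names `bar`, `pi`, `swap`, and `reflect` are made up here to mirror $\overline{x}$, $\pi_x$, $[x,y]$, and $[x,y,z]$.

```python
from itertools import product

def nand(a, b):
    """The Sheffer stroke a | b on truth values."""
    return not (a and b)

def bar(a):
    """overline{a} = a | a, which behaves like "not a"."""
    return nand(a, a)

def pi(x):
    """pi_x = x | overline{x}."""
    return nand(x, bar(x))

def swap(x, y):
    """[x,y] = (x | y) | overline{(y | x)}."""
    return nand(nand(x, y), bar(nand(y, x)))

def reflect(x, y, z):
    """[x,y,z] = (x | (y | z)) | overline{((z | y) | x)}."""
    return nand(nand(x, nand(y, z)), bar(nand(nand(z, y), x)))

def always_true(f, arity):
    """Check f on every row of its truth table."""
    return all(f(*row) for row in product([False, True], repeat=arity))

print(always_true(pi, 1), always_true(swap, 2), always_true(reflect, 3))
```

This only shows the three schemas are tautologies; deriving them from the single axiom is the actual work below.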

In this answer, I'll prove $\pi_x$, $[x,y]$, and $[x,y,z]$ for all $x$,$y$, and $z$.

Lemma 1. $[x, \pi_y]$ for all $x,y$.

Proof. The single axiom we've got can be written as $$ (u \mid v/w) \mid \pi_y / ( x/v \mid \overline{u/x}). $$ In particular, if we set $u=v$ in the axiom, we have $$ (u \mid u/w) \mid \pi_y / [x,u]. $$ If we can prove anything of the form $u \mid u/w$, we can deduce $[x,u]$. In particular, setting $u=v=w=x=y$ in the axiom gives us $\pi_y \mid \pi_y / [y,y]$, which has that form; therefore, we have $[x, \pi_y]$ for all $x$ and $y$.
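
Spelled out as explicit detachments, the argument for Lemma 1 might look like the following sketch. Here the axiom is written as $(u \mid v/w) \mid \pi_t / ( x/v \mid \overline{u/x})$, with $t$ in place of the axiom's own $\pi$-subscript; that renaming is introduced only to keep it apart from the $y$ in the conclusion.

$$ \begin{aligned} &(1)\quad \pi_y \mid \pi_y / [y,y] && \text{axiom with } u=v=w=x=t=y \\ &(2)\quad (\pi_y \mid \pi_y / [y,y]) \mid \pi_t / [x,\pi_y] && \text{axiom with } u=v=\pi_y,\ w=[y,y] \\ &(3)\quad [x,\pi_y] && \text{from (2) and (1) by the rule of inference} \end{aligned} $$

Step (3) is the rule "from $A \mid B/C$ and $A$, infer $C$" with $A$ being line (1), $B = \pi_t$, and $C = [x,\pi_y]$.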

Lemma 2. $[x,\pi_y,z]$ for all $x,y,z$.

Proof. Suppose we want to try to infer $[x,y,z] = (x\mid y/z) \mid \overline{z/y \mid x}$ from the axiom. In general, things we infer from the axiom have the form $x/v \mid \overline{u/x}$, which matches $[x,y,z]$ if we have $u = z\mid y$ and $v = y\mid z$. Setting $w = y \mid z$ in the axiom to simplify things, we get $$ ((z\mid y) \mid (y \mid z)/(y \mid z)) \mid \pi_y / ((x\mid y/z) \mid \overline{z/y \mid x}) $$ which simplifies to $$ [z,y] \mid \pi_y / [x,y,z] $$ (where the subscript of $\pi_y$ is the axiom's own variable, independent of the $y$ inside the brackets), so if we prove $[z,y]$, we can infer $[x,y,z]$. In particular, by Lemma 1, we can infer $[x, \pi_y, z]$ for all $x,y,z$.

Theorem. $\pi_y$, $[x,y]$, and $[x,y,z]$ for all $x,y,z$.

Proof. Setting $u=v=w$ in the axiom gives us $\pi_u \mid \pi_y / [x,u]$ which we now "reflect" by Lemma 2 into $[x,u] / \pi_y \mid \pi_u$ and "swap" by Lemma 1 into $\pi_u \mid [x,u] / \pi_y$. So if we prove $\pi_u$ for any $u$, we can use this to infer $\pi_y$ for all $y$.

But we've already proven $[\pi_y, \pi_y]$ (as a special case of Lemma 1) which can be rewritten as $$ [\pi_y, \pi_y] = \pi_y / \pi_y \mid \overline{\pi_y / \pi_y} = \pi_{\pi_y \mid \pi_y}. $$ From $\pi_{\pi_y \mid \pi_y} \mid [x,\pi_y \mid \pi_y] / \pi_y$ and $\pi_{\pi_y \mid \pi_y}$, we infer $\pi_y$ (for all $y$).

Setting $u=w=y$ in our earlier formula $(u \mid u/w) \mid \pi_y / [x,u]$ (which already has $u=v$) gives us $\pi_y \mid \pi_y / [x,y]$, and now that we know $\pi_y$, we infer $[x,y]$ (for all $x,y$).

Now, from the formula $[z,y] \mid \pi_y / [x,y,z]$, we infer $[x,y,z]$ (for all $x,y,z$).