Optimal Strategy for this schoolyard game - (Charge, block, shoot)

Here is a very incomplete answer which might help use make some progress for two player version.

  1. I'm going to ignore super-fire-ball.

  2. I'm going to make an assumption which simplifies things a lot. If the charges are ever imbalanced, the player with more charges can force a win. (I have some serious doubts about this.)

In this case, equilibrium play involves charging whenever neither have a charge and equally randomizing otherwise.

In the first round, it is weakly dominant to charge (this doesn't rely on my assumption. Having fewer charges cannot possibly increase your probability of losing against any strategy).

In the second round,

If 1 blocks and 2 uses fireball, then 1 eventually wins (by assumption).

If 1 blocks and 2 blocks, then the game simply starts over from the same state.

If 1 blocks and 2 charges, then 2 eventually wins (by assumption)

If 1 charges and 2 uses fireball, then 2 wins

If 1 charges and 2 blocks, then 1 eventually wins

If 1 charges and 2 charges, the game continues with both at 2 charges.

If 1 uses fireball and 2 blocks, then 2 eventually wins

If 1 uses fireball and 2 charges, then 1 wins

If 1 uses fireball and 2 uses fireball, then the game returns to the original sate.

Notice that no matter what 1 does, the game either ends in a win for 1, a win for 2, or continues (with each having equal charges). These are the exact same outcomes possible in every round of a best-of-one rock-paper-scissors game. The only equilibrium is to randomize equally over all three options (ignoring super-fire-ball) in every round unless neither player has a charge in which case, both charge.


Let $X_{a,b}$ be the expected payoff of this game for a player with $a$ charges when the other player has $b$ charges, where $+1$ is a win and $-1$ is a loss, and assuming optimal play. We will take the cost of a super fireball to be $M>1$. By symmetry, $X_{a,b}=-X_{b,a}$, and clearly $X_{a,a}=0$ for any $a$. From state $(0,0)$, both players will charge up to state $(1,1)$; similarly, from state $(M,M)$, both players will super-fireball back to $(0,0)$. Next, $X_{M,a}=+1$ for any $a<M$, because the first player will win immediately with a super fireball. Finally, $X_{M-1,0}=+1$, because the first player can safely charge once and then super-fireball, whatever the second player does. All other payoffs and strategies are initially unknown.

Now, we can calculate the equilibrium strategy (and payoff) for each player in state $(a,b)$ as long as we know the payoffs for some of the neighboring states. Specifically, suppose the first player is charging with probability $c$, fireballing with probability $f$, and blocking with probability $b$, where $0 \le c,f,b\le 1$ and $c+f+b=1$, and that the second player's probabilities are $c'$, $f'$, and $b'$. Then the expected payoff for the first player satisfies $$ X_{a,b}=cc'X_{a+1,b+1}+ff'X_{a-1,b-1}+bb'X_{a,b}-cf'+fc'+fb'X_{a-1,b}+bf'X_{a,b-1}+cb'X_{a+1,b}+bc'X_{a,b+1}, $$ so $$ X_{a,b}=\frac{1}{1-bb'}\times\\ \left(cc'X_{a+1,b+1}+ff'X_{a-1,b-1}-cf'+fc'+fb'X_{a-1,b}+bf'X_{a,b-1}+cb'X_{a+1,b}+bc'X_{a,b+1}\right). $$ For this to be an equilibrium strategy, its partial derivative with respect to each probability, subject to the constraints on the probabilities, must be zero. (At the boundaries, when a probability is zero or one, its partial derivative may be nonzero if it has the correct sign.) Things are simplified when the second player has no charges, since then he cannot fireball ($f'=0$), and the first player need not block ($b=0$). In that case $$ X_{a,0}=cc'X_{a+1,1}+fc'+fb'X_{a-1,0}+cb'X_{a+1,0}. $$ Of course, further simplifications occur when the neighboring payoffs are zero.


Note that if $M=2$, then all payoffs are known, and the only unknown strategy is for state $(1,1)$. In that state, fireball beats charge, charge beats block, and block beats fireball... so the game is isomorphic to rock-paper-scissors, and the Nash equilibrium strategy in state $(1,1)$ is to choose each move with equal probability.

Things first get interesting for $M=3$. Here we have unknown payoffs in states $(1,0)$ and $(2,1)$, and unknown equilibrium strategies in states $(1,1)$, $(2,2)$, $(1,0)$, and $(2,1)$. Let's first consider the state $(1,0)$. The payoff for the first player is $$ X_{1,0}=cc'X_{2,1}+fc'+cb'=cc'X_{2,1}+(1-c)c'+c(1-c')=cc'(X_{2,1}-2)+c+c'; $$ setting the derivatives to zero yields $$ c=c'=X_{1,0}=\frac{1}{2-X_{2,1}}. $$ In the state $(2,1)$, the payoff is $$ X_{2,1}=\frac{cc'+ff'X_{1,0}-cf'+fc'+bf'+cb'}{1-bb'}=\frac{(1-f-b)(1-2f')+f(f'X_{1,0}+c')+bf'}{1-b(1-f'-c')}. $$ Setting the derivative with respect to $f$ to zero gives $$ -(1-2f')+(f'X_{1,0}+c')=f'(2+X_{1,0})+c'-1=0\implies f'=\frac{1-c'}{2+X_{1,0}}. $$ Setting the derivative with respect to $c'$ to zero gives $$ 0=\frac{f}{1-b(1-f'-c')}-\frac{b}{1-b(1-f'-c')}X_{2,1}\implies f=bX_{2,1}. $$ Setting the derivative with respect to $b$ to zero gives $$ 0=\frac{3f'-1}{1-b(1-f'-c')}+\frac{1-f'-c'}{1-b(1-f'-c')}X_{2,1}\implies f'(3-X_{2,1}) + (1-c')X_{2,1}-1=0\implies f'=\frac{1-(1-c')X_{2,1}}{3-X_{2,1}}. $$ Finally, setting the derivative with respect to $f'$ to zero gives $$ 0=\frac{-2(1-f-b)+fX_{1,0}+b}{1-b(1-f'-c')}-\frac{b}{1-b(1-f'-c')}X_{2,1}\implies -2+f(2+X_{1,0})+b(3-X_{2,1})=0\implies f=\frac{2-b(3-X_{2,1})}{2+X_{1,0}}. $$ Combining the second and fourth equations, and using the fact that $X_{1,0}=1/(2-X_{2,1})$, gives $$ b=\frac{2(2-X_{2,1})}{6-X_{2,1}^2}; \qquad f = \frac{2X_{2,1}(2-X_{2,1})}{6-X_{2,1}^2}. $$ Similarly, combining the first and third equations gives $$ c'=\frac{1+2X_{2,1}-X_{2,1}^2}{6-X_{2,1}^2}; \qquad f'=\frac{2-X_{2,1}}{6-X_{2,1}^2}. $$ Feeding everything back in to the equation for $X_{2,1}$, we find that $X_{2,1}$ is a root of the cubic equation $x^3-x^2-4x+2=0$; it must be the real root between $0$ and $1$, so numerically $X_{2,1}\approx 0.47068$. What this means is that from the state $(2,1)$, the first player can expect to win only about $3/4$ of the time. The second player can fight back... and should do so by charging $30\%$ of the time, fireballing $26\%$ of the time, and blocking the remaining $44\%$ of the time. Similarly, $X_{1,0}\approx 0.6539$, so the first player can win about $5/6$ of the time from that state; the second player should fight back by charging $65\%$ of the time. The optimal strategies follow from these numbers; the key takeaway, though, is that (1) a charge advantage does not guarantee victory, and (2) the optimal strategy uses probabilities that are non-trivial algebraic numbers in the general case.

Tags:

Game Theory