Prove that $R^{2}$ cannot decrease when adding a variable

I think it's best to use matrix notation to answer this. Suppose the original OLS fit is $$y=X\beta+u \tag{1}$$ where $X$ is an $n\times p$ matrix of regressors, $\beta$ is the $p\times 1$ vector of OLS coefficients and $u$ is the $n\times 1$ vector of residuals. Now we add a new variable, an $n\times 1$ column $X_0$, and perform a new OLS fit $$y=X_0\hat{\beta}_0+X\hat{\beta}+v \tag{2}$$ where $v$ is the new $n\times 1$ residual vector. Since $\beta$, $\hat{\beta}_0$ and $\hat{\beta}$ are all OLS estimates, the residuals of each fit are orthogonal to the regressors used in that fit: $u^TX=v^TX_0=v^TX=0$.
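Here is a quick numerical sanity check of those orthogonality conditions, as a minimal sketch on simulated data (`numpy` only; the names `X`, `X0`, `y` are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))    # original regressors
X0 = rng.normal(size=(n, 1))   # the variable to be added
y = rng.normal(size=n)

# Fit (1): regress y on X alone.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ beta               # residuals of the small model

# Fit (2): regress y on [X0, X].
Z = np.hstack([X0, X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
v = y - Z @ coef               # residuals of the large model

# OLS residuals are orthogonal to the regressors used in each fit.
print(np.allclose(u @ X, 0))   # u^T X  = 0
print(np.allclose(v @ Z, 0))   # v^T X0 = 0 and v^T X = 0
```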

Combine $(1)$ and $(2)$ to get $$X\beta+u=X_0\hat{\beta}_0+X\hat{\beta}+v \tag{3}.$$ If we multiply both sides of $(3)$ by $u^T$ and use $u^TX=0$, we have $$u^Tu=u^TX_0\hat{\beta}_0+u^Tv.$$ Similarly, multiplying by $v^T$ and using $v^TX=v^TX_0=0$ gives $$v^Tu=v^Tv.$$ Since $u^Tv$ is a scalar, $u^Tv=v^Tu=v^Tv$, and substituting into the first equation tells us that $$u^TX_0\hat{\beta}_0=u^Tu-v^Tv \tag{4}.$$
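If you want to see identity $(4)$ hold on data, the same kind of simulation works (again only a sketch with made-up names):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 3
X, X0 = rng.normal(size=(n, p)), rng.normal(size=(n, 1))
y = rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ beta
Z = np.hstack([X0, X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
v = y - Z @ coef
beta0_hat = coef[0]                  # coefficient on the added variable

lhs = (u @ X0.ravel()) * beta0_hat   # u^T X0 beta0_hat
rhs = u @ u - v @ v                  # difference of the two SSEs
print(np.isclose(lhs, rhs))          # True, up to rounding
```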

Finally,

$$\begin{align*}
v^Tv&=(y-X_0\hat{\beta}_0-X\hat{\beta})^T(y-X_0\hat{\beta}_0-X\hat{\beta}) \\
&=(X\beta+u-X_0\hat{\beta}_0-X\hat{\beta})^T(X\beta+u-X_0\hat{\beta}_0-X\hat{\beta})\qquad\text{from $(1)$} \\
&=(X(\beta-\hat{\beta})-X_0\hat{\beta}_0+u)^T(X(\beta-\hat{\beta})-X_0\hat{\beta}_0+u) \\
&=(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)^T(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)+2u^T(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)+u^Tu \\
&=(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)^T(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)-2u^TX_0\hat{\beta}_0+u^Tu\qquad\text{since $u^TX=0$} \\
&=(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)^T(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)-2(u^Tu-v^Tv)+u^Tu\qquad\text{using $(4)$} \\
&=(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)^T(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)-u^Tu+2v^Tv.
\end{align*}$$

Rearranging gives

$$u^Tu=(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)^T(X(\beta-\hat{\beta})-X_0\hat{\beta}_0)+v^Tv\geq v^Tv,$$

where the inequality holds because the first term is a squared norm, $w^Tw\geq 0$ for any vector $w$.
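The decomposition and the resulting inequality can also be checked numerically; another minimal sketch under the same simulated setup:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 3
X, X0 = rng.normal(size=(n, p)), rng.normal(size=(n, 1))
y = rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ beta
Z = np.hstack([X0, X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
v = y - Z @ coef
beta0_hat, beta_hat = coef[0], coef[1:]

# w = X(beta - beta_hat) - X0*beta0_hat, the vector whose squared
# norm makes up the gap between the two SSEs.
w = X @ (beta - beta_hat) - X0.ravel() * beta0_hat
print(np.isclose(u @ u, w @ w + v @ v))  # the exact decomposition
print(u @ u >= v @ v)                    # so the SSE cannot rise
```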

This says that the new SSE is no greater than the original SSE. Since $R^2=1-\mathrm{SSE}/\mathrm{SST}$ and the total sum of squares $\mathrm{SST}$ depends only on $y$, so it is the same for both regressions, it follows that $R^2$ must increase (weakly).
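And the $R^2$ comparison itself, including an intercept column so that $R^2=1-\mathrm{SSE}/\mathrm{SST}$ is the usual centered version (a sketch; `r_squared` is just a throwaway helper, not a library function):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
intercept = np.ones((n, 1))
X = np.hstack([intercept, rng.normal(size=(n, 2))])
X0 = rng.normal(size=(n, 1))
y = rng.normal(size=n)

def r_squared(design, y):
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

r_small = r_squared(X, y)
r_large = r_squared(np.hstack([X0, X]), y)
print(r_small, r_large)
print(r_large >= r_small)    # True on any data set
```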