Why is $\textbf{S}_1\otimes \textbf{S}_2 = \sum_{i = x,y,z}S_{1i}\otimes S_{2i}$?

You're running into a common issue people have with vector operators. That is, the vector operator $\mathbf{S}$ cannot be squared in the usual way an operator can.

Recall that an operator is a linear map from a vector space to itself, $A: V \to V$. By contrast, $\mathbf{S}$ is a vector of operators. It acts on a single ket to produce a vector of kets, $$\mathbf{S}: V \to V \otimes \mathbb{C}^3, \quad \mathbf{S} | \psi \rangle \equiv \begin{pmatrix} S_x |\psi \rangle \\ S_y |\psi \rangle \\ S_z |\psi \rangle \end{pmatrix}.$$ Since the domain and codomain aren't the same, $\mathbf{S}$ can't be squared by just applying it twice. Instead, the notation "$S^2$" is defined in terms of the usual dot product, $$S^2 \equiv S_x^2 + S_y^2 + S_z^2.$$ Because of this, the question is moot. The dot product isn't actually related to the tensor product at all; it's put in by the definition of the square of a vector operator. In fact, it doesn't even make sense to consider the object $\mathbf{S}_1 \otimes \mathbf{S}_2$ for the calculation you're trying to do. That object would not even be a vector operator; it would be a rank 2 tensor operator, being the tensor product of two vectors.
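To make the distinction concrete, here is a minimal numerical sketch for a single spin-1/2. The choice of spin-1/2, the units $\hbar = 1$, and the variable names are assumptions made for illustration only. It shows that $\mathbf{S}|\psi\rangle$ is a stack of three kets, while $S^2$ is an ordinary operator built from the sum of squared components.

```python
import numpy as np

# Spin-1/2 component operators, in units where hbar = 1 (an assumption).
sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

# The "vector operator" is a stack of three operators, shape (3, 2, 2).
S = np.stack([sx, sy, sz])

# An example ket: spin up along z.
psi = np.array([1, 0], dtype=complex)

# S acting on one ket gives three kets, one per component: shape (3, 2).
S_psi = S @ psi

# "S^2" is defined through the dot product: S_x^2 + S_y^2 + S_z^2.
# The result is an ordinary 2x2 operator.
S_squared = sum(s @ s for s in S)

print(S_psi.shape)      # shape of the vector of kets
print(S_squared.real)   # s(s+1) times the identity for s = 1/2
```

For spin-1/2 each squared component is $\tfrac14$ times the identity, so `S_squared` comes out as $\tfrac34$ times the identity, matching $s(s+1)$ with $s = \tfrac12$.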

Unfortunately, textbooks generally aren't very clear about this, because it's tempting to just stick to the clean notation and avoid mentioning these details.

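Here the notation $S^2$ does not mean you apply the vector operator $\mathbf{S}$ twice. Literally, one has by definition $$ {S}^2\equiv S_xS_x+S_yS_y+S_zS_z\, . $$ Having ascertained this, replace $$ S_x=S_{1x}\otimes 1 + 1\otimes S_{2x} $$ etc. Applying $S_x$ twice yields \begin{align} S_xS_x&= (S_{1x}S_{1x})\otimes 1 +2 S_{1x}\otimes S_{2x} + 1\otimes (S_{2x}S_{2x})\, . \end{align} The factor of 2 appears because $S_{1x}\otimes 1$ and $1\otimes S_{2x}$ commute, and their product in either order is $S_{1x}\otimes S_{2x}$. Redoing this for all the components and summing yields $$ S^2= S_{1}^2\otimes 1 + 1\otimes S_{2}^2 +2\sum_{i}S_{1i}\otimes S_{2i}\, . $$ The last sum is what is usually abbreviated as $\mathbf{S}_1\cdot\mathbf{S}_2$, with the tensor products left implicit.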
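The identity can also be checked numerically. The sketch below assumes two spin-1/2 particles and units where $\hbar = 1$, neither of which is specified in the question, and uses `np.kron` as the matrix representation of $\otimes$. The variable names are illustrative.

```python
import numpy as np

# Spin-1/2 component operators, in units where hbar = 1 (an assumption).
sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
components = [sx, sy, sz]
identity = np.eye(2)

# Total spin components: S_i = S_{1i} (x) 1 + 1 (x) S_{2i}.
total = [np.kron(s, identity) + np.kron(identity, s) for s in components]

# Left-hand side: S^2 = S_x S_x + S_y S_y + S_z S_z on the 4-dim space.
lhs = sum(S_i @ S_i for S_i in total)

# Right-hand side: S_1^2 (x) 1 + 1 (x) S_2^2 + 2 sum_i S_{1i} (x) S_{2i}.
single_sq = sum(s @ s for s in components)
cross = sum(np.kron(s, s) for s in components)
rhs = np.kron(single_sq, identity) + np.kron(identity, single_sq) + 2 * cross

print(np.allclose(lhs, rhs))  # True
```

Here `cross` is the matrix of $\sum_i S_{1i}\otimes S_{2i}$, which is a single operator on the four-dimensional product space rather than a tensor product of two vectors.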