Why don't we see the covariant derivative in classical mechanics?

You don't see the covariant derivative as often because flat space has isometries that make Cartesian coordinates the natural choice, and in these coordinates all the Christoffel symbols vanish, so we use them as much as possible. But look at the formula for the divergence of a vector field $\mathbf{F} = F^{\hat{r}} \hat{\mathbf{r}} + F^{\hat{\theta}} \hat{\boldsymbol{\theta}}$ in polar coordinates:

$$\nabla \cdot \mathbf{F} = \frac{1}{r} \frac{\partial(r F^{\hat{r}})}{\partial r} + \frac{1}{r} \frac{\partial F^{\hat{\theta}}}{\partial \theta} = \frac{\partial F^{\hat{r}}}{\partial r} + \frac{1}{r} F^{\hat{r}} + \frac{1}{r} \frac{\partial F^{\hat{\theta}}}{\partial \theta}.$$

The middle term, $F^{\hat{r}}/r$, contains no derivative of $\mathbf{F}$ at all: it comes from the Christoffel symbols! So the covariant derivative is definitely there, but instead of using the Christoffel symbols, we usually calculate it using the chain rule and the fact that the Cartesian basis vectors have zero derivative. The Christoffel symbols are, after all, just the components of the derivatives of the basis vectors, so the two methods are not that different.
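As a quick numerical sanity check of this claim, here is a minimal sketch that computes the divergence both ways: once in Cartesian components (plain partial derivatives, no Christoffel symbols) and once with the polar formula above. The test field $F^{\hat{r}} = r^2\cos\theta$, $F^{\hat{\theta}} = r\sin\theta$, the sample point, and the function names are arbitrary choices of mine for illustration, and the derivatives are simple central differences.

```python
import math

def polar_components(r, theta):
    """Hypothetical test field: orthonormal polar components (F_rhat, F_thetahat)."""
    return r**2 * math.cos(theta), r * math.sin(theta)

def cartesian_field(x, y):
    """The same field expressed in the constant Cartesian basis."""
    r, theta = math.hypot(x, y), math.atan2(y, x)
    f_r, f_t = polar_components(r, theta)
    fx = f_r * math.cos(theta) - f_t * math.sin(theta)
    fy = f_r * math.sin(theta) + f_t * math.cos(theta)
    return fx, fy

def divergence_cartesian(x, y, h=1e-5):
    """dFx/dx + dFy/dy by central differences; no Christoffel symbols needed."""
    dfx = (cartesian_field(x + h, y)[0] - cartesian_field(x - h, y)[0]) / (2 * h)
    dfy = (cartesian_field(x, y + h)[1] - cartesian_field(x, y - h)[1]) / (2 * h)
    return dfx + dfy

def divergence_polar(r, theta, h=1e-5):
    """Polar formula, including the derivative-free F_rhat / r term."""
    d_r = (polar_components(r + h, theta)[0] - polar_components(r - h, theta)[0]) / (2 * h)
    d_t = (polar_components(r, theta + h)[1] - polar_components(r, theta - h)[1]) / (2 * h)
    return d_r + polar_components(r, theta)[0] / r + d_t / r

# Arbitrary sample point away from the origin and from the atan2 branch cut.
r0, theta0 = 1.7, 0.6
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
print(divergence_cartesian(x0, y0), divergence_polar(r0, theta0))
```

The two printed values should agree up to finite-difference error. If you drop the `polar_components(r, theta)[0] / r` term from `divergence_polar`, they no longer match, which is the Christoffel contribution showing up numerically.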

One final comment: the orthonormal basis vectors $\{\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}\}$ in polar coordinates are not the coordinate basis vectors $\{\partial/\partial r, \partial/\partial\theta\}$ we know from differential geometry, because the latter are not orthonormal ($\partial/\partial\theta$ has length $r$). The relation is simple:

$$\begin{aligned} \hat{\mathbf{r}} &= \frac{\partial}{\partial r}, \\ \hat{\boldsymbol{\theta}} &= \frac{1}{r} \frac{\partial}{\partial\theta}, \end{aligned}$$

so keep this in mind when applying the formulas. In differential geometry we tend to write the components of vectors with respect to the coordinate basis, but the formulas we know from more basic calculus (like the divergence formula above) are written in terms of the orthonormal basis.
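To see where that middle term comes from, here is a sketch of the same divergence written in coordinate-basis components $F^r$, $F^\theta$, so that $F^r = F^{\hat{r}}$ and $F^\theta = F^{\hat{\theta}}/r$ by the relation above. It assumes the standard polar metric $ds^2 = dr^2 + r^2 d\theta^2$, whose only nonzero Christoffel symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$. The index notation is my addition:

$$\begin{aligned} \nabla_\mu F^\mu &= \partial_\mu F^\mu + \Gamma^\mu_{\mu\nu} F^\nu \\ &= \frac{\partial F^r}{\partial r} + \frac{\partial F^\theta}{\partial\theta} + \Gamma^\theta_{\theta r} F^r \\ &= \frac{\partial F^{\hat{r}}}{\partial r} + \frac{1}{r} \frac{\partial F^{\hat{\theta}}}{\partial\theta} + \frac{1}{r} F^{\hat{r}}. \end{aligned}$$

The last line is the calculus formula again, with the derivative-free term traced back to $\Gamma^\theta_{\theta r} = 1/r$.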

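The Christoffel symbols are the coordinate components of the connection (here, the connection determined by the metric). They appear when you take the derivative of a vector field, because the basis vectors themselves may rotate or stretch from point to point, and the derivative has to account for that.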

To determine whether a manifold is intrinsically curved, you need to calculate the Riemann curvature tensor. Extrinsic curvature, by contrast, depends on how the manifold is embedded in a larger space.

For instance, for Euclidean and Minkowski spaces, the Riemann curvature tensor is zero, since both of those spaces are intrinsically flat (or just flat).

However, it is possible to use curvilinear coordinates on a flat space, or to embed an extrinsically curved but intrinsically flat surface (such as a cylinder) in it. In either case one or more Christoffel symbols may be nonzero, but the Riemann tensor will still be zero.
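Polar coordinates on the flat plane give a concrete instance of nonzero Christoffel symbols with vanishing curvature. As a sketch, assume the metric $ds^2 = dr^2 + r^2 d\theta^2$ and the convention $R^\rho{}_{\sigma\mu\nu} = \partial_\mu \Gamma^\rho_{\nu\sigma} - \partial_\nu \Gamma^\rho_{\mu\sigma} + \Gamma^\rho_{\mu\lambda} \Gamma^\lambda_{\nu\sigma} - \Gamma^\rho_{\nu\lambda} \Gamma^\lambda_{\mu\sigma}$ (sign conventions differ between textbooks). The nonzero symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, yet the single independent component of the Riemann tensor in two dimensions vanishes:

$$R^r{}_{\theta r \theta} = \partial_r \Gamma^r_{\theta\theta} - \partial_\theta \Gamma^r_{r\theta} + \Gamma^r_{r\lambda} \Gamma^\lambda_{\theta\theta} - \Gamma^r_{\theta\lambda} \Gamma^\lambda_{r\theta} = -1 - 0 + 0 - (-r) \cdot \frac{1}{r} = 0.$$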

The magic of semi-Riemannian manifolds is that there is a unique torsion-free connection compatible with the metric, known as the Levi-Civita connection.

Another point to consider is that in Hamiltonian mechanics the symplectic structure is independent of any metric. In the regular, non-degenerate case, this structure may be pulled back (via the Legendre transformation) to the tangent bundle, which is the domain of the Lagrangian formulation.

Therefore, you need not start from the covariant derivative in classical mechanics; you may instead work with this more general, abstract description.