How to Motivate Open-Cover Formulation of Compactness in a Metric Space?

If you start with the notion of sequences having convergent subsequences and call that compactness then you'll never find the open-cover definition (realistically) because you've not started out with intuition: you've just applied a label to a concept you've come up.

Let's go back a bit further: what are we trying to convey when we say compact? We're trying to explain that what we're looking at is somehow all together in one place, not too spread out, that any point of what we're looking at is "not too far" from any other.

Ok, so how do we make that more mathematical? We could try considering the distances between points... but that requires a metric and we know that a general set doesn't have to have that. In fact, when we think about general sets we run into the standard problem: there's not much structure there to work with. Typically we have open sets and neighbourhoods and... well, that's about it.

But that's actually all we need! We have counting measure available to us and that gives us a way to describe how spread out (or not) our set is: we see if we can cover our set with finitely many open sets. If we can never do that, then we can't possibly be compact: our set must be spread out quite significantly. If we can do it sometimes but not others... that's probably not compact then, as it shouldn't really depend on how we're choosing our sets. But if every time we cover our set we can find a finite set of neighbourhoods that still covers it, we can call that compact.

This way of looking at it already alerts you to the idea that sequential compactness might not always be good: we quickly see that these sequences might run off arbitrarily far in any direction while having convergent subsequences, and that those might be quite messy (we might start thinking about Besicovitch sets and how strange those can be).

Note that compact doesn't have to mean small and that some non-compact sets, with this definition, can be enclosed in compact sets (consider your favourite bounded non-compact set and then any origin-centred closed call that contains it).