How do we know a quantum state isn't just an unknown classical state?

Quantum mechanics was developed to match experimental data. The seemingly very weird idea that some observables have no definite value before they are measured is not something physicists have actively promoted; it is something that theoretical considerations, followed by many actual experiments, have forced them to accept.

I don't think there is an intuitive explanation for this. It is closely linked to the notion of superposition. The basic idea is that we do indirectly observe the effects of interference between superposed quantum states, but upon actual measurement we never see superposed states, only classical, definite values. If we suppose these values were there all along, then why would we see any interference? The whole framework of QM would be pointless.

In other words, a quantum state is what it is (whatever that is) precisely because it stands in contrast to a classical state: crucially, it only describes a probability distribution for the values of observables, not actual, permanent values for these observables.

A wavefunction that was always collapsed would just be a classical state. Why a measurement "collapses" anything at all (and whether it really does) is an open question: the measurement problem.

Imagine the following set of experimental data:

Every day, you decide whether to put on your sunglasses before looking at the sky to check the weather. Every day I, a thousand miles away, do the same thing.

After we've made our observations, we call each other on the phone to compare. We discover that on days when we've both looked without sunglasses, we always see the same thing (sometimes sunny, sometimes cloudy). On days when one of us wears sunglasses and the other doesn't, we still always see the same thing. But on days when we both wear sunglasses, it is invariably the case that one of us sees a sunny sky and the other sees a cloudy sky.

Now suppose every day, one of four things is true: Either the sky above your house is sunny (and looks sunny with or without sunglasses), or it's cloudy (and looks cloudy with or without sunglasses), or it's in a condition that looks sunny without sunglasses but cloudy with them, or it's in a condition that looks sunny with sunglasses but cloudy without. Likewise for the sky above my house. And suppose each sky is unambiguously in one of these states before we look at it.

Question: What pattern could account for the experimental data? Answer: None. If your sky and my sky are always either both sunny or both cloudy, that accounts for what we see in three of the four sunglasses combinations but can't account for what we see when we both wear sunglasses. If there's some much more complicated pattern (e.g. 8% of the time our skies are both sunny, 7% of the time they're both cloudy, 19% of the time yours is sunny while mine is in the state that looks sunny only through sunglasses, etc.), you still won't be able to account for the experimental data. It's not hard to prove that no matter what percentages you assign to the sixteen possible pairs of states, the experimental data just don't fit your predictions: the observed patterns hold every single time, so every pair of states with nonzero probability would have to satisfy all four of them at once, and none of the sixteen does.
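The "not hard to prove" step can be checked by brute force. The following is only a sketch of that check, with the encoding being my own assumption: each sky's hidden state is written as a pair (look without sunglasses, look with sunglasses), and the names `seen` and `fits_data` are made up for the illustration.

```python
from itertools import product

# Hidden state of one sky: (look without sunglasses, look with sunglasses).
LOOKS = ("sunny", "cloudy")
SKY_STATES = list(product(LOOKS, repeat=2))  # the four states described above


def seen(state, wears_sunglasses):
    """What an observer sees, given the sky's hidden state and their choice."""
    return state[1] if wears_sunglasses else state[0]


def fits_data(your_sky, my_sky):
    """True if this pair of hidden states reproduces all four reported patterns."""
    for you_wear, i_wear in product((False, True), repeat=2):
        same = seen(your_sky, you_wear) == seen(my_sky, i_wear)
        # The data: we disagree exactly when both of us wear sunglasses.
        should_differ = you_wear and i_wear
        if same == should_differ:
            return False
    return True


pairs = list(product(SKY_STATES, repeat=2))  # sixteen pairs of hidden states
consistent_pairs = [pair for pair in pairs if fits_data(*pair)]
print(len(pairs), len(consistent_pairs))
```

Since the reported patterns hold on every single day, any assignment of percentages would have to put all of its weight on pairs in `consistent_pairs`, and that list comes out empty.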

Conclusion: You can't use ordinary probability theory to explain the weather.

Now in real life we don't have this problem with weather, because we never see the kind of experimental data I supposed in the first place. But in quantum mechanics, we do see such data (not exactly as I've supposed here, but close enough that the same issue arises). Therefore you can't use ordinary probability theory, in the sense you're trying to use it, to explain the observed facts.

The precise answer is contained within the Kochen-Specker theorem and Bell's theorem. (I know it's awkward that one of them has the form "the [name] theorem" and the other has the form "[name]'s theorem". That's a long-standing inconsistency in English math and physics usage.)

The key point is the fact that you can measure in different bases. If you have a fixed state $|\psi\rangle$ (whose time-evolution you neglect), and you agree to always measure in the same fixed orthonormal basis $\{|i\rangle \}$ (e.g. the position basis), then the probability distribution $\left \{ p_i = |\langle i | \psi \rangle|^2 \right \}$ is completely classical, and could absolutely simply reflect that the system was in an unknown but definite state before the measurement.

But it turns out that there's no single classical probability distribution (which could simply reflect uncertainty in the system's definite pre-measurement state) that simultaneously reproduces the Born statistics in every basis.

So if you try to understand what's so weird about quantum mechanics while only considering measurements in a single basis, then you'll fail, because the quantum mechanics of a single state measured in a single basis really is just classical probability theory. To see what's really going on, you need to consider measuring in different bases (or equivalently, allowing yourself to apply a non-diagonal unitary operator to the state before measuring it).
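As a small numerical illustration of the single-basis versus two-basis point, here is a sketch for one qubit. The choice of state ($|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$), the two bases, and the helper names `born_probs` and `mixture_probs` are my own assumptions for the example. It compares the superposition with the most naive classical candidate, namely "the system was secretly in $|0\rangle$ or $|1\rangle$ with probability 1/2 each".

```python
import math


def born_probs(state, basis):
    """Born probabilities |<b|state>|^2 for each vector b of an orthonormal basis."""
    return [
        abs(sum(b_k.conjugate() * s_k for b_k, s_k in zip(b, state))) ** 2
        for b in basis
    ]


def mixture_probs(weights, states, basis):
    """Outcome probabilities when the system is in one definite but unknown state."""
    per_state = [born_probs(s, basis) for s in states]
    return [
        sum(w * probs[i] for w, probs in zip(weights, per_state))
        for i in range(len(basis))
    ]


r = 1 / math.sqrt(2)
ket0, ket1 = [1, 0], [0, 1]
plus, minus = [r, r], [r, -r]

z_basis = [ket0, ket1]   # the "fixed basis" of the text
x_basis = [plus, minus]  # a second basis, related by a non-diagonal unitary

# Quantum superposition |+>
quantum_z = born_probs(plus, z_basis)
quantum_x = born_probs(plus, x_basis)

# Classical ignorance: secretly |0> or |1>, each with probability 1/2
classical_z = mixture_probs([0.5, 0.5], [ket0, ket1], z_basis)
classical_x = mixture_probs([0.5, 0.5], [ket0, ket1], x_basis)

print("first basis: ", [round(p, 3) for p in quantum_z], [round(p, 3) for p in classical_z])
print("second basis:", [round(p, 3) for p in quantum_x], [round(p, 3) for p in classical_x])
```

In the first basis the two are indistinguishable (both 50/50), but in the second basis the superposition always gives the first outcome while the mixture stays at 50/50. This only rules out that one naive candidate; the general statements are the theorems mentioned above, not this check.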