'Intuitive' difference between Markov Property and Strong Markov Property

A stochastic process has the Markov property if the probabilistic behaviour of the chain in the future depends only on its present value and disregards its past behaviour.

The strong Markov property is based on the same concept, except that the time, say $T$, that "the present" refers to is a random quantity with some special properties.

$T$ is called a stopping time: it is a random variable taking values in $\{0,1,2,\ldots\}$ such that whether $T=n$ can be determined completely from the values of the chain, $X_0,X_1,\ldots ,X_n$, up to time $n$.

A very simple example: toss a coin repeatedly and stop at the first head, so $T$ is the time of that first head. Whether $T=n$ is completely determined by the outcomes of the first $n$ tosses. Of course, $T$ is random.
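To make this concrete, here is a minimal Python sketch (the function name and parameters are mine, purely for illustration): it simulates the coin tosses and returns $T$, and the loop only ever inspects tosses up to the current time, which is the defining feature of a stopping time.

```python
import random

def first_head_time(p=0.5, max_tosses=10_000):
    """Toss a p-coin until the first head; return that time T.

    Whether T == n is decided by tosses 1..n alone; no toss after
    time n is ever inspected, which is what makes T a stopping time.
    """
    for n in range(1, max_tosses + 1):
        if random.random() < p:   # heads
            return n
    return None                   # no head within the horizon (vanishingly unlikely)

print([first_head_time() for _ in range(10)])   # ten independent draws of T
```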

The strong Markov property goes as follows: if $T$ is a stopping time, then for $m\geq 1$,

$$P(X_{T+m}=j\mid X_k=x_k,\;0\leq k <T;\;X_T=i)=P(X_{T+m}=j\mid X_T=i)$$

So, conditionally on $X_T=i$, the chain again discards whatever happened prior to time $T$.
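One can check this empirically. In the sketch below (the chain and its transition matrix are made up for illustration), $T$ is the first hitting time of a state $i$, which is a stopping time, and the empirical law of $X_{T+m}$ is compared with the law of a chain simply started at $i$ and run for $m$ steps; the two should agree.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small illustrative transition matrix (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

def step(state):
    return rng.choice(3, p=P[state])

def run_from(state, m):
    for _ in range(m):
        state = step(state)
    return state

i, m, trials = 2, 3, 20_000      # target state, lag, repetitions

# T = first hitting time of state i: a stopping time.
hits = []
for _ in range(trials):
    x = 0
    while x != i:                 # run until the chain first hits i (time T)
        x = step(x)
    hits.append(run_from(i, m))   # observe X_{T+m}

fresh = [run_from(i, m) for _ in range(trials)]   # chain started afresh at i

print(np.bincount(hits, minlength=3) / trials)    # ~ P(X_{T+m}=j | X_T=i)
print(np.bincount(fresh, minlength=3) / trials)   # ~ m-step law from state i
```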

In order to determine the (unconditional) probabilistic behaviour of a (homogeneous) Markov chain at time $n$, one needs to know the one-step transition matrix and the marginal behaviour of $X$ at a previous time point, call it $t=0$ without loss of generality. That is, one should know $P(X_1=j\mid X_0=i)$ and the distribution of $X_0$.
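In matrix form, the marginal law at time $n$ is the row vector $\pi_n = \pi_0 P^n$, where $\pi_0$ is the law of $X_0$ and $P$ the one-step transition matrix. A small numpy sketch (the matrix and the initial law are again made up):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],   # one-step transition matrix
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
pi0 = np.array([1.0, 0.0, 0.0])  # marginal law of X_0

n = 5
pi_n = pi0 @ np.linalg.matrix_power(P, n)   # law of X_n: pi_0 P^n
print(pi_n, pi_n.sum())                     # a probability vector, sums to 1
```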


Here's an intuitive explanation of the strong Markov property, without the formalism:

If you define a random time describing some aspect of a Markov chain, it is possible that your definition encodes information about the future of the chain over and above that specified by the transition matrix and previous values. That is, looking into the future is necessary to determine whether your random variable's definition is being met. The random times at which the strong Markov property holds, the stopping times, are those which don't do this.

For example, if you have a random walk on the integers with a bias towards taking positive steps, you can define a random variable as the last time a given integer is ever visited by the chain. This encodes information about the future over and above that given by the previous values of the chain and the transition probabilities, namely that the chain never returns to the integer in the definition. The strong Markov property does not hold at such a random time.
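Here is a rough simulation of that failure (all parameters and names are mine, and the horizon is finite, so the "last visit" is only approximate). If $L$ is the last visit of the biased walk to $0$, the step taken immediately after $L$ must be $+1$: had the walk stepped down, the upward drift would bring it back to $0$ almost surely, contradicting that $L$ was the last visit. So the one-step behaviour after $L$ differs from the transition law, which is exactly the point.

```python
import random

p, N, trials = 0.7, 2000, 5000   # up-step prob, horizon, repetitions
after_L = []   # step taken right after the last visit to 0
after_T = []   # step taken right after time 0 (a trivial stopping time)

for _ in range(trials):
    pos, path = 0, [0]
    for _ in range(N):
        pos += 1 if random.random() < p else -1
        path.append(pos)
    L = max(k for k, x in enumerate(path) if x == 0)  # last visit within horizon
    if L < N:                       # need one more step to observe
        after_L.append(path[L + 1] - path[L])
    after_T.append(path[1] - path[0])

print(sum(s == 1 for s in after_L) / len(after_L))   # ~ 1, not p
print(sum(s == 1 for s in after_T) / len(after_T))   # ~ p = 0.7
```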

So with regards to your two questions:

1) The strong Markov property states that after the chain is observed at a random time whose definition does not encode information about the future (a stopping time), the chain effectively restarts at the observed state. (I'm not sure if this was exactly what you were getting at.)

2) If you define a random time for which the strong Markov property does not hold, and you know its value, you could use this knowledge to get a better estimate of future state visitations than if you relied on the transition matrix and current observations alone. The chain is still governed by its transition matrix, and you don't need more than one matrix to describe it, but you would have more information than if your random time were a stopping time.