Does DTIME(O(n)) = REGULAR?

For the benefit of the readers, here is an outline of the proof that an $o(n\log n)$-time TM accepts a regular language.

Consider the computation of the TM on some input $x$. Write $x$ as follows: $$ |x_0|x_1|\cdots|x_n|. $$ Each $|$ signifies a boundary between two cells. The crossing number at a boundary on $x$ is simply the number of times the machine crosses the boundary. The crossing sequence is the sequence of states that the TM enters immediately after crossing the boundary (first from left to right, then from right to left, and so on).

Claim (Kobayashi '85): If a TM runs in time $o(n\log n)$, then the crossing number at any boundary on any input is $O(1)$.

Proof idea: Suppose that on some input, the crossing number at some boundary $X$ is large (at least $A$). Pick a smallest such input. Since the machine runs in time $o(n\log n)$, there must be many boundaries (at least $B(n)$) at which the crossing number is small (at most $c(n)$). Since $B(n)$ is so big compared to $c(n)$, by the pigeonhole principle there must be four boundaries with the same crossing sequence, out of which there are two on the same side of $X$. It follows that if we "excise" the substring bounded by these boundaries, we get a smaller string with a boundary with crossing number at least $A$. Q.E.D.

A computation is of course needed to show that there is an appropriate choice of $A,B(n),c(n)$ whenever the time is $o(n\log n)$. That's left for you to verify.

Claim (Hennie '65): If the crossing number of a TM is $O(1)$ for any input and any boundary, then the language it accepts is regular.

Proof idea: For strings $x,y$, define $x \equiv y$ if for any $z$, whenever the machine accepts $xz$ then it accepts $yz$, and vice versa. We show that $\equiv$ has finitely many equivalence classes, so the claim follows from Myhill-Nerode.

For a string $x$, let $S(x)$ denote the set of crossing sequences at the boundary $x|z$ for any $z$ such that the TM accepts $xz$. Suppose that $S(x) = S(y)$. We will show that $x \equiv y$. Indeed, take any $z$ such that the TM accepts $xz$. Since $S(x) \subseteq S(y)$, there is $w$ such that the TM accepts $yw$, and the crossing sequences at $x|z$ and $y|w$ are the same. This implies (after some thought) that indeed $yz$ is accepted by the TM. Q.E.D.


Single-tape $o(n \log n)$ Turing machines can recognize only regular languages. See http://arxiv.org/abs/cs/0310046 for reference. I recall seeing the proof for $O(n)$ only (can't remember the source now), and it was definitely not simple.


When you get below general polynomial time, you need to be careful which computation model we're working with -- a RAM can achieve much more in $O(n)$ time than a Turing machine can.

Anyway, with a "realistic" model, DTIME($O(n)$) can recognize much more than regular languages, such as $\{a^nb^n\mid n\in\mathbb N\}$ on a counter machine.

More generally, it is easy to see that a Turing machine with a read-only input tape and a separate scratch tape can recognize at least every LR($k$) context-free language in linear time -- and more than that, such as $\{a^nb^nc^n\mid n\in\mathbb N\}$ which is not context-free.