6. Turing Machines

6.1 Introduction

Even though pushdown automata are equipped with a simple logical memory, we have seen that there are many elementary languages that they cannot recognize. Put another way: Although automata and pushdown automata are well-suited to perform specific calculations, they are not general enough to keep track of their own internal movements. For that a better memory model is required.

Alan Turing, a British mathematician better known for his contribution to decoding the German Nazi ENIGMA machine and for his philosophic thesis, first appearing in the prestigious British philosophy journal Mind, about the machine nature of thinking6.1, proposed a model of effective computation in 1936 that still defines the limits of electronic computing machines. The naive 19th Century belief that all mathematical theory could be reduced to a set of mechanical processes, or algorithms, was dealt a crippling blow by the Czech logician Kurt Gödel in 1928 and a knockout punch by Alan Turing 8 years later. Thus, it is encumbant upon the student of information to learn what sorts of problems are amenable to mechanical processes and which are not.

In his effort to formalize mathematics with mechanical rules to solve a famous mathematical problem propounded by the German mathematician David Hilbert, Turing developed an abstract model of a computer. A Turing machine is basically a super typewriter6.2 with a sophisticated set of internal states (not just ``upper case'' and ``lower case''). Just as a typewriter is capable of moving about a page a supertypewriter moves back and forth on a tape. It turns out that the added dimension of movement is extraneous and that the tape model captures the essence of randomly seeking a position to posit a mark, or write upon the page. In this way the typewriter is able to store information on a random-access basis. Abstracting from the page structure, the super typewriter writes upon an unlimited one-dimensional tape.

Turing's super typewriter could carry out one additional function that was very restrictive for mechanical typewriters: it could read as well as write. Thus it is a kind of super automaton, i.e. one that can read as well as write onto its tape. Because it can write, e.g. erase symbols, its tape head is allowed to move freely in both directions. It is thus equipped with an (unlimited) memory of sorts upon which it performs its computations.

In a paper entitled `On Computable Numbers, with an application to the Entscheidungsproblem' (see below) Turing proposed such a machine or automaton, equipped with the logical equivalent of a random- access memory, to determine how far one can go in mechanising mathematics, i.e. solve hard problems using a device like the one described above.

The really astonishing thing is that this machine, or model of computation is still with us today. Taking his cue from Gödel Turing surmised that if it were possible to describe any particular machine by means of a formalized state transition table, then it must be possible to create a machine, a universal machine, that operates on such tables in such a manner to simulate any thusly constructed machine. Such a machine could actually be built using discrete electronic circuitry. Indeed, von Neumann's stored program, called ENIAC was one step in this direction, Babbage's Analytical Engine (conceived in 1837!) yet another. Turing further resolved to ``build a brain'' that could simulate neuronal processes in the human brain and thus, like Humpty-Dumpty, explain the Jabberwocky. Thus was initiated the EDVAC project at Cambridge under Turing's auspices. When we understand this machine, we will understand what computers are capable of. I dare say that this is the fundamental question of the newborn discipline that has come to be know as `Informatics' or `Computer Science'.6.3


6.2 Definitions and Examples

Definition

A Turing Machine is a 5-tuple $ \cal {T}$ = (X, Z, H, f, zA) where
  1. X is an alphabet of tape symbols containing the special symbol `$ \sqcup$' (Blank) and optionally a left marker `$'. The set Xinput $ \subset$ X of input symbols does not contain either of these two symbols.
  2. Z is a finite set of states.
  3. zA $ \in$ Z is the initial state.
  4. H = {hA, hR} is a set of two halting states, where hA is called the accept state and hR the reject state.
  5. f is the transition function

    f : X×(Z - H) $\displaystyle \longrightarrow$ X×(((Z - H)×{L, R,$\displaystyle \emptyset$}) $\displaystyle \cup$ H),

    where the symbols L, R and $ \emptyset$ denote the left, right and stationary head movements. It is required that initially f ($, z) = ($, z, R) for all z $ \in$ Z - H and f (x, z) never writes `$' onto the tape (`$ is write-protected') for x $ \neq$ `$'. The first condition ensures that the machine never falls off the left end of the tape.

As with automata an output function could be defined, but is of no theoretical interest here. The transition function f is not defined on the states of H because the machine halts there. Some authors use a set ZF of acceptance states instead of H, but this detracts from the primary motivation behind Turing machines. Finally, blanks `$ \sqcup$' initially occupy all cells on the input tape where there is no input or end marker. In particular, the first blank denotes the end of the input (at least initially).

Note that Turing machines are by definition deterministic.

Example

Set X = {$, $ \sqcup$ , 0, 1}, Z = {zA, z1, hA, hR}, H = {hA, hR} (hA is the accepting halt state). The transaction function is defined as follows
f ($, zA) = ($, zA, R)  
f (0, zA) = (0, zA, R)  
f (1, zA) = (1, z1, R)  
f ( $\displaystyle \sqcup$ , z1) = (1, z1, R)  
f (0, z1) = (1, hA)  

The rejection state is here unused. Note that when a halting state is reached, there is no further operation, hence the elision of the tape head direction. This machine has a very simple interpretation: when a `0' has been read after one or more occurences of `1' that cell is overwritten with `1' and the machine halts. This process is illustrated in Fig. 6.1

Figure 6.1: Processing the string `0011100'
\begin{figure}\begin{center}
\begin{tabular}{\vert c\vert c\vert c\vert c\vert c...
...}{c}{$\uparrow$}&
\multicolumn{2}{c}{} \\
\end{tabular}\end{center}\end{figure}

A Turing machine differs from an automaton in several points. One worth mentioning here is that after the input is presented on the input tape the Turing machine can read from or write onto the tape. Thus the tape serves as a primitive type of memory or store. Contrary to concrete machines the Turing machine's memory is unbounded. A Turing machine may also write a blank `$ \sqcup$' over any x $ \in$ Xinput on the tape. This operation is called erasing.

As with automata a configuration of a Turing machine is to be defined: if $ \cal {T}$ = (X, Z, H, f, zA) is a Turing machine then a configuration of $ \cal {T}$ is an element of $ . X*×Z×(X* . (X - { $ \sqcup$ })) $ \cup$ {$ \lambda$}. Intuitively, a configuration of a Turing machine is a state, the current contents of the tape and a head position. It is convenient to represent this information by dividing the tape into two halves: the first half consists of the symbols from the initial tape symbol `$' up to but not including the symbol under the tape head, and the second half consists either of nothing (i.e. the rest of the tape is blank) or a string of tape symbols that ends with a non-blank character. The symbol under the tape head always belongs to the second half. A configuration of a Turing machine can thus be written (x, z, x', where z is the state of the configuration, x the first half of the tape's contents and x' the rest. If z $ \in$ H then the Turing machine is in a halted configuration. The initial configuration is $zA (thus we need not bother about the action of the transition function at the outset.

A configuration C1 : ($ua, z1, bv) (a, b $ \in$ X) yields configuration C2 : ($u, z2, acv) (c $ \in$ X) with a left head movement, written C1 $ \Rightarrow_{L}^{}$ C2, if f (b, z1) = (c, z2, L). For a right move configuration C1 : ($u, z1, abv) yields configuration C2 : ($uc, z2, bv) with a right head movement, written C1 $ \Rightarrow_{R}^{}$ C2, if f (a, z1) = (c, z2, L). For stationary head moves C1 $ \Rightarrow_{0}^{}$ C2 we just have ($u, z1, av) yields ($u, z2, bv) with stationary head movement if f (a, z1) = (b, z2,$ \emptyset$). There are still two special cases worthy of mention: if v = $ \lambda$ and b = ` $ \sqcup{^\prime}$, then ($ua, z1,$ \lambda$) $ \Rightarrow_{L}^{}$ ($u, z2, a) if f ( $ \sqcup$ , z1) moves left into state z2. If f (a, z1) = ( $ \sqcup$ , z2, R) then ($u, z1, a) $ \Rightarrow_{R}^{}$ ($u $ \sqcup$ , z2,$ \lambda$). Note the position of the tape head in this case. For an interpretation of the concept `configuration' consult the illustrations in Fig. 6.2

Figure 6.2: Yielding Configurations in a Turing Machine
\begin{figure}\begin{center}
\begin{tabular}{\vert c\vert c\vert c\vert c\vert c...
...}{}&\multicolumn{4}{c}{\bf R-Move Config.}
\end{tabular}\end{center}\end{figure}

In all cases we write C1 $ \Rightarrow$ C2 in order to denote one configuration C1 yielding another C2. The previous example illustrates explicitly the movement of a Turing machine from initial to halted configuration.

A computation of a Turing machine $ \cal {T}$ is a sequence of configurations

C0 $\displaystyle \Rightarrow$ C1 $\displaystyle \Rightarrow$ ... $\displaystyle \Rightarrow$ Cn

The computation in this case is said to have length n. The transitive closure of the relation ` $ \Rightarrow$' is denoted by ` $ \overset{*}{\Rightarrow}$'.

Finally, $ \cal {T}$ accepts x $ \in$ Xinput* if ($, zA, x)$ \overset{*}{\Rightarrow}$(x', hA, x''). Note that it is only required that $ \cal {T}$ be in the acceptance state hA using the input symbols. If, on the other hand ($, zA, x)$ \overset{*}{\Rightarrow}$(x', hR, x''), then $ \cal {T}$ rejects x.

$ \cal {T}$ decides a language L $ \subseteq$ X*input, if for each x $ \in$ L $ \cal {T}$ accepts x and for each x $ \in$ X*input - L, $ \cal {T}$ rejects x. For historical reasons then language L is then called recursive, i.e. there is some Turing machine that decides L.

One final remark. There is one other possibility for a Turing machine that did not occur for automata: a Turing machine may never halt on some input. This eventuality is very important for the sequel.

This section will now be concluded by constructing a Turing machine that recognizes the language of all strings that have an even number of 1's.

Example

X = {1}, Z = {zA, z1, z2, hA, hR}, H = {hA, hR}. The transition function is defined as follows:
f ($, zA) = ($, zA, R)  
f (1, zA) = (1, z1, R)  
f (1, z1) = (1, zA, R)  
f ( $\displaystyle \sqcup$ , zA) = ( $\displaystyle \sqcup$ , z2, L)  
f ( $\displaystyle \sqcup$ , z1) = ( $\displaystyle \sqcup$ , z3, L)  
f (1, z2) = (1, hA)  
f (1, z3) = (1, hR)  
f ($, z2) = ($, hA)  

Note that the empty string `$ \lambda$' is also accepted. A sample run on the input `1111' is seen in Fig. 6.3.

Figure 6.3: Derivation of the string `1111' by a Turing machine
\begin{figure}\begin{center}
\begin{tabular}{\vert c\vert c\vert c\vert c\vert c...
...c}{}&\multicolumn{1}{c}{$\uparrow h_A$}\\
\end{tabular}\end{center}\end{figure}


6.3 Programming Turing Machines

Describing Turing machines with transition functions is extremely unwieldy for all but the simplest machines. One way to simplify this process is to design a simple programming language that facilitates the formulation of an algorithm on a Turing machine. This is much the same as using a high-level programming language to solve some problem and then allowing a compiler to translate the solution onto the low-level hardware of the target machine.

This is indeed the approach that Turing took when he set out to define his universal machine, i.e. one that takes any Turing machine as input. One of the problems Turing faced was how to encode that data upon which a given Turing machine operates into a string of neutral data, e.g. 0's and 1's, that can then be represented on a ``computer's'', or, better, universal Turing machine's, input tape. Ignoring for the nonce the problem of encoding states, concentrate upon that of encoding data on the input string.

There are two easy encodings for data. The first so-called unary encoding deals with a specific enumeration of the input symbols x1, x2,..., xn. Then these specific symbols can be encoded by encoding the subscripts. By using just one data symbol, call it `|', it is possible to encode strings consisting of any number of symbols simply be repeating that symbol sufficiently often. In addition, Turing required an abstract symbol `#' to separate input symbols and a blank symbol `$ \sqcup$' to denote the end of the input.

Another natural, more efficient encoding for natural numbers would be to employ binary sequences to represent these values: if x = x1x2 ... xn $ \in$ {0, 1}* then define

val(x) = x12n-1 + x22n-2 + ... xn-12 + xn.

This latter encoding will be used later in dealing with recursive functions.

In this way a Turing machine $ \cal {T}$ can also be regarded as an embodiment of a numerical function n = f (n1, n2,..., nk): if xi $ \in$ {0, 1}* with val(xi) = ni  (i = 1,..., n) and y $ \in$ {0, 1}* with val(y) = n

The control for machines having this kind of input is then described in the seven programming statement given in Table 6.1.


Table 6.1: A Simple Language for Turing Machines
Nr. Statement Description
1. Write $ \sqcup$ Write a blank onto the tape at current tape head position
2. Write | Write the cipher `|' onto the tape at current head position
3. Left Move one cell to the left on input tape
4. Right Move one cell to the right on input tape
5. if $ \sqcup$ goto i Conditional branching instruction: If a blank is read
    then continue processing at staement number i
6. id | goto i Similar to 5.
7. halt stop the machine


A Turing program is then a finite sequence of sequentially labelled machine instructions.

Example

Construct an adder that adds two input integers given in binary notation. Thus, the input to this machine is encoded

$\displaystyle \underline{{\$}}$||#||| $\displaystyle \sqcup$ .

The current tape head position is indicated by underlining the cell in question. The result of the machine's action upon this string should be

$||||$\displaystyle \underline{{\vert}}$ $\displaystyle \sqcup$ .

Note the position of the head after processing: the tape is empty to the right of this position.

Algorithm

Overwrite the separating # with a `|' and shorten the second group of digits by one at the end of that group. Thus
  1. Search for the end of the first summand
  2. Overwrite the following # with a `|'.
  3. Search for the end of the second summand.
  4. Overwrite the last `|' with a blank `$ \sqcup$'.

Program

The Turing programm that realizes the above algorithm is now given.


Nr. Statement Tape
1 Right $$ \underline{{\vert}}$|#||| $ \sqcup$
2 if | goto 1 $||$ \underline{{\char93 }}$||| $ \sqcup$
3 Write | $||$ \underline{{\vert}}$||| $ \sqcup$
4 Right $|||$ \underline{{\vert}}$|| $ \sqcup$
5 if | goto 4 $||||||$ \underline{{\sqcup}}$
6 Left $|||||$ \underline{{\vert}}$ $ \sqcup$
7 Write $ \sqcup$ $|||||$ \underline{{\sqcup}}$ $ \sqcup$
8 Left $||||$ \underline{{\vert}}$ $ \sqcup$ $ \sqcup$
9 halt  


Lest it be thought that the level of abstraction used in Turing programs is not realizable on the rather simplistic model given for Turing machines, it is now intended to demonstrate how to realize each of the seven statements to Turing machines that are constructed by concatenating Turing machines of a similar type.

  1. If a $ \in$ X is any tape symbol, then a Turing machine $ \cal {T}$a that writes that symbol to the current tape position is constructed as follows: Z = {zA, hA, hR}, H = {hA, hR} and

    f (b, z) = $\displaystyle \left\{\vphantom{\begin{array}{ll}
(a,z_A,\emptyset ) &\;\mbox{if} \;b\neq \lq \$'\\
(\$,z,R)&\;\mbox{if}\; b = \lq \$'
\end{array}}\right.$$\displaystyle \begin{array}{ll}
(a,z_A,\emptyset ) &\;\mbox{if} \;b\neq \lq \$'\\
(\$,z,R)&\;\mbox{if}\; b = \lq \$'
\end{array}$

    Thus, the tape head remains stationary and every input is accepted.
  2. A Turing machine $ \cal {T}$R that always moves 1 cell to the right of the current tape position is also very easy to construct: Z = {zA, hA, hR}, H = {hA, hR} and f (b, z) = (b, hA, R). A corresponding machine that always moves 1 cell to the left is similarly constructed.6.4
  3. Concatenation. Given the two Turing machines $ \cal {T}$1 = (X, Z1, H1, f1, z1A) and $ \cal {T}$2 = (X, Z2, H2, f2, z2A), their concatenation is defined as follows: assume firstly that the states Z1 of $ \cal {T}$1 and Z2 of $ \cal {T}$2 are disjoint. The set Z = Z1 $ \cup$ Z2, zA = z1A, the initial state of the first machine and finally H = H1 $ \cup$ H2. For x $ \in$ X and z $ \in$ Z define

    f (x, z) = $\displaystyle \left\{\vphantom{\begin{array}{ll}
f_1(x,z) &\;\mbox{if} \;z \in Z_1-H_1\\
f_2(x,z)&\;\mbox{if}\;z \in Z_2-H_2
\end{array}}\right.$$\displaystyle \begin{array}{ll}
f_1(x,z) &\;\mbox{if} \;z \in Z_1-H_1\\
f_2(x,z)&\;\mbox{if}\;z \in Z_2-H_2
\end{array}$

    For z $ \in$ H1 set F(x, z) = f2(x, zA). The concatenation of $ \cal {T}$1 and $ \cal {T}$2 is written $ \cal {T}$1 $ \longrightarrow$ $ \cal {T}$2.
  4. Finally, conditional branching upon reading an a $ \in$ X is a straightforward application of the previous construction and, as such, is left to the reader. It is denoted by $ \cal {T}$1 $ \longrightarrow$ $ \cal {T}$2. In case $ \cal {T}$1 = $ \cal {T}$2 then the special notation $ \cal {T}$1$\scriptstyle \circlearrowleft$ is employed.

With these elementary operations the preceding Turing program could be written as drawn in Fig. 6.4.

Figure 6.4: Turing Program Represented as a Combination of Turing Machines
\begin{figure}\begin{center}
\begin{tabular}{c}
${\cal T}_R^{\circlearrowleft}...
...\
$\downarrow$ \\
${\cal T}_L$ \\
\end{tabular} \end{center}\end{figure}

Note this this machine halts and accepts automatically after the last shift.

It goes without saying that every time a Turing machine is repeated in such diagrams a new machine is generated, having distinct states from all other such generated machines. This procedure was been left out for aesthetic reasons.

6.4 Recursive Functions

Turing machines are essentially more complex than all types of automata because of their ability to write onto the tape. Indeed, this is a completely adequate way of modelling a computer's random access memory and, as such, allows a Turing machine to compute, much as modern computers do. Very little reflection reveals that what the Turing machine computes is given by the configuration yhA, for y $ \in$ X*. This can also be described as follows: if x is the current input string to the Turing machine $ \cal {T}$ that leads to the configuration yhA, then define the function f : X* $ \longrightarrow$ X* by f (x) = y. The language L recognized by this machine is, then, the set of such x and what the machine computes on such input is the set of halting configurations. Thus one could loosely say that $ \cal {T}$ computes the function f. Such functions are called, for historical reasons, recursive functions.

Employing the binary numerical value representation of data from the previous section, a Turing machine $ \cal {T}$ can also be regarded as an embodiment of a numerical function n = f (n1, n2,...nk): if xi $ \in$ {0, 1}* with val(xi) = ni  (i = 1,..., n) and y $ \in$ {0, 1}* with val(y) = n In this way, x $ \in$ L($ \cal {T}$) can be represented as an argument for the function f. Generalizing this procedure one can assume that initially the input tape has the unary representation

$\displaystyle \underline{{\$}}$$\displaystyle \underset{x_1}{\underbrace{(0 \cup 1)^*}}$#$\displaystyle \underset{x_2}{\underbrace{(0 \cup 1)^*}}$# ... #$\displaystyle \underset{x_k}{\underbrace{(0\cup 1)^*}}$ $\displaystyle \sqcup$ $\displaystyle \sqcup$ ...

for the arguments x1, x2,...xk $ \in$ X*input. If, after finitely many steps the machine halts in its accept state hA and if the tape's contents are

$$\displaystyle \underset{y}{\underbrace{(0 \cup 1)^*}}$ $\displaystyle \sqcup$

then we say $ \cal {T}$ calculates the function y = f (x1, x2,...xn). Put another way, $ \cal {T}$ halts on x and ($x, zA)$ \overset{*}{\Longrightarrow}$(hA,$y) Every integral numerical recursive function can evidently be written in this way. The nomenclature ``recursive'' should become evident from the following example.

Example

If the contents of the input tape are

$\displaystyle \underline{{\$}}$$\displaystyle \underset{x}{\underbrace{(0 \cup 1)^*}}$ $\displaystyle \sqcup$ ,

define succ(x) as follows: scan to the end of x and then move left until a `0' is encountered overwriting each `1' with a `0' on the way. Then change the found 0 with a `1' and halt. If, instead, a `$' is encountered, then shift the entire binary string right by one space and overwrite the very first `1' with `0'. It is not very hard explicitly to construct a Turing machine that computes succ(x). The main point here is that succ(x) is also (intuitively) recursively defined by

succ(x) = $\displaystyle \left\{\vphantom{\begin{array}{ll}
1 &\;\mbox{if} \;x=0\\
1+\mbox{\sf succ}(x-1)&\;\mbox{if}\; x \neq 0
\end{array}}\right.$$\displaystyle \begin{array}{ll}
1 &\;\mbox{if} \;x=0\\
1+\mbox{\sf succ}(x-1)&\;\mbox{if}\; x \neq 0
\end{array}$

The addition function defined by the Turing machine construction in Section6.3 can also be (intuitively) recursively defined in terms of the successor function succ(x):

plus(x, y) = $\displaystyle \left\{\vphantom{\begin{array}{ll} x &\;\mbox{if} \;y=0  \mbox{\sf succ}(\mbox{\sf plus}(x,y-1))&\;\mbox{if}\; y \neq 0 \end{array} }\right.$$\displaystyle \begin{array}{ll} x &\;\mbox{if} \;y=0  \mbox{\sf succ}(\mbox{\sf plus}(x,y-1))&\;\mbox{if}\; y \neq 0 \end{array}$ (6.1)

The functions succ(x) and plus(x, y) are examples of so-called primitive recursive functions. Before adducing a formal definition of a recursive function consider again the above example of addition in Equation 6.1: First plus is defined for all values of its first argument and for the special value 0 of its second argument. The right-hand side is a function g of its first argument, in this case g(x) = x. The general case is then a function h , in this case succ, of its first argument x and the successor function succ(y) applied to its second argument. It is defined to be some function, in this case succ, of the function defined on (x, y).

Consider yet another recursive definition, this time for the multiplication of two natural numbers:

mult(x, 0) = 0  
mult(x,succ(y)) = (plus(mult(x, y), x)).  

Note the use of the argument x in the definition of h. It should by now be clear that there is no reason to exclude the use also of the second argument y to which the successor function has been applied. This motivates the following definition of recursivity:

Definition

Let f (n1, n2,...nk+1) be a natural-number valued function of the k + 1 natural numbers n1, n2,...nk+1 and let h(n1,..., nk+2) be a similar function of k + 2 natural numbers. Then the function f is recursively defined if

f (n1, n2,...nk, 0) = g(n1, n2,...nk)

for some known function g, and if

f (n1, n2,..., nk,succ(nk+1)) = h(n1, n2,..., nk, nk+1, f (n1, n2,...nk, nk+1)).

Thus, f is defined firstly in terms of its first k variables and 0. Then f is defined for the same k arguments and (k + 1)st argument succ(nk+1) as a function h having k + 2 arguments.

The class of such functions f : $ \underset{k}{\underbrace{\mathbb{N}\times \cdots \mathbb{N}}}$ $ \longrightarrow$ $ \mathbb {N}$ is formally obtained as follows:

  1. For any k $ \geq$ 0 the zero function

    zero(n1,..., nk) = 0

    is primitive recursive.
  2. For any k $ \geq$ i > 0 the projection function

    proji(n1,...nk) = ni

    is primitive recursive.
  3. For any k $ \geq$ 0 and n $ \in$ $ \mathbb {N}$ the constant function

    constn(n1,..., nk) = n

    is primitive recursive.
  4. The successor function

    succ(n) = n + 1

    is primitive recursive.
  5. The composition of two primitive recursive functions (wherever this makes sense) is primitive recursive: Let g : $ \underset{k}{\underbrace{\mathbb{N}\times \cdots \mathbb{N}}}$ $ \longrightarrow$ $ \mathbb {N}$ and fi : $ \underset{l}{\underbrace{\mathbb{N}\times \cdots \mathbb{N}}}$ $ \longrightarrow$ $ \mathbb {N}$ for ( i = 1,..., k) be primitive recursive. Then the composition

    g(f1(n1,..., nl),..., fk(n1,..., nl))

    is also primitive recursive.
  6. Finally, all recursive definitions of a function f by the primitive recursive functions g and h are again primitive recursive.

The class of primitive recursive functions actually provides a very rich, but enumerable class of numerical functions. For example, in addition to the canonical functions of numerical addition, subtraction, multiplication, numerical predecessor, numerical successor, any polynomial with natural number coefficients, any exponential with natural number arguments such as

power2(n) = 2n,

are all primitive recursive. This is easier to recognize as might be thought: for example, after having established that the sum and product of two primitive recursive functions is primitive recursive as has been already done, note that a polynomial is just a repeated sum of products. That they are enumerable, i.e. in 1-1 correspondence with the natural numbers is because they are built up from the basic functions set forth in the first four parts of the definition of `primitive recursive.' Since this is so, write down all those with 1 argument n as a sequence:

f0, f1, f2,...

Then for every natural number n set

g(n) = fn(n) + 1.

This ``diagonalization argument'' produces a Turing computable (recursive) function that is not primitive recursive. Thus

Theorem 6..1   Every primitive recursive function is Turing computable. There are Turing computable functions that are not primitive recursive.

As an example of a Turing computable, but not primitive, function consider the Ackermann function A : $ \mathbb {N}$×$ \mathbb {N}$ $ \longrightarrow$ $ \mathbb {N}$ defined by

A(m, n) = $\displaystyle \left\{\vphantom{\begin{array}{ll}
n+1 &\mbox{if} \;m=0\\
A(m-1,1)&\mbox{if}\;n=0\\
A(m-1,A(m,n-1))&\mbox{otherwise}
\end{array}}\right.$$\displaystyle \begin{array}{ll}
n+1 &\mbox{if} \;m=0\\
A(m-1,1)&\mbox{if}\;n=0\\
A(m-1,A(m,n-1))&\mbox{otherwise}
\end{array}$

Basically, A(m, n) grows faster than any primitive recursive function f : $ \mathbb {N}$ $ \longrightarrow$ $ \mathbb {N}$, i.e. if f is any such primitive recursive function then there is some n0 $ \in$ $ \mathbb {N}$ such that f (n) < A(n, n) for all n $ \geq$ n0. Thus if one wants to characterize the class of recursive, or Turing computable, functions one must work a bit harder.

Definition

Let f : $ \mathbb {N}$k+1 $ \longrightarrow$ $ \mathbb {N}$ be a primitive recursive function. Then the minimization g : $ \mathbb {N}$k $ \longrightarrow$ $ \mathbb {N}$ of f is given by

g(n1, n2,..., nk) = $\displaystyle \left\{\vphantom{\begin{array}{ll}
\mbox{min}\{n\;\vert\;A=g(n_1,...
...,n_k,n) = 1\;\} &\mbox{if} \;A\neq 0\\
0&\mbox{otherwise}
\end{array}}\right.$$\displaystyle \begin{array}{ll}
\mbox{min}\{n\;\vert\;A=g(n_1,n_2,\ldots ,n_k,n) = 1\;\} &\mbox{if} \;A\neq 0\\
0&\mbox{otherwise}
\end{array}$

Note that there is no obvious way to compute this function.

Definition

A function f : $ \mathbb {N}$k $ \longrightarrow$ $ \mathbb {N}$ is $ \mu$ recursive if it has been obtained from the basic function set for primitive recursive functions, i.e. the first four types of functions in the definition of primitive recursive, by allowing not only the operations of composition and recursive definition, but also the operation of minimization.

Theorem 6..2   A function f : $ \mathbb {N}$k $ \longrightarrow$ $ \mathbb {N}$ is recursive if and only if it is $ \mu$ recursive.

Definition

A language L $ \subseteq$ Ainput* is recursively enumerable if there is a Turing machine $ \cal {T}$ such that: if x $ \in$ L then $ \cal {T}$ halts on x; if x $ \notin$L then $ \cal {T}$ does not halt on x.

Note that it is irrelevant in wich halting state $ \cal {T}$ ends up in; the term semi decides is often used for this type of acceptance. Note also that every recursive language is also recursively enumerable.

Thus the class of Turing-computable functions is remarkably large. It is also a remarkable fact that this class cannot be enlarged by introducing more than one input tape, an input tape that is open in both directions or even by allowing random, instead of sequential, access to the machine's memory cells. Even more unexpected is the fact that the class of computable functions is not enlarged if nondeterminism, i.e. allowing a relation instead of a function when describing the state transitions, is introduced. The concept of `Turing machine' thus seems to be the appropriate mathematical model for a stored-program computer. The next section is devoted to showing that Turing machines, despite their great generality, cannot compute everything. In our constructive euphoria one important fact has been neglected: a Turing machine may never halt on some input, or, even worse, we may never know whether it will eventually halt or not.

6.5 Computability and the Halting Problem

In 1900 the German mathematician David Hilbert conjectured that every well-defined problem in mathematics can be answered by yes/no, depending on whether that problem is provably true or false. Indeed, Hilbert seemed to believe that there is always a well-defined procedure, or algorithm, that will produce the correct answer. The power and generality of Turing machines seems primae facie to support this attitude.

Assuming an algorithm can be identified with a specific-purpose Turing machine, it did not take long for the mathematical community to discover the naive optimism behind Hilbert's conjecture. It has, for instance, been demonstrated that even very simply-stated conjectures, e.g. the feasibility of a general testing procedure to determine whether a polynomial in several variables has one or more integral roots, are algorithmically undecidable. Perversely enough, the algorithm for polynomials in one variable is known to most any first-year student in Computer Science. It is, however, interesting that many of these problems, including the one mentioned, are recursively enumerable. For instance, given a polynomial, just feed it integers. If there is an integral root then polynomial evaluation mechanism, i.e. the relevant Turing machine, will eventually halt and answer `yes.' If not, then it will never halt. The problem is thus recursively enumerable, but not decidable.

If an algorithm is identified with a computer program then the existence or non-existence of algorithms takes on a compelling relevance. If there are easily stateable mathematical problems that are algorithmically undecidable, then there are problems - easily stateable problems - for whose solutions computers offer little or no help.

Consider the following socially relevant problem: can one write a computer program that validates software? The input to this hypothetical program is a complete specification of a given program's allowable input. Then the program is run on this input and it is determined whether it computes its results correctly. It turns out that this is a mere pipe dream, a chimera.

What is the problem here? Just this: given a Turing and an allowable input, it is impossible to decide whether the Turing machine accepts the input. Of course there may be some specific input and some specific Turing machine for which there is indeed some ad hoc method, but there is no procedure that works in general.

Analogous to the idea of a stored-program computer, it is necessary first of all to define a Turing machine that takes as its input a Turing machine and an input string for the latter Turing machine. Then the decidability of the above problem reduces to the decidability of this Turing machine.

6.5.1 The Universal Turing Machine

Just as a stored-program computer accepts a program and its input as its input, Turing proposed to construct a Turing machine $ \cal {U}$ whose input consists of a pair ($ \cal {T}$, x) of a Turing machine $ \cal {T}$ and an input x to $ \cal {T}$.6.5 Superficially, this would seem impossible because of the varifold state designations and alphabets allowed. But, again, just as a stored-program computer does not see a source file description of its program input, but rather a compiled, equivalent version of the of the source as a binary sequence that is executable on the target machine and is functionally equivalent to the original source program, a neutral notation is required for the construction of a universal Turing machine.


6.5.1.1 Encoding the Input

The issue of the concrete input to a concrete Turing machine has already been dealt with in Section 6.3 where a unary encoding for tape symbols was introduced. Thus the string ``x2x3x1'' can be written `` #||#|||#|#'' as explained there. Denote this encoding of a general string x $ \in$ X* with ``x''. Note that strings of an arbitrary length are allowed. Thus the alphabet for $ \cal {U}$ is potentially infinite, although for any particular input machine it is always finite.

The neutral encoding of a general Turing machine $ \cal {T}$ requires a little more care. The main problem is how to devise a neutral encoding of a Turing machine's state transition table. Because there are extra symbols involved the following table is offered to encode the general transition function:


Symbol Description Code
si State Symbol |i+1
xi Alphabet Symbol |i+3
$ Start Tape Symbol |
$ \sqcup$ Blank ||
L Move Left |
R Move Right ||
0 Don't Move |||
h Halt State Reached ||||


The extra symbol h is required to facilitate a uniform code for all state transitions. Indeed, for any Turing machine $ \cal {T}$ with state transition function f (x, z) can be written

f (x, z) = (x', z', t),

where t is a symbol from {L, R, 0, h}. If t = h then it may be indirectly inferred that the state assumed is a halting state. With this notation the state table can be represented by a set of 5-tuples (x, z, x', z', t). Separate each quantity in this 5-tuple with a '#' and separate each state table entry with a `##''. A complete neutral description of any specific Turing machine is then given by a string consisting of all encoded 5-tuples strung out as indicated. For instance the example Turing machine from Section 6.2 is given by the string
###|#|#|#|#||##|||#|#|||#|#||##||||#|#||||#||#||##||||#||#||||#||#||      
###|||#||#||||#|||#||||###        

Denote this string by ``$ \cal {T}$''. Assume that the start state is always given by s0. As tacitly represented, assume that ### separates the machine state table string from the ensuing input string x. The current position of $ \cal {T}$'s tape head will always be indicated with a P on the string x.

6.5.1.2 Defining the Machine $ \cal {U}$

An outline of the way in which $ \cal {U}$ simulates its input $ \cal {T}$ on its input string x is now given. Even it should be apparent how the details could be specified at the micro level, it is counterproductive to bombard the reader with such a trivial formalism. Instead a plausible description is offered.

In general $ \cal {U}$ must keep track of at least the three quantities

Thus, a three head machine would be the easiest way to construct $ \cal {U}$, but one can also proceed explicitly as follows: divide $ \cal {U}$'s input tape into three sectors containing 1) the (read only) machine description ``$ \cal {T}$'', 2) the input string x with a properly marked P indicating the current reading position (sort of a program counter) and 3) $ \cal {U}$'s private work area set off from the previous two sections with an M. This latter contains $ \cal {T}$'s current state si and the current input symbol xj. $ \cal {U}$ operates according to the following paradigm:

Algorithm

  1. Initialize: Read in the first symbol marker `$' and look up the successor state si to s0. Read the next input symbol xi = x1 and set the current tape position at position = R.
  2. while position $ \neq$ `h' do
    1. Match the pair (xi, sj) against the (first two coordinates of the) state table entries. If there is no match halt.
    2. If the match (xi, sj, xi', sj', t) is found then write xi' at position P. Write sj = sj' onto $ \cal {U}$'s work area M.
    3. Set position = t and move P accordingly.
    4. Fetch the next input symbol from position P and set xi to be this symbol.

Notice the tedious shift operations required when writing on the input tape. Owing to the unary representation the input symbols will vary greatly in length and room needs to be made to accomodate the new symbols in write operations. This could even involve shifting the entire tape to the right to make enough space. The same holds true for $ \cal {U}$'s work area. Finally, note the uncanny resemblance to a ``real'' stored-memory computer program.

6.5.2 The Halting Problem

Consider now the software verification problem of writing a computer program that somehow analyzes any computer program and is then able to ascertain whether said program at least halts on any of its specified inputs, e.g. detects potentially infinite loops. As desirable as such a piece of software would be, Turing proved its very feasibility to be a mirage.

The Halting Problem

Is there a Turing machine that decides whether any given Turing machine will eventually halt on any given input string to that Turing machine?

If we were to identify a Turing machine with an algorithm (so-called Church-Turing Thesis) then the Halting Problem asks whether there is some general algorithm that incorporates only properties general to all Turing machines and determines whether any given Turing machine will eventually halt on some given input machine. Of course, this ``only'' asks after a general algorithm. If a specific Turing machine is analyzed there may well be some ad hoc method to settle that particular question one way or the other.

Theorem 6..3   The Halting Problem is undecidable

Proof

Assume such a Turing machine, call it $ \cal {M}$1, did in fact exist. The input to $ \cal {M}$1 is a string (``$ \cal {T}$'',``x'') upon which $ \cal {M}$1 always halts. The encoding of this string is as explained in Section 6.5.1.1. If ``x'' is already unary encoded, then of course it makes more sense to write the pair (``$ \cal {T}$'', x). $ \cal {M}$1 will halt in state hA if $ \cal {T}$ halts on ``x''and will halt in hR if $ \cal {T}$ does not. As will be presently seen the main problem is that $ \cal {M}$1 halts for every $ \cal {T}$ and for every x.

Now use $ \cal {M}$1 to construct another Turing machine $ \cal {M}$2 as follows: in $ \cal {M}$2 the halting state hA is demoted to just another non-halting state. Thus, if $ \cal {M}$1 halts in hA then $ \cal {M}$2 will go into an infinite loop:

f2(x, hA) = (x, hA, R)

for all x $ \in$ X*. If $ \cal {M}$1 halts in hR then so does $ \cal {M}$2. An acceptance halt state for $ \cal {M}$2 is irrelevant since $ \cal {M}$2 will never enter that state. Any dummy state will serve the purpose. Thus $ \cal {M}$2 halts on the input (``$ \cal {T}$'',``x'') only when $ \cal {M}$2 fails to do so. If $ \cal {M}$1 does not halt on (``$ \cal {T}$'',``x'') then $ \cal {M}$2 does not either.

Now apply $ \cal {M}$1 to itself to point up the inner contradiction of $ \cal {M}$1's very existence. To this end, construct the Turing machine $ \cal {M}$3 from $ \cal {M}$2 as follows: given an input string $ \alpha$ = (``$ \cal {T}$'',``x'') the very first thing $ \cal {M}$3 does is to duplicate it to ($ \alpha$,$ \alpha$) and then sets $ \cal {M}$2 loose on ($ \alpha$,$ \alpha$).

If $ \cal {M}$3 halts on `` $ \cal {M}$3'' then $ \cal {M}$2 halts on (``$ \cal {M}$3'',$ \cal {M}$3''), i.e. $ \cal {M}$1 halts in hR with the input (``$ \cal {M}$3'',``$ \cal {M}$3''). But if $ \cal {M}$1 halts on some input, then, by construction, $ \cal {M}$2 cannot halt on that input and hence $ \cal {M}$3 cannot halt on ``$ \cal {M}$3''. ( $ \cal {M}$3'' is already unary encoded.) This is a contradiction.

If, on the other hand, $ \cal {M}$3 does not halt on `` $ \cal {M}$3'', then $ \cal {M}$2 does not halt on the input (``$ \cal {M}$3'',``$ \cal {M}$3''), i.e. $ \cal {M}$1 halts in state hA on the input (``$ \cal {M}$3'',``$ \cal {M}$3''), which by definition means that $ \cal {M}$3 halts on the input `` $ \cal {M}$3'' - another contradiction.

6.6 A Short Catalogue of Undecidable Problems

In this section a list of important problems in Computer Science is presented for which there is no algorithm to decide their truth or falsity. This list is by no means complete.

6.6.1 Undecidable Problems About Turing Machines

Lest it be surmised that the halting is just another like Empedocles' `All Cretans are liars' the following list shows that there are substantive algorithmic problems connected with Turing machines. They are to be compared with similar statements about automata, where it is only required to wait until the input tape has been processed from left to right.
  1. If $ \cal {T}$ is a Turing machine and x is a string of input symbols to $ \cal {T}$, does $ \cal {T}$ halt on x?
  2. If $ \cal {T}$ is a Turing machine that computes a function f with no arguments, does $ \cal {T}$ halt on a blank tape?
  3. Given a Turing machine $ \cal {T}$ halt for any string of input symbols?
  4. If $ \cal {T}$ is a Turing machine is the set of input strings on which $ \cal {T}$ does halt regular? Is it context-free? Is it Xinput*?
  5. Given two Turing machines $ \cal {T}$1 and $ \cal {T}$2 over the same alphabet, do $ \cal {T}$1 and $ \cal {T}$2 halt on the same set of input strings?

6.6.2 Undecidable Problems about Pushdown Automata and Context-Free Grammars

The following list points up algorithmic difficulties connected with context-free grammars. As contrasted with regular grammars, it is impossible to decide in general even very simple questions, such as whether a given string is generated by a given context-free grammar.
  1. If G1 and G2 are context-free grammars is L(G1) $ \cap$ L(G2) = $ \emptyset$? Is L(G1) = L(G2)?
  2. If G is a context-free grammar is L(G) regular?
  3. If G is a context-free grammar and x an input string is x $ \in$ L(G)?
  4. Is there an algorithm with input a pushdown automaton and output an equivalent pushdown automaton with a minimal number of states?

6.6.3 General Undecidable Problems in Computer Science

There is another class of problems that are easily stated, but for which there is no way to decide on their truth or falsity.
  1. The domino matching problem, usually called the Post Correspondence Problem: Given a finite set of dominos upon which are written 2 strings over some alphabet, one above and one below, can the dominos be so arranged that the strings produced by reading across the top row of dominos yields the same string as reading below? See Fig. 6.5 for an easy example where this is possible.
  2. The Tiling Problem for the plane. Given a set of tiles, i.e. squares of unit side length each of whose sides is assigned a specific color, is it possible to tile the first quadrant {(x, y)  |  x $ \geq$ 0  &  y $ \geq$ 0} with copies of these tiles so that As many copies of each tile may be used as required.

Figure 6.5: PCP has solution II,I,III for x = 10101101
\includegraphics[]{pcp}

Prof. Raymond Zavodnik, July 2002