What is an algorithm?

(Wee Kean here.) Some time a year ago, for some reason, the then observers and I were reading the following paper. It is related to Hilbert's tenth problem which asks: given any diophantine equation, is there a general algorithm to decide whether the equation has a solution in integer values? But how do we decide what an algorithm can or cannot do?

Formalizing algorithms

A turing machine is one way to represent computation. Whether or not turing machines can fully encapsulate what algorithms or computability are is not proven. There are other conceptions of computability but they have all shown to be equivalent to the turing machine.

Informally, you can imagine a device with a head, an internal state and an infinitely long tape. At each step, the device reads the symbol on the cell and based on its internal state and the symbol read, it writes a symbol (possibly the same one) on the current cell and moves either left or right.

Very useful depiction

Formally, a turing machine is composed of

finite set of internal states $Q$
some input symbols $\Sigma$
tape symbols $\Gamma$,
a transition function $\delta: Q \times \Gamma \rightarrow Q \times \Gamma \times {L,R}$ (i.e. for a given state and tape alphabet, the function describes the next state, the symbol to write and whether to move $L$ or $R$)
and some accepting states $F \subseteq Q$.

The input to a turing machine is written on the tape with the leftmost symbol at the starting position of the head and remaining cells are assumed to have a blank symbol $B \in \Gamma - \Sigma$ written on them. The turing machine is said to accept an input $x$ if and only if at some point, it enters an accepting state. For our purpose, we assume that the turing machine halts whenever it enters an accepting state.

As an example, the following turing machine accepts all input with an even number of symbols:

$Q = \{q_0, q_1, q_2\}, \Sigma = \{a\}, F = \{q_2\}$

$ \delta (q_1, a) = (q_0, a, R), \delta (q_0, a) = (q_1, a, R), \delta (q_0, B) = (q_2, B, R)$.

Notice that a turing machine is not guaranteed to halt on all inputs $x$. When the turing machine does not halt, we say it does not accept the input $x$.

Let a language refer to any subset of all possible strings over some alphabet $\Sigma$ (e.g. $L = \{``aa",``ab",``abc"\}$ is a language over the alphabet $\{a,b,c\}$). Then, when we say a turing machine accepts a language $L$, we mean it accepts an input $x$ if and only if it is in $L$.

Another formulation of the turing machine is in terms of function computation where the input $x$ is given and whenever the turing machine halts (if it does), what is left on the tape is the output $y$. Then, we say the turing machine describes some function $f$ such that $f(x) = y$ if it halts and is undefined otherwise. You can show that these two formulations are equivalent i.e. there is some turing machine which accepts $(x,y)$ if and only if there is some turing machine which computes some function $f(x) = y$.

Encoding

So far, we have referred to languages as subset of strings but we can also use this strings to encode certain objects. For example, we can encode any positive integer $k$ as $\underbrace{111...1}_{k \text{ times}}$. Then, the set of prime numbers is recursive in the sense that there exists a turing machine which accepts all strings of that form $\underbrace{111...1}_{k \text{ times}}$ where $k$ is prime.

We can encode strings as number and turing machines as numbers too. For strings, we can view the string as a base $|\Sigma|$ number (we may need to prepend a 1 to avoid multiple strings mapping to the same number). For turing machines, it suffices to encode the transitions and the accepting states. We can enumerate the states $q_0, q_1, q_2, ...$ as $0,00,000,...$. To represent the accepting states, we place appropriate delimiters $1$. For example $01001000$ represents three accepting states $q_1,q_2,q_3$. We can encode each transition similarly. We can interpret the resulting string in base 2 (prepending if needed) to convert the turing machine into some number.

(Disclaimer: whether different encodings are equivalent is something I'm not sure about. Intuitively, encodings should be equivalent as long as there exists some recursive function (detailed later) which maps one to the other, e.g. whether we encode number as binary string or as shown above, they are equivalent since there exists some turing machine which maps one to the other.)

Recursively enumerable / recursive

A language $L$ is recursively enumerable if there exists some turing machine which accepts $L$.

A language $L$ is recursive if there exists some turing machine which accepts $L$ and it halts on all inputs $x$.

(Side note, technically, I believe they were referred to as computably enumerable and computable respectively and recursive was used for a different formulation of computability but they were later shown to be equivalent so... ¯\(ツ)/¯)

Part 2?

Naturally, you could ask:

Do there exists not recursively enumerable sets? Yes since the set of turing machines is countable but the set of languages is uncountable.
Do there exists recursively enumerable sets which are not recursive? The following set is recursively enumerable but not recursive - $\{M | L(M) \neq \emptyset \}$. Unfortunately, I not think I can explain it well without making this post absurdly long. The main idea is that there exists a universal turing machine which can take as input a turing machine and a input string $(M, w)$ and simulate $M$ on $w$. Intuitively, to figure out if $L(M) \neq \emptyset$, we need to simulate $M$ on all inputs until $M$ accepts one. If we dovetail, i.e. in the $k$-th iteration, we simulate $M$ on the first $k$ strings for $k$ steps, then, for any turing machine $M$, such that $L(M) \neq \emptyset$, there exists some $t, w$ such that $M$ accepts $w$ in $t$ steps and our algorithm will accept in the $max(t,w)$-th iteration. The argument for why it is not recursive follows by considering the complement set and show that it cannot be recursively enumerable.

Hilbert's tenth problem can then be rephrased as, given any diophantine equation, $D(x_1, x_2, ..., x_n, y_1, y_2, ..., y_m)$, let $S_D$, called the diophantine set, be the set of tuples $(x_1, x_2, ..., x_n)$ such that $D(x_1, x_2, ..., x_n, y_1, y_2, ..., y_m) = 0$ has a solution in integers. Is the set of diophantine equations $\{D | S_D = \emptyset \}$ recursive?

Hopefully, you've found this interesting!

Search This Blog

SIMO Retiree Blog