CSCE 551 Spring 2001 Course Notes

1/17/2001
Theory of computation: Automata, Computability, Complexity
[Math prelims (don't lecture on, except for starred **)
**Natural numbers: book: {1,2,3,...}; me and others: {0,1,2,3,...}
**Relative complement: B-A = {x | x\in B and x\not\in A}
Venn diagrams (mention)
tuples and sequences, k-tuple, (ordered) pair, power set, Cartesian product
functions, mapping, domain, range, codomain
**book: range := codomain; me and others: range \subseteq codomain
one-to-one = injection, onto = surjection, both = bijection (one-to-one correspondence)
arguments, arity, infix notation, prefix notation, postfix notation
modular arithmetic, Z_m
predicate (property) = Boolean-valued function (range = {true,false} or {1,0})
a property on A \times A \times ... \times A is a relation, k-ary relation, or k-ary relation on A
Binary relation (infix) R: aRb means R(a,b) = true; R(a_1,...,a_k) means R(...) = true
identify a relation on A with a subset of A^k
equivalence relation: reflexive, symmetric, transitive; equivalence classes
Graphs
graph (undirected), nodes (vertices), edges
degree, labeled graph
path, simple path, connected, cycle, simple cycle, tree, leaf, root
digraph (directed), indegree, outdegree, directed path
**strongly connected means directed path in both directions
depicting binary relations
Strings and languages
alphabet = any finite set; members are symbols
string (over an alphabet), length of w = |w|, empty string = \epsilon (identity under concatenation)
reverse of w = w^R, substring, concatenation
lexicographic ordering = length first, then dictionary order
language
Boolean logic
**Boolean values true and false (1 = true and 0 = false)
Boolean operations: conjunction (and, \wedge), disjunction (or, \vee), negation (not, \neg), exclusive or (xor, \oplus)
**book: equality \leftrightarrow; me and others: equivalence, **biconditional
**book: implication \rightarrow; me and others: conditional
the operations {and, not} form a complete set of connectives
distributive laws
summary on page 16]

Definitions, theorems, and proofs
lemmas, corollaries
A formal proof (of a statement S) is a sequence of mathematical statements written in a rigorously defined syntax, such that each statement is either an axiom or follows from some previous statements in the sequence by a rule of inference, and whose last statement is S. S is then considered a theorem. The syntax specification is a formal language. The specification of axioms and rules of inference is a formal system. The idea is that, at least in principle, a proof can be checked for correctness by a computer using a purely algorithmic procedure, based solely on the proof's syntactic form, without regard to its meaning.
Formal proofs, even of simple theorems, are almost always long and difficult for a human to read and digest; therefore, they almost never appear in the mathematical literature. Instead, what appear are informally written (but still rigorous) proofs. Such an informal proof is written basically as prose, but may include mathematical formulas. It is a piece of rhetoric meant to convince the reader beyond any doubt that the proposed assertion is true (a theorem), that is, that a formal proof exists. The informal proof can appeal to previous theorems. It can also appeal to the reader's mathematical understanding and intuition. In this case, the prover must be prepared to explain or "fill in" more formally these appeals if challenged to do so.
Writing a proof is more of an art than a mechanical exercise. It contains elements of style, as with any good writing. The best way to learn to write good proofs is to read good proofs that others have written.
Proof methods: construction, contradiction, induction
Theorem: There are two (real) irrational numbers a, b such that a^b is rational.
Proof: We know that sqrt(2) is irrational. If sqrt(2)^{sqrt(2)} is rational, we are done (set a = b = sqrt(2)). Otherwise, it is irrational. Set a = sqrt(2)^{sqrt(2)} and b = sqrt(2).
Then a^b = (sqrt(2)^{sqrt(2)})^{sqrt(2)} = sqrt(2)^{sqrt(2) sqrt(2)} = sqrt(2)^2 = 2, which is certainly rational, so the theorem is proved. //
Proof that \pi > 3
Proof that the area of a circle is \pi r^2
Intuition is crucial in forming a proof! Pictures and diagrams are very helpful; however, they cannot replace a formal argument.

1/22/2001
Definitions: path in a graph, simple path, cycle, degree, connected graph, tree, leaf
Theorem: Any tree on n > 0 nodes has exactly n-1 edges (strong induction on the number of edges)
Theorem: Every tree on n > 1 nodes has at least one leaf (proof by contradiction)

1/24/2001
Overview of computation: theory mapped out in the 1920s and 30s, before electronic computers (Church, Hilbert, Kleene, Goedel). Turing's and von Neumann's ideas led to the first electronic computer.
Regular languages
Finite state machine (automaton - the simplest computational model) models limited-memory computers
Example: door opener: states: closed, open; input conditions: front, rear, both, neither; nonloop transitions: closed -> open on front, open -> closed on neither
(probabilistic counterpart: Markov chains)
(other examples: number of 1's is even, number of 0's is a multiple of 4, automaton for strings ending in 00)
Formal Definition: A _finite_automaton_ is a 5-tuple (Q,\Sigma,\delta,q_0,F), where
1. Q is a finite set called the _states_,
2. \Sigma is a finite set called the _alphabet_,
3. \delta : Q \times \Sigma -> Q is the _transition_function_,
4. q_0 \in Q is the _start_state_, and
5. F \subseteq Q is the _set_of_accept_states_ (or "final states")
M_1:
  \delta : 0 : q_1 -> q_1, q_2 -> q_3, q_3 -> q_2
           1 : q_1 -> q_2, q_2 -> q_2, q_3 -> q_2
  F : {q_2}
  q_0 : q_1
formal definition of a computation
examples of reflexive, symmetric, and transitive binary relations

1/29/2001
Automaton that accepts multiples of 3 in binary
Definition of a regular language.
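The formal definition above is directly executable. As an illustration (not part of the original notes), here is a minimal Python simulation of the example machine M_1, storing the transition function \delta as a dictionary:

```python
# Illustration (not in the notes): running the example DFA M_1 directly.
# Q = {q1, q2, q3}, Sigma = {0, 1}, start state q1, F = {q2}.
M1_DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q2", ("q3", "1"): "q2",
}

def m1_accepts(w: str) -> bool:
    """Run M_1 on w; accept iff the run ends in the accept state q2."""
    state = "q1"
    for symbol in w:
        state = M1_DELTA[(state, symbol)]
    return state == "q2"
```

For instance, M_1 accepts "1100" (the run visits q1, q2, q2, q3, q2) but rejects "110".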
Define \Sigma_{\epsilon} = \Sigma \union \{ \epsilon \}
Nondeterministic automaton: \delta : Q \times \Sigma_{\epsilon} -> P(Q)
Two machines are equivalent if they recognize the same language.
Equivalence of DFAs and NFAs
One direction is obvious: for any DFA, make an equivalent NFA using the same graph diagram (each \delta(q,a) is a singleton for a \in \Sigma, and \delta(q,\epsilon) = \emptyset).
The other direction is nontrivial and (somewhat) surprising.
Theorem: For every NFA there is an equivalent DFA
Corollary: A language is regular iff it is recognized by an NFA
Proof idea: states of the new DFA are sets of states of the given NFA

1/31/2001
We need to define acceptance/rejection for NFAs
Definition: Let M = (Q,\Sigma,\delta,q_0,F) be an NFA, and let w = w_1w_2...w_n be a string in \Sigma^*. We define a _computation_path_of_M_on_w_ as some finite sequence of ordered pairs (r_0,p_0),(r_1,p_1),...,(r_k,p_k), where the r_i are states of Q and the p_i are integers, such that all the following hold:
1. r_0 = q_0, the start state, and p_0 = 1,
2. for all i from 1 to k, either
   a. r_i \in \delta(r_{i-1},w_j) (where j = p_{i-1}) and p_i = 1 + p_{i-1}, or
   b. r_i \in \delta(r_{i-1},\epsilon) and p_i = p_{i-1},
3. p_k = n+1.
The computation path is _accepting_ if r_k \in F. If the sequence satisfies (1) and (2) but not necessarily (3), then we call it a _partial_computation_path_.
Idea: r_i is the state at time i, and p_i is the position in w at time i.
Note: for the same M and w, there may be lots of (or no) computation paths, which is not the case for a DFA.
Definition: Let M and w be as above. We say that _M_accepts_w_ if there is some (at least one) accepting computation path of M on w.
Proof sketch of the equivalence theorem: we let the states of the DFA be sets of states of the NFA.

2/5/2001
We fix an alphabet \Sigma for the following discussion. Here, a _language_ is any set of strings over the alphabet \Sigma.
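The subset-construction proof idea can also be run "on the fly": instead of building the whole DFA, track the set of all states the NFA could currently be in, taking \epsilon-closures between input symbols. A Python sketch; the strings-ending-in-00 language was mentioned earlier, but the state names A, B, C here are an assumed example, not from the notes:

```python
from itertools import chain

def epsilon_closure(states, delta):
    """All NFA states reachable from `states` by epsilon moves alone.
    Epsilon moves are stored under the key (state, "")."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def nfa_accepts(w, delta, start, finals):
    """Subset construction on the fly: the 'current DFA state' is the
    set of NFA states reachable on the input read so far."""
    current = epsilon_closure({start}, delta)
    for a in w:
        moved = set(chain.from_iterable(delta.get((q, a), set()) for q in current))
        current = epsilon_closure(moved, delta)
    return bool(current & finals)

# Assumed example: an NFA for strings over {0,1} ending in 00.
END00 = {("A", "0"): {"A", "B"}, ("A", "1"): {"A"}, ("B", "0"): {"C"}}
```

On "100" the reachable sets are {A}, {A}, {A,B}, {A,B,C}; since C is final, the string is accepted.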
Closure properties of regular languages: union, concatenation, Kleene closure (*-operator): proof by pictorial construction
A regular expression is either
1. a (for any a \in \Sigma)
2. \epsilon (the empty string)
3. \emptyset (the empty set)
4. (r*) (where r is a regular expression)
5. (rs) (where r and s are regular expressions)
6. (r \union s) (where r and s are regular expressions)
Each regular expression represents a language:

expression    | represents
--------------+-------------------------------------
a             | the singleton language {a}
\epsilon      | the singleton language {\epsilon}
\emptyset     | the empty language (no members)
(r*)          | the set of concatenations of some
              | finite number (0 or more) of strings
              | in r, repeats allowed
(rs)          | the set of all strings which are
              | concatenations of a string from r
              | with (followed by) a string from s
(r \union s)  | the union of r and s (strings which
              | are in either or both)

2/7/2001
Equivalence of regular expressions and automata:
Thm: Every regular expression represents a regular language.
Proof: Induction on the complexity of the regular expression, using the closure properties of regular languages. //
Thm: Every regular language is represented by some regular expression.
Proof: By construction of a regular expression from an NFA. Two operations: edge combination and vertex removal. //
Proving languages nonregular:
The Pumping Lemma: If A is a regular language, then there is a number p (the _pumping_length_) where, if s is any string in A of length at least p, then s may be "pumped", that is, s may be divided into three pieces, s = xyz, satisfying the following conditions:
1. for each i >= 0, xy^iz is in A,
2. |y| > 0, and
3. |xy| <= p.
(note: y^i is yyy...y i times; y^0 is \epsilon)
Applications:
{ 0^n1^n | n >= 0 } is not regular
{ s | #0s in s = #1s in s } is not regular
{ ww | w \in {0,1}* } is not regular

2/12/2001
Context-Free Languages
context-free grammars (invented to study natural language, used to parse computer programs for compilation)
grammar given by a list of substitution rules (productions): variable -> string of variables and terminals
(variables are also called nonterminals)
terminals are lower-case letters and symbols (input alphabet)
variables (upper-case letters)
start variable (lhs of the topmost rule)
Ex: A -> 0A1
    A -> B
    B -> #
Generating strings with a grammar:
1. write down the start variable
2. find a written-down variable and some production whose lhs is that variable, and replace the variable occurrence with the rhs of the production ("splicing in")
3. repeat step 2 on the string produced until no variables remain
(derivation, or parse tree)

2/14/2001
Grammars for
{ w | w starts and ends with the same symbol }
{ w | the length of w is odd and its middle symbol is 0 }
{ w | w = w^R }
(the alphabet is {0,1} in each case)
pushdown automata (for the languages above)
formal definitions of grammars and pushdown automata

2/21/2001
More sample grammars: (\Sigma = {a,b})
{ a^mb^n | m \leq n }
S -> \epsilon | Sb | aSb
ambiguous? disambiguate
{ w | w has equal numbers of a's and b's } (hard!)
S -> \epsilon | SS | aSb | bSa
Proof that this grammar works:
One direction: any string generated by the grammar has an equal number of a's and b's. Proof sketch: at the beginning of any derivation, there are no terminals, only the start variable. Whenever a variable is expanded in a derivation, either no new terminals are added (first two productions), or exactly one a and exactly one b are added (last two productions). Thus a's and b's are always added in equal numbers in any derivation, so the final string must have the same number of a's as b's.
Other direction (converse): if a string w has an equal number of a's and b's, then w can be generated by the grammar.
Proof: We prove this by strong induction on the length of w.
Basis case: |w| = 0. Then w = \epsilon, the empty string, which is clearly generated by the grammar (one application of the first production).
Inductive case: Let k > 0, and suppose that the statement is true for any string of length less than k. We prove the statement true for any string w of length k.
Case 1: k is odd. In this case, w clearly cannot have an equal number of a's and b's, so we have nothing to prove (the statement holds "vacuously").
Case 2: k is even. Suppose w has an equal number of a's and b's.
Case 2a: w = axb or w = bxa for some string x. Since w has an equal number of a's and b's, then clearly so does x. Further, since |x| = k-2, by the inductive hypothesis, x can be generated by the grammar. That is, S =>^* x. But then we have S => aSb =>^* axb and S => bSa =>^* bxa. So w can also be generated by the grammar.
Case 2b: w = aya or w = byb for some string y. We'll assume that w = aya; the other case is similar. Since w has an equal number of a's and b's, it must be that y has two more b's than a's. By scanning y from left to right, keeping track of the number of a's and b's, it is clear that we can split y up into two strings u and v (y = uv) such that each of u and v has exactly one more b than a. But then w = auva, where au and va each have an equal number of a's as b's. Both au and va have length less than k, so by the induction hypothesis, both au and va can be generated by the grammar. But then S => SS =>^* auS =>^* auva = w, so w can also be generated by the grammar. QED
Note that in the proof above, we needed to use all four productions. Actually, this grammar is ambiguous (proof: exercise!).
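The generation procedure from 2/12 (write down the start variable, repeatedly splice in right-hand sides) can be animated directly on the grammar just proved correct. A Python sketch, with one caveat: the breadth-first search explores leftmost derivations of S -> \epsilon | SS | aSb | bSa, and the length cap on sentential forms is an ad-hoc device (an assumption, not from the notes) to keep the search finite:

```python
from collections import deque

RULES = ["", "SS", "aSb", "bSa"]   # right-hand sides for the single variable S

def generated(max_len):
    """Terminal strings of length <= max_len derivable from S, found by
    repeatedly splicing a rule's rhs in place of the leftmost S."""
    seen = {"S"}
    out = set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        i = form.find("S")
        if i < 0:                      # no variables remain: a derived string
            if len(form) <= max_len:
                out.add(form)
            continue
        for rhs in RULES:
            new = form[:i] + rhs + form[i + 1:]
            # cap sentential-form length so the search terminates
            if len(new) <= max_len + 2 and new not in seen:
                seen.add(new)
                queue.append(new)
    return out
```

Every string produced has equal numbers of a's and b's, matching the direction of the proof sketched above.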
An unambiguous grammar for the same language is
S -> \epsilon | aTbS | bUaS
T -> \epsilon | aTbT
U -> \epsilon | bUaU

PDAs
Finite state set Q
Finite input alphabet \Sigma
Finite stack alphabet \Gamma
Transition function \delta
Start state q_0 \in Q
Set of final (accepting) states F \subseteq Q
Notation: \Sigma_\epsilon = \Sigma \cup \{ \epsilon \}, \Gamma_\epsilon = \Gamma \cup \{ \epsilon \}
\delta : Q \times \Sigma_\epsilon \times \Gamma_\epsilon -> P(Q \times \Gamma_\epsilon)
Idea: suppose \delta(q,a,x) contains (r,y). This means that if we are in state q, reading a on the input, with x on top of the stack, then we are allowed to shift to state r, read a, pop x, and push y (this is one step). We start at the left of the input, in the start state q_0, with an empty stack. We accept if we can read through the entire string and wind up in a final state (stack contents arbitrary). If a = \epsilon, then we don't advance the input pointer. If x = \epsilon, we don't read/pop a symbol off the stack. If y = \epsilon, we don't push a symbol onto the stack.
Formal table and state diagram for a PDA recognizing { 0^n1^n | n \geq 0 }
\Gamma = {0,$}
Push $ first; this gives us an empty-stack test. (There is no built-in test in the definition of a PDA.) Note that this PDA is deterministic.

2/26/2001
State diagrams for PDAs for the languages above (for which we have grammars)
Here's a table for the "equal numbers of a's and b's" language, \Gamma = {$,a,b}:

Input: |        a        |        b        |     epsilon
Stack: | $ | a | b | eps | $ | a | b | eps | $ | a | b | eps
-------+---+---+---+-----+---+---+---+-----+---+---+---+----
   0   |   |   |   |     |   |   |   |     |   |   |   | 1,$
   1   |   |   |1,e| 1,a |   |1,e|   | 1,b |2,e|   |   |
   2   |   |   |   |     |   |   |   |     |   |   |   |

F = { 2 }, q_0 = 0 (nondeterministic)
Idea: during the computation, if d is the number of a's minus the number of b's read so far, then the stack contents has the same difference d in a's vs b's. The input string has the same number of a's as b's if and only if there is a computation that ends with only '$' on the stack.
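The deterministic PDA for { 0^n1^n | n >= 0 } sketched above (push $, then push a 0 for each 0 read, pop a 0 for each 1 read, and accept on a bare $) can be mirrored with an explicit stack. A Python sketch, not from the notes; the early returns play the role of the PDA's missing transitions, which lead to a dead (rejecting) state:

```python
def pda_0n1n(w: str) -> bool:
    """Deterministic stack simulation of the { 0^n 1^n | n >= 0 } PDA:
    push $ as an empty-stack marker, push a 0 for each 0 read,
    pop a 0 for each 1 read, and accept iff only $ remains at the end."""
    stack = ["$"]                      # the empty-stack test marker
    seen_one = False
    for c in w:
        if c == "0":
            if seen_one:               # a 0 after a 1: not of the form 0^n1^n
                return False
            stack.append("0")
        elif c == "1":
            seen_one = True
            if stack[-1] != "0":       # more 1's than 0's
                return False
            stack.pop()
        else:
            return False               # symbol outside the input alphabet
    return stack == ["$"]
```

For example, "0011" leaves exactly ["$"] and is accepted, while "001" leaves an unmatched 0 and is rejected.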
Formal definition of acceptance: Let M = (Q,\Sigma,\Gamma,\delta,q_0,F) be a PDA and w \in \Sigma^* a string. We say that _M_accepts_w_ if
there are w_1,...,w_m \in \Sigma_\epsilon (m \geq 0) with w = w_1...w_m,
there are states r_0,...,r_m with r_0 = q_0 and r_m \in F,
there are strings s_0,...,s_m \in \Gamma^* with s_0 = \epsilon, and
for each i \in {0,...,m-1}, there are a,b \in \Gamma_\epsilon and t \in \Gamma^* such that s_i = at, s_{i+1} = bt, and (r_{i+1},b) \in \delta(r_i,w_{i+1},a).
Notes: some w_i may be \epsilon; these correspond to \epsilon-moves. a is the symbol popped and b is the symbol pushed; popping or pushing \epsilon does not change the stack.

2/28/2001
PDA for the language { w | |w| is odd and its middle symbol is 0 }
Equivalence of PDAs and grammars. (Two directions)
Lemma 2.13: If a language is context-free, then some PDA recognizes it.
Proof: Let A be a CFL. Then A is generated by some CFG G. We convert G into an equivalent PDA P. P must accept a string w iff w is (leftmost) derivable according to G. P runs the derivation forward, keeping on its stack some suffix of the current state of the derivation.
Shorthand: we'll allow P to push any string onto the stack in "one step."
Stack alphabet: $, the variables, and the terminals of the grammar
Initialization: push $ and then the start variable onto the stack
Do forever:
a. If the top of the stack is $, enter the accept state.
b. If the top of the stack is a terminal a, pop it and read an input symbol. If the input symbol is a, then continue, else immediately reject.
c. If the top of the stack is a variable symbol V, nondeterministically select one of the rules for V, and replace V with the right-hand side of the rule.
Lemma 2.15: If a PDA recognizes some language, then that language is context-free.
Assume: the PDA has a unique accept state q_accept; it empties its stack before accepting; each transition either pushes a symbol onto the stack or pops a symbol off of the stack, but does not do both at the same time.
Variables: A_pq (p,q \in Q)
Start variable: A_{q_0 q_accept}

3/5/2001
Pumping Lemma for CFLs. Proof.

3/7/2001
Finish the proof of the Pumping Lemma. Application to the language { a^mb^nc^md^n | m,n >= 0 }
Intro to Turing machines
one-way infinite read/write tape
finite state control
accepting and rejecting states (two halting states; they take effect immediately)
start state
transitions (q,a) -> (q',a',L or R)

3/19/2001 (after Spring break)
Turing program M_1 for B = { w#w | w in {0,1}^* }:
On input w:
1. Scan the input to make sure it contains a single # symbol. If not, reject.
2. Zig-zag across #, crossing off the left symbol and checking the corresponding symbol on the right side. If different, reject. Otherwise, cross off the right symbol.
3. When all left symbols are crossed off, check the right side to see if any symbols are left. If so, reject; else, accept.
Formal Definition of a TM: (Q,Sigma,Gamma,delta,q_0,q_accept,q_reject) where Q, Sigma, Gamma are all finite sets, and
1. Q is the set of states,
2. Sigma is the input alphabet, not containing the special _blank_ symbol _,
3. Gamma is the tape alphabet, where blank \in Gamma and Sigma \subseteq Gamma,
4. delta : Q x Gamma -> Q x Gamma x {L,R} is the transition function,
5. q_0 \in Q is the start state,
6. q_accept \in Q is the accept state, and
7. q_reject \in Q is the reject state, where q_reject != q_accept
Start with input w \in Sigma^* on the leftmost part of the tape, the rest blanks, state q_0, and the tape head scanning the leftmost square. (If the machine tries to move off of the left end, the head stays put.)
Configuration uqv means the current state is q, the current tape contents is uv, and the head is scanning the leftmost symbol of v.
State diagram for the machine described above

3/21/2001
State diagram (continued)
Don't worry about stage 1. Start zig-zag immediately.
Simplified transition diagram for M_1 (0 is the start state):
0,0 -> 1,x,R    0,1 -> 2,x,R    0,# -> 7,#,R
1,0 -> 1,0,R    1,1 -> 1,1,R    1,# -> 3,#,R
2,0 -> 2,0,R    2,1 -> 2,1,R    2,# -> 4,#,R
3,x -> 3,x,R    3,0 -> 5,x,L
4,x -> 4,x,R    4,1 -> 5,x,L
5,x -> 5,x,L    5,# -> 6,#,L
6,0 -> 6,0,L    6,1 -> 6,1,L    6,x -> 0,x,R
7,x -> 7,x,R    7,_ -> accept
9 states (0 through 7, plus the accepting state). The rejecting state is omitted; all omitted edges go to the rejecting state.
Define start, accepting, and rejecting configurations
Define C_1 yields C_2 (special cases for the extreme ends of the input)
Define acceptance.
Turing-recognizable = computably enumerable
Turing-decidable = computable
Multitape Turing machines, k-tape. Input tape and work tapes.

3/26/2001
Discussion of the Myhill-Nerode Theorem
Simulating a k-tape TM with a 1-tape (original) TM: alternate method: use a bigger tape alphabet and a multitrack tape. (Each tape alphabet symbol encodes the corresponding contents of the k tapes, and, for each tape, whether or not the head is scanning that cell.) Technicality: we must first mark the beginning and end of the "active" region of the tape: on input w, convert to $w$, where $ is a new end marker. Each step of the k-tape machine is simulated by a full pass over the active region by the 1-tape machine: going left to right, gather information about which symbol is being scanned on which tape; going right to left, simulate the writing and head movement for each tape. All additional information, including the state of the k-tape machine, constitutes a fixed finite amount of information and so can be stored in the state of the simulating machine.

3/28/2001
Review the k-tape to 1-tape simulation
Nondeterministic TMs (NTMs as opposed to DTMs): analogous to NFAs: the transition function gives a set of possible successor states. A nondeterministic computation is best viewed as a tree (nodes = configurations, root = start configuration). The machine accepts if there is some path that leads to the accept state.
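The simplified transition table for M_1 can be executed mechanically. Here is a Python sketch of a one-tape TM simulator hard-wired with that table (an illustration, not from the notes); missing entries model the omitted edges that go to the rejecting state:

```python
# The w#w transition table, copied from the notes.
# Key: (state, scanned symbol) -> (new state, written symbol, head move).
# '_' is the blank; any missing entry means an immediate reject.
TM_DELTA = {
    (0, "0"): (1, "x", "R"), (0, "1"): (2, "x", "R"), (0, "#"): (7, "#", "R"),
    (1, "0"): (1, "0", "R"), (1, "1"): (1, "1", "R"), (1, "#"): (3, "#", "R"),
    (2, "0"): (2, "0", "R"), (2, "1"): (2, "1", "R"), (2, "#"): (4, "#", "R"),
    (3, "x"): (3, "x", "R"), (3, "0"): (5, "x", "L"),
    (4, "x"): (4, "x", "R"), (4, "1"): (5, "x", "L"),
    (5, "x"): (5, "x", "L"), (5, "#"): (6, "#", "L"),
    (6, "0"): (6, "0", "L"), (6, "1"): (6, "1", "L"), (6, "x"): (0, "x", "R"),
    (7, "x"): (7, "x", "R"), (7, "_"): ("accept", "_", "R"),
}

def accepts_ww(w: str) -> bool:
    """Run the table above on input w; True iff the machine accepts."""
    tape = list(w) + ["_"]
    state, head = 0, 0
    while state != "accept":
        move = TM_DELTA.get((state, tape[head]))
        if move is None:
            return False               # omitted edges go to the reject state
        state, sym, direction = move
        tape[head] = sym
        head += 1 if direction == "R" else -1
        head = max(head, 0)            # the head stays put at the left end
        if head == len(tape):
            tape.append("_")           # extend the tape with blanks as needed
    return True
```

For instance, the simulator accepts "01#01" and "#" but rejects "01#00".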
Theorem: For every NTM N there is an equivalent DTM D (i.e., a DTM recognizing the same language as the NTM).
Proof idea: D uses breadth-first search on N's computation tree (why not depth-first search?)
3 tapes for D:
tape 1 always contains the input and is never altered,
tape 2 maintains a copy of N's tape on some branch of its computation,
tape 3 keeps track of D's location in N's tree: a succession of numbers from 1,...,b, where b is the maximum branching of N.
Tape 3 counts up in (length-first) lex order. For each such address, the corresponding branch of N's computation is simulated from the beginning.

4/2/2001
An NTM is a _decider_ if it halts on all branches on all inputs (Koenig's lemma ==> the whole computation tree is finite). NTMs are equivalent to DTMs at deciding languages.
Mention enumerators
Thm: A language is T-recognizable iff some enumerator enumerates it.
Lots of different models (TMs of various sorts (Alan Turing 1936), the lambda-calculus (Alonzo Church 1936), various reasonable programming languages, etc.) are all equivalent, describing exactly the same class of algorithms, because they all can simulate each other.
Church-Turing Thesis: All these (equivalent) formal definitions precisely capture the intuitive (or even physical) notion of an algorithm.
The emphasis switches from TM to algorithm. Three levels of description: formal (TM), implementation level (informal TM), and high-level (algorithmic).
Encode objects O (numbers, graphs, polynomials, etc.) as strings <O> (inputs to TMs). Severally, O_1,...,O_k as <O_1,...,O_k>.

4/4/2001
We'll allow stationary head movements with multitape machines.
Chapter 4: Decidable languages:
A_{DFA} = { <B,w> | B is a DFA accepting w } (via TM M)
Implementation details with a multitape machine: write the description of B on a separate work tape, and keep track of the current state on another work tape. Read the input w as B would read it.
A_{NFA} = { <B,w> | B is an NFA accepting w }
(via TM N using M as a subroutine) N first converts B into an equivalent DFA C, then runs M on <C,w>.
A_{REX} = { <R,w> | w matches regexp R }
Convert R into an equivalent DFA A (Thm 1.28 is effective), then run M on input <A,w>.
E_{DFA} = { <A> | A is a DFA and L(A) = \emptyset }
Search from the start state of A for a final state:
T = "On input <A> where A is a DFA:
1. Mark the start state of A.
2. Repeat until no new states get marked:
3.   Mark any state that has a transition coming into it from any state that is already marked.
4. If no accept state is marked, accept, else reject."
EQ_{DFA} = { <A,B> | A and B are DFAs and L(A) = L(B) }
Construct a DFA C recognizing the language L(A) \symdiff L(B). Run T on <C>. (L \symdiff L' is the "symmetric difference" of L and L', defined as (L - L') \union (L' - L).)
A_{CFG} = { <G,w> | G is a CFG that generates string w }
TM S: Convert G to Chomsky Normal Form. Then any derivation of w has exactly 2|w|-1 steps.
E_{CFG} = { <G> | G is a CFG and L(G) = \emptyset }
R = "On input <G> where G is a CFG:
1. Mark all terminal symbols in G.
2. Repeat until no new variables get marked:
3.   Mark any variable A where G has a rule A -> U_1U_2...U_k and each symbol U_i has already been marked.
4. If the start symbol is not marked, accept, else reject."
EQ_{CFG} = { <G,H> | G and H are CFGs and L(G) = L(H) }
This language is not decidable!
Thm: Every CFL is decidable.
Proof: Suppose L is a CFL. Let G be a CFG such that L = L(G).
"On input w:
1. Run S on input <G,w>.
2. If S accepts, accept, else reject."
regular => context-free => decidable => Turing-recognizable
We know the first two =>'s are strict. What about the last one?
The Halting Problem (4.2)
A_{TM} = { <M,w> | M is a TM and M accepts w }
A_{TM} is undecidable. A_{TM} is Turing-recognizable:
U = "On input <M,w>, where M is a TM and w is a string:
1. Simulate M on input w.
2. If M ever enters its accept state, accept; if M ever enters its reject state, reject."
U loops on <M,w> if M loops on w, so U does not decide A_{TM}.
If U had some way of finding out that M would not halt on w, then it could reject.
A_{TM} is sometimes called the halting problem.
U is a universal Turing machine (first proposed by Alan Turing). It can simulate any other TM. U inspired the stored-program computer: M is the program, w its input.
Diagonalization Method (Georg Cantor, 1873): used to prove that there are uncountably many reals, etc. We'll skip over this mostly.
Thm: A_{TM} is undecidable.
Proof: Assume (for the purpose of contradiction) that there is a TM H which is a decider for A_{TM}. That is, for all M, w:
M is a TM accepting w => H on input <M,w> halts and accepts,
M is a TM not accepting w => H on input <M,w> halts and rejects.
Let D be the following machine:
D = "On input <M> where M is a TM:
1. Run H on input <M,<M>>.
2. Output the opposite of the output of H (accept -> reject; reject -> accept)."
Thus for every TM M,
D(<M>) = accept if M does not accept <M>,
D(<M>) = reject if M accepts <M>.
What about D(<D>)?

4/9/2001
Review the proof that A_{TM} is not decidable. Explain in terms of the prediction-failure paradox. (Explain in terms of diagonalization?)
Thm: A is decidable if both A and A-bar are Turing-recognizable.
Cor: A_{TM}-bar is not Turing-recognizable.
Chapter 5: reducibility
Problem A reduces to problem B if any solution for B yields a solution for A. If A and B are languages, then we say that A reduces to B if any decision procedure for B yields a decision procedure for A. A reduction (from A to B) is an algorithm that decides A by using answers to questions about membership in B "for free."
Key fact: Suppose A reduces to B. Then
- if B is decidable, then A is decidable
- if A is undecidable, then B is undecidable
Example: A_{NFA} reduces to A_{DFA}; hence, A_{NFA} is decidable.
HALT = { <M,w> | M is a TM that halts on input w } (HALT is the real halting problem)
Thm: A_{TM} reduces to HALT.
Pf: On input <M,w>:
1. Ask if <M,w> is in HALT (assumed subroutine for HALT).
2. If no, reject.
3. If yes, run M on input w.
4. If M(w) accepts, accept.
5. If M(w) rejects, reject.
So, HALT is undecidable.
E_{TM} = { <M> | M is a TM and L(M) = \emptyset }
Thm: A_{TM} reduces to E_{TM}
Pf: On input <M,w>, construct <N> such that N rejects all x \neq w, but simulates M on w when the input x = w.

4/11/2001 Second midterm exam
4/16/2001 (no class; Easter)

4/18/2001
A Linear Bounded Automaton (LBA) is a TM whose tape head is not permitted to move off the input (i.e., a finite tape). (The head instead stays where it is.)
A_{LBA} = { <M,w> | M is an LBA that accepts input w }
E_{LBA} = { <M> | M is an LBA and L(M) = \emptyset }
Theorem: A_{LBA} is decidable.
Proof: Let q = |Q| and g = |\Gamma|. There are exactly qng^n possible distinct configurations of M for a tape of length n. Run M on input w for qng^n steps (where n = |w|) or until it halts. If M runs longer than qng^n steps, it has repeated a configuration and so loops forever. If M has accepted, then accept, else reject.
Theorem: E_{LBA} is undecidable.
Proof (reduction from A_{TM} using the computation-history method): Let M be a TM and w an input string. Construct an LBA B_{M,w} that accepts an input x iff x is the complete history of an accepting computation of M on input w, i.e., x = $C_1#C_2#...#C_k$, where the C_i are the successive configurations of M on input w.
B = "On input x:
1. (B can find C_1,...,C_k when necessary, using the delimiters.)
2. Check that C_1 is the start configuration of M on input w, that is, C_1 = q_0w.
3. Check that each C_{i+1} legally follows from C_i by the rules of M.
4. Check that C_k is an accepting configuration, that is, C_k = ...q_{accept}..."
(The tape alphabet of B is Q \union \Gamma \union {#,$}, where \Gamma is the tape alphabet of M. Assume these sets are disjoint.)
Mapping Reducibility (a.k.a. many-one reducibility, m-reducibility)
A function f : \Sigma^* -> \Sigma^* is a _computable_function_ if some TM M, on every input w, halts with just f(w) on its tape.
Ex: all the usual arithmetic operations on integers (in binary) are computable functions.
Ex: transformations of machine descriptions.
Def: Let A and B be languages.
We say that A is _mapping_reducible_ to B (A \leq_m B) if there is a computable function f such that, for every w, w \in A <==> f(w) \in B. f is a _reduction_ of A to B.
Thm: If A \leq_m B and B is decidable (T-recognizable), then A is decidable (T-recognizable). Contrapositives are useful.
Ex: E_{TM} \leq_m EQ_{TM}

4/23/2001
(start with the proof of the previous theorem)
Is \leq_m transitive? reflexive? symmetric?
Another example: A_{TM} \leq_m HALT_{TM}
PCP: Fix an alphabet. A _domino_ is of the form [w/x], where w and x are strings over the alphabet. A _match_ is a finite sequence [w_1/x_1], [w_2/x_2], ..., [w_k/x_k] of dominoes such that w_1w_2...w_k = x_1x_2...x_k. Give an example.
Post Correspondence Problem (PCP): Given a finite set P of dominoes, is there a match with all dominoes taken from P (repetitions allowed)?
PCP = { <P> | P is a set of dominoes with a match }
Theorem: PCP is undecidable.
Proof Sketch: We show that A_{TM} \leq_m PCP. Let M and w be given. Assume: M never tries to move off the left end of the tape (this assumption can be dropped later); any match must start with the first domino (call this problem MPCP; we show later that A_{TM} \leq_m MPCP \leq_m PCP).
Build a set P' of dominoes:
Part 1: put [#/q_0w#] into P'
Part 2: for each a,b \in \Gamma and q,r \in Q, if \delta(q,a) = (r,b,R), put [qa/br] into P'
Part 3: for each a,b,c \in \Gamma and q,r \in Q, if \delta(q,a) = (r,b,L), put [cqa/rcb] into P'
Part 4: for every a \in \Gamma, put [a/a] into P'
Part 5: put [#/#] and [#/_#] into P'
Part 6: for each a \in \Gamma, put [aq_{accept}/q_{accept}] and [q_{accept}a/q_{accept}] into P'
Part 7: put [q_{accept}##/#] into P'
P' has a match starting with the first domino iff M accepts w. //

4/25/2001
We have shown that A_{TM} \leq_m MPCP. Now, to show that MPCP \leq_m PCP, we show how to convert P' into an equivalent instance P of PCP: replace each [t/u] in P' with [*t/u*], then add [*t_1/*u_1*] and [*$/$].
Rice's Theorem (Exercise 5.22): Let P be any language (usually a problem about TMs) that satisfies:
a. For any TMs M_1 and M_2, if L(M_1) = L(M_2), then <M_1> \in P iff <M_2> \in P. [P is known as an "index set"]
b. There exist TMs M_1 and M_2 such that <M_1> \in P and <M_2> \not\in P. [P is nontrivial]
Then P is undecidable. In fact, either A_{TM} \leq_m P or A_{TM} \leq_m P-bar (the complement of P).
Upshot: any nontrivial problem about TMs that depends only on the languages they recognize is undecidable. For example, the following languages are undecidable:
{ <M> | L(M) = \emptyset }
{ <M> | \epsilon \not\in L(M) }
{ <M> | L(M) = \Sigma^* }
{ <M> | L(M) is finite }
{ <M> | L(M) is infinite }
{ <M> | L(M) is cofinite (L(M)-bar is finite) }
{ <M> | L(M) has exactly 17 elements }
etc.
Proof: Fix a TM M_0 such that L(M_0) = \emptyset. Suppose first that <M_0> \not\in P. Then since <M_1> \in P, we have L(M_0) != L(M_1).
Consider a computable function f that takes an input where M is a TM and w\in\Sigma^* and outputs where N is a TM that behaves as follows: N = "On input x: 1. Run M on input w. 2. If or when M accepts w, then run M_1 on input x and do as M_1 does. 3. If M (ever) rejects w, then reject." Suppose M accepts w. Then for all x, N accepts x iff M_1 accepts x, so L(N) = L(M_1) and so \in P. On the other hand, if M does not accept x, then L(N) = \emptyset = L(M_0), and so \not\in P. Thus A_{TM} \leq_m P via the reduction f. Now if \in P, then we do the same proof as above with M_2 instead of M_1, and get an m-reduction of A_{TM} to P-bar. // 6.1: The Recursion Theorem Codifies self-reference. Lemma: there is a computable function q : \Sigma^* -> \Sigma^* such that, on input w, q(w) is the description of a machine P_w that prints out w and then halts. Self-printing programs Recursion Theorem: Let T be a TM that computes a function t : \Sigma^* x \Sigma^* -> \Sigma^*. There is a TM R that computes a function r : \Sigma^* -> \Sigma^*, where for every w, r(w) = t(,w) (R behaves as t does, but with its own description filled in automatically) Apps: A_{TM} is undecidable. Define MIN_{TM}, show it is not T-recognizable. 4/30/2001 The Recursion Theorem ("institutionalizes" machine self-reference). Theorem (Recursion Theorem): Let t be any computable function taking any two strings as input and outputing a single string (t : \Sigma^* x \Sigma^* -> \Sigma^*). Then there is a TM R computing a function r : \Sigma^* -> \Sigma^* such that, for all w \in \Sigma^*, r(w) = t(,w). Lemma: There is a computable function q : \Sigma^* -> \Sigma^* such that, for all w\in\Sigma^*, q(w) = , where P_w is a TM that erases its input, prints w, then accepts. Proof: Let w = w_1w_2...w_n. Then P_w has the following transition diagram: q_0 -----------> q_1 ----------> ... 
q_0 --(w_1,R)--> q_1 --(w_2,R)--> ... --(w_n,R)--> q_n --(blank: R)--> accept
together with a single loop on q_n that erases any of the input that is left:
q_n --((anything but blank): blank,R)--> q_n
Clearly, P_w behaves as advertised, and can be computably generated given w. //
Lemma: There is a computable function c : \Sigma^* x \Sigma^* -> \Sigma^* such that, for any TMs A and B, c(<A>,<B>) = <AB>, where AB is a TM which first runs A with a blank input tape, then runs B on A's output as B's first input (B may take additional inputs, in which case these are considered inputs to AB).
Proof: AB first simulates A on a separate tape (we can assume WLOG that AB is a 2-tape machine). When A finishes, AB prepends A's output onto whatever is on the input tape, resets the head to the left, then runs B. <AB> can be computed from <A> and <B> in a straightforward way, using the fact that any description of a 2-tape TM can be computably converted into the description of an equivalent 1-tape (standard) TM. //
Proof of the Recursion Theorem: Let t be given. Let B be the following TM:
B = "On input <M> and w, where M is a TM:
1. Compute q(<M>) = a = <P_<M>>.
2. Run P_<M> (with a blank tape), getting output b.
3. Compute r = c(a,b).
4. Compute and output t(r,w)."
Let A = P_<B>, and let R = AB. On any input w, R behaves as follows:
1. Run A, which outputs <B>.
2. Run B with inputs <B> (the output of A) and w:
a. Compute a = q(<B>) = <P_<B>> (= <A>).
b. Run A (with a blank tape), getting output b = <B> (since A = P_<B>).
c. Compute r = c(a,b) (= c(<A>,<B>) = <AB> = <R>).
d. Compute and output t(r,w) (= t(<R>,w)). //
Using the recursion theorem: When defining an algorithm to be implemented on a TM M, we may freely assume that M has access to its own description <M> on its tape. So a legitimately implementable algorithm can look something like
M = "On input w: ... <M> ... w ..."
Proof: Let T be the algorithm
T = "On input <N> and w, where N is a TM: ... <N> ... w ..."
Then letting R be as in the recursion theorem, we have
R = "On input w: ... <R> ... w ..."
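(The lemma's function q has a direct programming analogue. Here is a Python sketch, with my own naming: given w, it returns the source code of a program that prints w and halts, with repr() playing the role of the q_0, ..., q_n transition chain.)

```python
# Python analogue of q from the lemma: map w to the source code of a
# program P_w that prints exactly w and then halts.
def q(w):
    # repr(w) produces a correctly quoted string literal for w,
    # so the returned text is a valid self-contained program.
    return "print(" + repr(w) + ", end='')"
```

Running exec(q("abc")) prints exactly abc, just as running P_w leaves exactly w on the tape.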
So R can be used for our machine M above. R can refer to, examine, simulate, or otherwise use itself during its computation. In particular, it can call itself recursively. This allows algorithms that are freely recursive. For example:
P = "On input <b,e>, where b and e are natural numbers:
1. If e = 0, then output 1.
2. If e > 0, then run P on input <b, e-1>, and let p be its output.
3. Output p times b."
P implements integer exponentiation using recursion.
Besides making recursive calls, a machine can also examine its own description. For example, there is a TM SELF which outputs its own description (ignoring its input):
SELF = "Output <SELF>."
There's a machine that returns the number of states it has:
S = "Output the size of the state set of S."
It's important to realize that the Recursion Theorem doesn't give us any additional control over what these machines are, only that they exist and implement the specified algorithms. For example, we can't make any assumptions about how many states S actually has.

5/2/2001

Some Final Exam-like sample problems with solutions.

1. Give the state diagram of a TM that, on input 1^i#1^j with i >= j, accepts, leaving 1^{i-j} on its tape. It will not accept any input not of this form.

Answer: Rows are indexed by states 0,1,2,3,a,r; columns are indexed by tape alphabet symbols (_ is the blank). 0 is the start state, a is the accepting state, and r is the rejecting state. Entries are of the form qsd, where q is the new state, s is the written symbol, and d is the direction of head movement. Entries with r as the new state are not shown. The symbol $ is a symbol not in the input alphabet.

state |  1  |  #  |  $  |  _
------+-----+-----+-----+-----
  0   | 01R | 1$R |     |
  1   | 2$L |     | 1$R | 3_L
  2   | 1$R |     | 2$L |
  3   | a1L |     | 3_L | a_L

3. Define the empty-recognition problem as
EMPTY-REC = { <M> | M is a TM and L(M) = {\epsilon} }
That is, EMPTY-REC is the set of machines that accept the empty string and nothing else. Describe a mapping reduction of A_{TM} to EMPTY-REC.
Is EMPTY-REC Turing-recognizable?

Answer: Let f be a computable function such that, for any TM M and string w, f(<M,w>) = <N>, where N is a TM which behaves as follows:
N = "On input x:
1. If x != \epsilon, reject.
2. Run M on input w.
3. If M accepts w, then accept.
4. If M halts (rejecting w), then reject."
By step 1, we know that either L(N) is empty or L(N) = {\epsilon}. Further, we have
<M,w> \in A_{TM} <==> M accepts w <==> L(N) = {\epsilon} <==> <N> \in EMPTY-REC,
so f reduces A_{TM} to EMPTY-REC.
EMPTY-REC is _not_ Turing-recognizable: (not finished ...)

4. Using the Recursion Theorem, we can define the following TM M:
M = "On input w \in {0,1}^*:
1. If w = \epsilon, then accept.
2. Otherwise let w = ya, where a is the last symbol of w.
3. Run M on input y.
4. If M accepts y and a = 0, then accept.
5. If M halts and rejects y and a = 1, then accept.
6. Reject."
What is the behavior of M on input 0010110? Accept? Reject? Loop?

Answer: M rejects this string. In general, M accepts all strings with an even number of 1s, and halts and rejects all other strings.

5. Let f be some function (outputting natural numbers) such that for any TM M, if L(M) is finite, then f(<M>) = the number of elements of L(M). Use the Recursion Theorem to show that f cannot be computable.

Answer: Suppose there is a computable f as above. By the Recursion Theorem, we know there is a TM R which behaves as follows:
R = "On input n >= 0:
1. If n <= f(<R>) then accept, else reject."
Let m = f(<R>). Then L(R) = {0,1,2,...,m}, and so L(R) has m+1 many elements. So L(R) is finite, but f(<R>) = m, which is not the cardinality of L(R), so f doesn't give the correct output on input <R>.
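(As a check on problem 4, the machine M can be transcribed into Python, with the Recursion Theorem's self-reference rendered as an ordinary recursive call; True stands for accept and False for reject. This transcription is my own, not part of the problem.)

```python
def M(w):
    """The TM M from problem 4: accepts iff w has an even number of 1s."""
    if w == "":
        return True                  # step 1: accept the empty string
    y, a = w[:-1], w[-1]             # step 2: split off the last symbol
    r = M(y)                         # step 3: recursion-theorem self-call
    if r and a == "0":
        return True                  # step 4
    if not r and a == "1":
        return True                  # step 5
    return False                     # step 6
```

It returns False on 0010110 (which has three 1s) and, in general, returns True exactly on strings with an even number of 1s, matching the answer above.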