Restricted computational models
- See that Turing completeness is not always a good thing.
- Another example of an always-halting formalism: context-free grammars and simply typed $\lambda$ calculus.
- The pumping lemma for non context-free functions.
- Examples of computable and uncomputable semantic properties of regular expressions and context-free grammars.
“Happy families are all alike; every unhappy family is unhappy in its own way”, Leo Tolstoy (opening of the book “Anna Karenina”).
We have seen that many models of computation are Turing equivalent, including Turing machines, NAND-TM/NAND-RAM programs, standard programming languages such as C/Python/Javascript, as well as other models such as the $\lambda$ calculus and even the game of life. The flip side of this is that for all these models, Rice’s theorem (Theorem 9.15) holds as well, which means that any non-trivial semantic property of programs in such a model is uncomputable.
The uncomputability of halting and other semantic specification problems for Turing equivalent models motivates restricted computational models that are (a) powerful enough to capture a set of functions useful for certain applications but (b) weak enough that we can still solve semantic specification problems on them. In this chapter we discuss several such examples.
We can use restricted computational models to bypass limitations such as uncomputability of the Halting problem and Rice’s Theorem. Such models can compute only a restricted subclass of functions, but allow us to answer at least some semantic questions on programs.

Turing completeness as a bug
We have seen that seemingly simple computational models or systems can turn out to be Turing complete. The following webpage lists several examples of formalisms that “accidentally” turned out to be Turing complete, including supposedly limited languages such as the C preprocessor, CSS, (certain variants of) SQL, sendmail configuration, as well as games such as Minecraft, Super Mario, and the card game “Magic: The Gathering”. Turing completeness is not always a good thing, as it means that such formalisms can give rise to arbitrarily complex behavior. For example, the PostScript format (a precursor of PDF) is a Turing-complete programming language meant to describe documents for printing. The expressive power of PostScript can allow for short descriptions of very complex images, but it also gave rise to some nasty surprises, such as the attacks described in this page, ranging from using infinite loops as a denial of service attack to accessing the printer’s file system.
An interesting recent example of the pitfalls of Turing-completeness arose in the context of the cryptocurrency Ethereum. The distinguishing feature of this currency is the ability to design “smart contracts” using an expressive (and in particular Turing-complete) programming language. In our current “human operated” economy, Alice and Bob might sign a contract to agree that if condition X happens then they will jointly invest in Charlie’s company. Ethereum allows Alice and Bob to create a joint venture where Alice and Bob pool their funds together into an account that will be governed by some program that decides under what conditions it disburses funds from it. For example, one could imagine a piece of code that interacts between Alice, Bob, and some program running on Bob’s car that allows Alice to rent out Bob’s car without any human intervention or overhead.
Specifically, Ethereum uses the Turing-complete programming language Solidity, which has a syntax similar to JavaScript. The flagship of Ethereum was an experiment known as the “Decentralized Autonomous Organization” or the DAO. The idea was to create a smart contract that would create an autonomously run decentralized venture capital fund, without human managers, where shareholders could decide on investment opportunities. The DAO was at the time the biggest crowdfunding success in history. At its height the DAO was worth 150 million dollars, which was more than ten percent of the total Ethereum market. Investing in the DAO (or entering any other “smart contract”) amounts to providing your funds to be run by a computer program, i.e., “code is law”, or to use the words with which the DAO described itself: “The DAO is borne from immutable, unstoppable, and irrefutable computer code”. Unfortunately, it turns out that (as we saw in Chapter 9) understanding the behavior of computer programs is quite a hard thing to do. A hacker (or perhaps, some would say, a savvy investor) was able to fashion an input that caused the DAO code to enter into an infinite recursive loop in which it continuously transferred funds into the hacker’s account, thereby cleaning out about 60 million dollars out of the DAO. While this transaction was “legal” in the sense that it complied with the code of the smart contract, it was obviously not what the humans who wrote this code had in mind. The Ethereum community struggled with the response to this attack. Some tried the “Robin Hood” approach of using the same loophole to drain the DAO funds into a secure account, but it only had limited success. Eventually, the Ethereum community decided that the code can be mutable, stoppable, and refutable. Specifically, the Ethereum maintainers and miners agreed on a “hard fork” (also known as a “bailout”) to revert history to before the hacker’s transaction occurred.
Some community members strongly opposed this decision, and so an alternative currency called Ethereum Classic was created that preserved the original history.
Context free grammars
If you have ever written a program, you’ve experienced a syntax error. You probably also had the experience of your program entering into an infinite loop. What is less likely is that the compiler or interpreter entered an infinite loop while trying to figure out if your program has a syntax error.
When a person designs a programming language, they need to determine its syntax. That is, the designer decides which strings correspond to valid programs, and which ones do not (i.e., which strings contain a syntax error). To ensure that a compiler or interpreter always halts when checking for syntax errors, language designers typically do not use a general Turing-complete mechanism to express their syntax. Rather they use a restricted computational model. One of the most popular choices for such models is context free grammars.
To explain context free grammars, let us begin with a canonical example. Consider the function $ARITH:\Sigma^* \rightarrow \{0,1\}$ that takes as input a string $x$ over the alphabet $\Sigma = \{ (,),+,-,\times,\div,0,1,2,3,4,5,6,7,8,9 \}$ and returns $1$ if and only if the string $x$ represents a valid arithmetic expression. Intuitively, we build expressions by applying an operation such as $+$, $-$, $\times$ or $\div$ to smaller expressions, or enclosing them in parentheses, where the “base case” corresponds to expressions that are simply numbers. More precisely, we can make the following definitions:
- A digit is one of the symbols $0,1,2,3,4,5,6,7,8,9$.
- A number is a sequence of digits. (For simplicity we drop the condition that the sequence does not have a leading zero, though it is not hard to encode it in a context-free grammar as well.)
- An operation is one of $+,-,\times,\div$.
- An expression has either the form “number”, the form “sub-expression1 operation sub-expression2”, or the form “(sub-expression1)”, where “sub-expression1” and “sub-expression2” are themselves expressions. (Note that this is a recursive definition.)
A context free grammar (CFG) is a formal way of specifying such conditions. A CFG consists of a set of rules that tell us how to generate strings from smaller components. In the above example, one of the rules is “if $exp1$ and $exp2$ are valid expressions, then $exp1 \times exp2$ is also a valid expression”; we can also write this rule using the shorthand $expression \Rightarrow expression \times expression$. As in the above example, the rules of a context-free grammar are often recursive: the rule $expression \Rightarrow expression \times expression$ defines valid expressions in terms of itself. We now formally define context-free grammars:
Let $\Sigma$ be some finite set. A context free grammar (CFG) over $\Sigma$ is a triple $(V,R,s)$ such that:
- $V$, known as the variables, is a set disjoint from $\Sigma$.
- $s \in V$ is known as the initial variable.
- $R$ is a set of rules. Each rule is a pair $(v,z)$ with $v \in V$ and $z \in (\Sigma \cup V)^*$. We often write the rule $(v,z)$ as $v \Rightarrow z$ and say that the string $z$ can be derived from the variable $v$.
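The example above of well-formed arithmetic expressions can be captured formally by the following context free grammar:
- The alphabet $\Sigma$ is $\{ (,),+,-,\times,\div,0,1,2,3,4,5,6,7,8,9 \}$.
- The variables are $V = \{ expression, number, digit, operation \}$.
- The rules are the set $R$ containing the following $19$ rules:
  - The $4$ rules $operation \Rightarrow +$, $operation \Rightarrow -$, $operation \Rightarrow \times$, and $operation \Rightarrow \div$.
  - The $10$ rules $digit \Rightarrow 0$, $\ldots$, $digit \Rightarrow 9$.
  - The rule $number \Rightarrow digit$.
  - The rule $number \Rightarrow digit \; number$.
  - The rule $expression \Rightarrow number$.
  - The rule $expression \Rightarrow expression \; operation \; expression$.
  - The rule $expression \Rightarrow (expression)$.
- The starting variable is $expression$.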
People use many different notations to write context free grammars. One of the most common notations is the Backus–Naur form. In this notation we write a rule of the form $v \Rightarrow a$ (where $v$ is a variable and $a$ is a string) in the form <v> := a. If we have several rules of the form $v \Rightarrow a$, $v \Rightarrow b$, and $v \Rightarrow c$ then we can combine them as <v> := a|b|c. (In words we say that $v$ can derive either $a$, $b$, or $c$.) For example, the Backus-Naur description for the context free grammar of Example 10.3 is the following (using ASCII equivalents for operations):
operation := +|-|*|/
digit := 0|1|2|3|4|5|6|7|8|9
number := digit|digit number
expression := number|expression operation expression|(expression)
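The four Backus-Naur rules above translate almost line by line into a recursive membership test. The following is a minimal Python sketch, not an efficient parser: the function names `is_number` and `is_expression` are hypothetical, and the memoized recursion simply tries every position at which the string could split into "expression operation expression".

```python
from functools import lru_cache

DIGITS = "0123456789"
OPERATIONS = "+-*/"  # ASCII stand-ins for the four operations, as in the BNF


def is_number(s: str) -> bool:
    # number := digit | digit number   (a non-empty sequence of digits)
    return len(s) > 0 and all(ch in DIGITS for ch in s)


@lru_cache(maxsize=None)
def is_expression(s: str) -> bool:
    # expression := number
    if is_number(s):
        return True
    # expression := (expression)
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")" and is_expression(s[1:-1]):
        return True
    # expression := expression operation expression
    # Try every position i whose symbol is an operation.
    for i in range(1, len(s) - 1):
        if s[i] in OPERATIONS and is_expression(s[:i]) and is_expression(s[i + 1:]):
            return True
    return False


print(is_expression("(1+2)*3"))  # generated by the grammar
print(is_expression("1+"))       # not generated: missing right operand
```

Every recursive call is on a strictly shorter string, so the procedure always halts, which is the property that motivates using a restricted model for syntax in the first place.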
Another example of a context free grammar is the “matching parentheses” grammar, which can be represented in Backus-Naur as follows:
match := ""|match match|(match)
A string over the alphabet $\{ (, ) \}$ can be generated from this grammar (where match is the starting expression and "" corresponds to the empty string) if and only if it consists of a matching set of parentheses. In contrast, by Lemma 6.20 there is no regular expression that matches a string $x$ if and only if $x$ contains a valid sequence of matching parentheses.
Context-free grammars as a computational model
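We can think of a context-free grammar over the alphabet $\Sigma$ as defining a function that maps every string $x$ in $\Sigma^*$ to $1$ or $0$ depending on whether $x$ can be generated by the rules of the grammar. We now make this definition formally.
If $G=(V,R,s)$ is a context-free grammar over $\Sigma$, then for two strings $\alpha,\beta \in (\Sigma \cup V)^*$ we say that $\beta$ can be derived in one step from $\alpha$, denoted by $\alpha \Rightarrow_G \beta$, if we can obtain $\beta$ from $\alpha$ by applying one of the rules of $G$. That is, we obtain $\beta$ by replacing in $\alpha$ one occurrence of the variable $v$ with the string $z$, where $v \Rightarrow z$ is a rule of $G$.
We say that $\beta$ can be derived from $\alpha$, denoted by $\alpha \Rightarrow_G^* \beta$, if it can be derived by some finite number of steps. That is, if there are $\alpha_1,\ldots,\alpha_{k-1} \in (\Sigma \cup V)^*$, so that $\alpha \Rightarrow_G \alpha_1 \Rightarrow_G \alpha_2 \Rightarrow_G \cdots \Rightarrow_G \alpha_{k-1} \Rightarrow_G \beta$.
We say that $x \in \Sigma^*$ is matched by $G=(V,R,s)$ if $x$ can be derived from the starting variable $s$ (i.e., if $s \Rightarrow_G^* x$). We define the function computed by $(V,R,s)$ to be the map $\Phi_{V,R,s}:\Sigma^* \rightarrow \{0,1\}$ such that $\Phi_{V,R,s}(x)=1$ iff $x$ is matched by $(V,R,s)$. A function $F:\Sigma^* \rightarrow \{0,1\}$ is context free if $F = \Phi_{V,R,s}$ for some CFG $(V,R,s)$.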
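To make the notions of one-step derivation and derivation concrete, here is a small Python sketch. It assumes the matching-parentheses grammar has the three rules match ⇒ "", match ⇒ match match and match ⇒ ( match ); the names `one_step` and `derivable_within` are hypothetical. Strings over $\Sigma \cup V$ are represented as tuples of symbols. Note that the search is bounded by `max_steps`, so by itself it only confirms derivations up to that length; it is an illustration of the definition, not a decision procedure.

```python
# Rules are (variable, right-hand side) pairs; right-hand sides are tuples of symbols.
RULES = [
    ("match", ()),                    # match => ""
    ("match", ("match", "match")),    # match => match match
    ("match", ("(", "match", ")")),   # match => ( match )
]
VARIABLES = {"match"}


def one_step(alpha, rules):
    """Return every beta that can be derived in one step from alpha."""
    successors = set()
    for i, symbol in enumerate(alpha):
        for variable, rhs in rules:
            if symbol == variable:
                # Replace this single occurrence of the variable by rhs.
                successors.add(alpha[:i] + rhs + alpha[i + 1:])
    return successors


def derivable_within(start, target, rules, variables, max_steps):
    """Search for a derivation of `target` from `start` using at most max_steps steps."""
    target = tuple(target)
    frontier = {(start,)}
    seen = set(frontier)
    for _ in range(max_steps):
        new_frontier = set()
        for alpha in frontier:
            for beta in one_step(alpha, rules):
                if beta == target:
                    return True
                # Alphabet symbols never disappear, so prune forms with too many of them.
                terminals = sum(1 for sym in beta if sym not in variables)
                if beta not in seen and terminals <= len(target):
                    seen.add(beta)
                    new_frontier.add(beta)
        frontier = new_frontier
    return False


print(derivable_within("match", "(())", RULES, VARIABLES, max_steps=5))
print(derivable_within("match", ")(", RULES, VARIABLES, max_steps=5))
```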
A priori it might not be clear that the map is computable, but it turns out that this is the case.
For every CFG $(V,R,s)$ over $\{0,1\}$, the function $\Phi_{V,R,s}:\{0,1\}^* \rightarrow \{0,1\}$ is computable.
As usual we restrict attention to grammars over $\{0,1\}$ although the proof extends to any finite alphabet $\Sigma$.
We only sketch the proof. We start with the observation that we can convert every CFG to an equivalent version in Chomsky normal form, where all rules either have the form $u \Rightarrow vw$ for variables $u,v,w$ or the form $u \Rightarrow \sigma$ for a variable $u$ and symbol $\sigma \in \Sigma$, plus potentially the rule $s \Rightarrow ""$ where $s$ is the starting variable.
The idea behind such a transformation is to simply add new variables as needed, and so for example we can translate a rule such as $v \Rightarrow u\sigma w$ into the three rules $v \Rightarrow ur$, $r \Rightarrow tw$ and $t \Rightarrow \sigma$.
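The variable-adding step can be sketched in a few lines of Python. This is only the part of the Chomsky normal form conversion that breaks up a long right-hand side; handling rules that derive the empty string or a single variable is not shown. The function name `to_binary_rules` and the fresh-variable names `t0`, `r1`, ... are hypothetical, and the sketch assumes those names do not already occur in the grammar.

```python
import itertools


def to_binary_rules(variable, rhs, terminals, fresh):
    """Split the rule variable => rhs (with len(rhs) >= 2) into rules of the
    forms u => v w and t => sigma, introducing fresh variables as needed."""
    rules = []
    symbols = []
    # Step 1: hide each alphabet symbol behind a fresh variable t with t => sigma.
    for sym in rhs:
        if sym in terminals:
            t = f"t{next(fresh)}"
            rules.append((t, (sym,)))
            symbols.append(t)
        else:
            symbols.append(sym)
    # Step 2: peel off one variable at a time until two symbols remain.
    head = variable
    while len(symbols) > 2:
        r = f"r{next(fresh)}"
        rules.append((head, (symbols[0], r)))
        head = r
        symbols = symbols[1:]
    rules.append((head, tuple(symbols)))
    return rules


# The example from the text: v => u sigma w with sigma = 0.
for rule in to_binary_rules("v", ("u", "0", "w"), {"0", "1"}, itertools.count()):
    print(rule)
```

On this input the sketch produces the same three rules as in the text, with `r1` playing the role of $r$ and `t0` the role of $t$.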
Using the Chomsky Normal form we get a natural recursive algorithm for computing whether $s \Rightarrow_G^* x$ for a given grammar $G$ and string $x$. We simply try all possible guesses for the first rule $s \Rightarrow uv$ that is used in such a derivation, and then all possible ways to partition $x$ as a concatenation $x=x'x''$ of two non-empty strings. If we guessed the rule and the partition correctly, then this reduces our task to checking whether $u \Rightarrow_G^* x'$ and $v \Rightarrow_G^* x''$, which (as it involves shorter strings) can be done recursively. The base cases are when $x$ is empty or a single symbol, and can be easily handled.
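The recursive algorithm just described can be sketched as follows. The representation is an assumption made for illustration: rules $u \Rightarrow vw$ are triples `(u, v, w)`, rules $u \Rightarrow \sigma$ are pairs `(u, sigma)`, and a flag records whether the rule $s \Rightarrow ""$ is present. The helper name `make_matcher` is hypothetical, and memoization is added so that each (variable, substring) pair is examined only once.

```python
from functools import lru_cache


def make_matcher(binary_rules, terminal_rules, start, start_derives_empty=False):
    """Return a function x -> bool deciding whether the CNF grammar matches x."""

    @lru_cache(maxsize=None)
    def derives(u, x):
        # Base case: a single symbol must come from a rule u => sigma.
        if len(x) == 1:
            return (u, x) in terminal_rules
        # Guess the first rule u => v w and a split of x into two non-empty parts.
        for head, v, w in binary_rules:
            if head != u:
                continue
            for i in range(1, len(x)):
                if derives(v, x[:i]) and derives(w, x[i:]):
                    return True
        return False

    def matches(x):
        if x == "":
            return start_derives_empty
        return derives(start, x)

    return matches


# A CNF grammar for the strings 0^n 1^n with n >= 1 (an illustrative choice):
# S => A T | A B,  T => S B,  A => 0,  B => 1
matches = make_matcher(
    binary_rules={("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")},
    terminal_rules={("A", "0"), ("B", "1")},
    start="S",
)
print(matches("000111"), matches("0010"))
```

Since both parts of every split are strictly shorter than $x$, the recursion always terminates, which is exactly why the function is computable.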
While we focus on the task of deciding whether a CFG matches a string, the algorithm to compute $\Phi_{V,R,s}$ actually gives more information than that. That is, on input a string $x$, if $\Phi_{V,R,s}(x)=1$ then the algorithm yields the sequence of rules that one can apply from the starting variable $s$ to obtain the final string $x$. We can think of these rules as determining a tree with $s$ being the root vertex and the sinks (or leaves) corresponding to the substrings of $x$ that are obtained by the rules that do not have a variable in their second element. This tree is known as the parse tree of $x$, and often yields very useful information about the structure of $x$.
Often the first step in a compiler or interpreter for a programming language is a parser that transforms the source into the parse tree (a simplified version of which is known as the abstract syntax tree). There are also tools that can automatically convert a description of a context-free grammar into a parser algorithm that computes the parse tree of a given string. (Indeed, the above recursive algorithm can be used to achieve this, but there are much more efficient versions, especially for grammars that have particular forms, and programming language designers often try to ensure their languages have these more efficient grammars.)
The power of context free grammars
Context free grammars can capture every regular expression:
Let $e$ be a regular expression over $\{0,1\}$, then there is a CFG $(V,R,s)$ over $\{0,1\}$ such that $\Phi_{V,R,s}=\Phi_{e}$.
We prove the theorem by induction on the length of $e$. If $e$ is an expression of one bit length, then $e=0$ or $e=1$, in which case we leave it to the reader to verify that there is a (trivial) CFG that computes it. Otherwise, we fall into one of the following cases: case 1: $e = e'e''$, case 2: $e = e'|e''$ or case 3: $e=(e')^*$ where in all cases $e',e''$ are shorter regular expressions. By the induction hypothesis, we can define grammars $(V',R',s')$ and $(V'',R'',s'')$ that compute $\Phi_{e'}$ and $\Phi_{e''}$ respectively. By renaming variables, we can also assume without loss of generality that $V'$ and $V''$ are disjoint.
In case 1, we can define the new grammar as follows: we add a new starting variable $s \not\in V' \cup V''$ and the rule $s \Rightarrow s's''$. In case 2, we can define the new grammar as follows: we add a new starting variable $s \not\in V' \cup V''$ and the rules $s \Rightarrow s'$ and $s \Rightarrow s''$. Case 3 will be the only one that uses recursion. As before we add a new starting variable $s \not\in V'$, but now add the rule $s \Rightarrow ""$ (i.e., the empty string) and also add, for every rule of the form $(s',\alpha) \in R'$, the rule $s \Rightarrow s\alpha$ to $R$.
We leave it to the reader as (a very good!) exercise to verify that in all three cases the grammars we produce capture the same function as the original expression.
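The three cases of the induction can be turned into a short recursive construction. The sketch below makes several illustrative assumptions: regular expressions are nested tuples such as `("cat", e1, e2)`, `("or", e1, e2)`, `("star", e1)` and `("sym", "0")`; fresh variables are named `v0`, `v1`, ...; and the helper `strings_up_to`, which lists every string of length at most `n` generated by a grammar via a simple fixed-point iteration, exists only so the construction can be sanity-checked on small inputs.

```python
import itertools


def regex_to_cfg(e, fresh=None):
    """Return (rules, start) for a grammar computing the same function as e."""
    if fresh is None:
        fresh = itertools.count()  # shared counter keeps variable sets disjoint
    s = f"v{next(fresh)}"          # new starting variable
    kind = e[0]
    if kind == "sym":              # base case: a single alphabet symbol
        return [(s, (e[1],))], s
    if kind == "cat":              # case 1: s => s' s''
        r1, s1 = regex_to_cfg(e[1], fresh)
        r2, s2 = regex_to_cfg(e[2], fresh)
        return r1 + r2 + [(s, (s1, s2))], s
    if kind == "or":               # case 2: s => s' and s => s''
        r1, s1 = regex_to_cfg(e[1], fresh)
        r2, s2 = regex_to_cfg(e[2], fresh)
        return r1 + r2 + [(s, (s1,)), (s, (s2,))], s
    if kind == "star":             # case 3: s => "" and s => s alpha for each s' => alpha
        r1, s1 = regex_to_cfg(e[1], fresh)
        new = [(s, ())] + [(s, (s,) + rhs) for v, rhs in r1 if v == s1]
        return r1 + new, s
    raise ValueError(f"unknown regular expression node: {kind}")


def strings_up_to(rules, start, n):
    """All strings of length <= n generated from `start` (least fixed point)."""
    variables = {v for v, _ in rules}
    lang = {v: set() for v in variables}
    changed = True
    while changed:
        changed = False
        for v, rhs in rules:
            partial = {""}
            for sym in rhs:
                options = lang[sym] if sym in variables else {sym}
                partial = {p + q for p in partial for q in options if len(p + q) <= n}
            if not partial <= lang[v]:
                lang[v] |= partial
                changed = True
    return lang[start]


# The expression (01|1)* as a nested tuple.
e = ("star", ("or", ("cat", ("sym", "0"), ("sym", "1")), ("sym", "1")))
rules, start = regex_to_cfg(e)
print(sorted(strings_up_to(rules, start, 3)))
```

Comparing the output against a regular expression matcher on all short strings is a cheap way to catch mistakes in the construction, though of course it does not replace the inductive proof.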
It turns out that CFG’s are strictly more powerful than regular expressions. In particular, as we’ve seen, the “matching parentheses” function can be computed by a context free grammar, whereas, as shown in Lemma 6.20, it cannot be computed by regular expressions. Here is another example:
Let $PAL:\{0,1,;\}^* \rightarrow \{0,1\}$ be the function defined in Solved Exercise 6.4 where $PAL(w)=1$ iff $w$ has the form $u;u^R$. Then $PAL$ can be computed by a context-free grammar.
A simple grammar computing $PAL$ can be described using Backus–Naur notation:
start := ; | 0 start 0 | 1 start 1
One can prove by induction that this grammar generates exactly the strings $w$ such that $PAL(w)=1$.
A more interesting example is computing the strings of the form $u;v$ that are not palindromes:
Prove that there is a context free grammar that computes $NPAL:\{0,1,;\}^* \rightarrow \{0,1\}$ where $NPAL(w)=1$ if $w=u;v$ but $v \neq u^R$.
Using Backus–Naur notation we can describe such a grammar as follows:
palindrome := ; | 0 palindrome 0 | 1 palindrome 1
different := 0 palindrome 1 | 1 palindrome 0
start := different | 0 start | 1 start | start 0 | start 1
In words, this means that we can characterize a string $w$ such that $NPAL(w)=1$ as having the following form
$$w = \alpha b u ; u^R b' \beta$$
where $\alpha,\beta,u$ are arbitrary strings and $b \neq b'$. Hence we can generate such a string by first generating a palindrome $u;u^R$ (palindrome variable), then adding $0$ on either the left or right and $1$ on the opposite side to get something that is not a palindrome (different variable), and then we can add an arbitrary number of $0$’s and $1$’s on either end (the start variable).
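The three Backus–Naur rules above (for palindrome, different and start) can be mirrored one-to-one by three recursive Python predicates. This is only a sketch for experimenting with the grammar; the function names are taken from the variable names, and each function accepts exactly the strings that the corresponding variable generates.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def palindrome(w: str) -> bool:
    # palindrome := ; | 0 palindrome 0 | 1 palindrome 1
    if w == ";":
        return True
    return len(w) >= 3 and w[0] in "01" and w[0] == w[-1] and palindrome(w[1:-1])


def different(w: str) -> bool:
    # different := 0 palindrome 1 | 1 palindrome 0
    return (
        len(w) >= 3
        and w[0] in "01"
        and w[-1] in "01"
        and w[0] != w[-1]
        and palindrome(w[1:-1])
    )


@lru_cache(maxsize=None)
def start(w: str) -> bool:
    # start := different | 0 start | 1 start | start 0 | start 1
    if different(w):
        return True
    if w and w[0] in "01" and start(w[1:]):
        return True
    if w and w[-1] in "01" and start(w[:-1]):
        return True
    return False


print(start("01;11"))  # 01;11 contains the mismatched core 0 (1;1) 1
print(start("01;10"))  # 01;10 is of the form u;u^R
```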
Limitations of context-free grammars (optional)
Even though context-free grammars are more powerful than regular expressions, there are some simple languages that are not captured by context free grammars. One tool to show this is the context-free grammar analog of the “pumping lemma” (Theorem 6.21):
Let $(V,R,s)$ be a CFG over $\Sigma$, then there are some numbers $n_0,n_1 \in \mathbb{N}$ such that for every $x \in \Sigma^*$ with $|x|>n_0$, if $\Phi_{V,R,s}(x)=1$ then $x=abcde$ such that $|b|+|c|+|d| \leq n_1$, $|b|+|d| \geq 1$, and $\Phi_{V,R,s}(ab^kcd^ke)=1$ for every $k \in \mathbb{N}$.
The context-free pumping lemma is even more cumbersome to state than its regular analog, but you can remember it as saying the following: “If a long enough string is matched by a grammar, there must be a variable that is repeated in the derivation.”
We only sketch the proof. The idea is that if the total number of symbols in the rules of the grammar is $n_0$, then the only way to get $|x|>n_0$ with $\Phi_{V,R,s}(x)=1$ is to use recursion. That is, there must be some variable $v \in V$ such that we are able to derive from $v$ the value $bvd$ for some strings $b,d \in \Sigma^*$, and then further on derive from $v$ some string $c \in \Sigma^*$ such that $bcd$ is a substring of $x$ (in other words, $x=abcde$ for some $a,e \in \Sigma^*$). If we take the variable $v$ satisfying this requirement with a minimum number of derivation steps, then we can ensure that $|bcd|$ is at most some constant depending on $n_0$ and we can set $n_1$ to be that constant (a constant on the order of $|R| \cdot n_0$ will do, since we will not need more than $|R|$ applications of rules, and each such application can grow the string by at most $n_0$ symbols).
Thus by the definition of the grammar, we can repeat the derivation to replace the substring $bcd$ in $x$ with $b^kcd^k$ for every $k \in \mathbb{N}$ while retaining the property that the output of $\Phi_{V,R,s}$ is still one. Since $bcd$ is a substring of $x$, we can write $x=abcde$ and are guaranteed that $ab^kcd^ke$ is matched by the grammar for every $k$.
Using Theorem 10.8 one can show that even the following simple function is not context free:
Let $EQ:\{0,1,;\}^* \rightarrow \{0,1\}$ be the function such that $EQ(x)=1$ if and only if $x=u;u$ for some $u \in \{0,1\}^*$. Then $EQ$ is not context free.
We use the context-free pumping lemma. Suppose, towards a contradiction, that there is a grammar $G$ that computes $EQ$, let $n_0,n_1$ be the constants obtained from Theorem 10.8, and set $n = \max\{n_0,n_1\}$.
Consider the string $x = 1^{n}0^{n};1^{n}0^{n}$, and write it as $x=abcde$ as per Theorem 10.8, with $|bcd| \leq n$ and with $|b|+|d| \geq 1$. By Theorem 10.8 (applied with $k=0$), it should hold that $EQ(ace)=1$. However, by case analysis this can be shown to be a contradiction.
Firstly, unless $b$ is on the left side of the $;$ separator and $d$ is on the right side, dropping $b$ and $d$ will definitely make the two parts different. But if it is the case that $b$ is on the left side and $d$ is on the right side, then by the condition that $|bcd| \leq n$ we know that $b$ is a string of only zeros and $d$ is a string of only ones. If we drop $b$ and $d$ then since one of them is non-empty, we get that there are either fewer zeroes on the left side than on the right side, or there are fewer ones on the right side than on the left side. In either case, we get that $EQ(ace)=0$, obtaining the desired contradiction.
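The case analysis above can be sanity-checked by brute force for small parameters. The following Python sketch enumerates every way to write the string $1^n0^n;1^n0^n$ as $abcde$ with $|bcd| \leq n$ and $|b|+|d| \geq 1$, and looks for a split in which dropping $b$ and $d$ leaves a string of the form $u;u$. The function names `eq` and `find_pumpable_split` are hypothetical, and a finite check of this kind is of course only an illustration of the argument, not a substitute for it.

```python
def eq(x: str) -> bool:
    # EQ(x) = 1 iff x = u;u for some binary string u.
    parts = x.split(";")
    return len(parts) == 2 and parts[0] == parts[1] and set(parts[0]) <= {"0", "1"}


def find_pumpable_split(n: int):
    """Return a split (a, b, c, d, e) of 1^n 0^n ; 1^n 0^n with |bcd| <= n,
    |b| + |d| >= 1 and EQ(ace) = 1, or None if no such split exists."""
    x = "1" * n + "0" * n + ";" + "1" * n + "0" * n
    m = len(x)
    for i in range(m + 1):                          # a = x[:i]
        for j in range(i, m + 1):                   # b = x[i:j]
            for k in range(j, m + 1):               # c = x[j:k]
                for l in range(k, min(m, i + n) + 1):   # d = x[k:l], |bcd| = l - i <= n
                    if (j - i) + (l - k) >= 1 and eq(x[:i] + x[j:k] + x[l:]):
                        return x[:i], x[i:j], x[j:k], x[k:l], x[l:]
    return None


for n in range(1, 5):
    print(n, find_pumpable_split(n))
```

For every value of $n$ tried, no valid split survives the removal of $b$ and $d$, in line with the contradiction derived in the proof.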
Semantic properties of context free languages
As in the case of regular expressions, the limitations of context free grammars do provide some advantages. For example, emptiness of context free grammars is decidable:
There is an algorithm that on input a context-free grammar $G$, outputs $1$ if and only if $\Phi_G$ is the constant zero function.
The proof is easier to see if we transform the grammar to Chomsky Normal Form as in Theorem 10.5. Given a grammar $G$, we can recursively define a non-terminal variable to be non-empty if there is either a rule of the form $v \Rightarrow \sigma$, or there is a rule of the form $v \Rightarrow uw$ where both $u$ and $w$ are non-empty. Then the grammar is non-empty if and only if the starting variable $s$ is non-empty.
We assume that the grammar $G$ is in Chomsky Normal Form as in Theorem 10.5. We consider the following procedure for marking variables as “non-empty”:
- We start by marking all variables $v$ that are involved in a rule of the form $v \Rightarrow \sigma$ as non-empty (and we also mark the starting variable $s$ if the rule $s \Rightarrow ""$ is present).
- We then continue to mark $v$ as non-empty if it is involved in a rule of the form $v \Rightarrow uw$ where $u,w$ have been marked before.
We continue this way until we cannot mark any more variables. We then declare that the grammar is empty if and only if $s$ has not been marked. To see why this is a valid algorithm, note that if a variable $v$ has been marked as “non-empty” then there is some string $\alpha \in \Sigma^*$ that can be derived from $v$. On the other hand, if $v$ has not been marked, then every sequence of derivations from $v$ will always have a variable that has not been replaced by alphabet symbols. Hence in particular $\Phi_G$ is the all zero function if and only if the starting variable $s$ is not marked “non-empty”.
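The marking procedure is short enough to write out in full. In this Python sketch the grammar representation is an assumption made for illustration: rules $v \Rightarrow uw$ are triples `(v, u, w)`, rules $v \Rightarrow \sigma$ are pairs `(v, sigma)`, and a flag records whether the rule $s \Rightarrow ""$ is present. The function name `is_empty` is hypothetical.

```python
def is_empty(binary_rules, terminal_rules, start, start_derives_empty=False):
    """Return True iff the CNF grammar matches no string at all."""
    # Step 1: mark every variable that directly derives an alphabet symbol.
    marked = {v for v, _sigma in terminal_rules}
    if start_derives_empty:
        marked.add(start)
    # Step 2: repeatedly mark v whenever v => u w with u and w already marked.
    changed = True
    while changed:
        changed = False
        for v, u, w in binary_rules:
            if v not in marked and u in marked and w in marked:
                marked.add(v)
                changed = True
    # The grammar is empty iff the starting variable was never marked.
    return start not in marked


# S => A T | A B, T => S B, A => 0, B => 1 generates 0^n 1^n for n >= 1.
print(is_empty({("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")},
               {("A", "0"), ("B", "1")}, "S"))
# S => S S, A => 0: the starting variable can never get rid of its variables.
print(is_empty({("S", "S", "S")}, {("A", "0")}, "S"))
```

Each pass of the loop either marks a new variable or stops, so the procedure halts after at most $|V|$ passes.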
Uncomputability of context-free grammar equivalence (optional)
By analogy to regular expressions, one might have hoped to get an algorithm for deciding whether two given context free grammars are equivalent. Alas, no such luck. It turns out that the equivalence problem for context free grammars is uncomputable. This is a direct corollary of the following theorem:
For every set $\Sigma$, let $CFGFULL_\Sigma$ be the function that on input a context-free grammar $G$ over $\Sigma$, outputs $1$ if and only if $G$ computes the constant $1$ function. Then there is some finite $\Sigma$ such that $CFGFULL_\Sigma$ is uncomputable.
Theorem 10.10 immediately implies that equivalence for context-free grammars is uncomputable, since computing “fullness” of a grammar $G$ over some alphabet $\Sigma = \{\sigma_0,\ldots,\sigma_{k-1}\}$ corresponds to checking whether $G$ is equivalent to the grammar $s \Rightarrow ""|s\sigma_0|\cdots|s\sigma_{k-1}$. Note that Theorem 10.10 and Theorem 10.9 together imply that context-free grammars, unlike regular expressions, are not closed under complement. (Can you see why?) Since we can encode every element of $\Sigma$ using $\lceil \log |\Sigma| \rceil$ bits (and this finite encoding can be easily carried out within a grammar) Theorem 10.10 implies that fullness is also uncomputable for grammars over the binary alphabet.
We prove the theorem by reducing from the Halting problem. To do that we use the notion of configurations of Turing machines, as defined in Definition 8.8. Recall that a configuration of a machine $M$ is a binary string $\sigma$ that encodes all the information about the machine in the current iteration.
We define $\Sigma$ to be $\{0,1\}$ plus some separator characters and define $INVALID_M:\Sigma^* \rightarrow \{0,1\}$ to be the function that maps every string $L \in \Sigma^*$ to $1$ if and only if $L$ does not encode a sequence of configurations that correspond to a valid halting history of the computation of $M$ on the input $0$.
The heart of the proof is to show that $INVALID_M$ is context-free. Once we do that, we see that $M$ does not halt on the input $0$ if and only if $INVALID_M(L)=1$ for every $L$. To show that, we will encode the list in a special way that makes it amenable to deciding via a context-free grammar. Specifically we will reverse all the odd-numbered strings.
We only sketch the proof. We will show that if we can compute $CFGFULL_\Sigma$ then we can solve $HALTONZERO$, which has been proven uncomputable in Theorem 9.9. Let $M$ be an input Turing machine for $HALTONZERO$. We will use the notion of configurations of a Turing machine, as defined in Definition 8.8.
Recall that a configuration of Turing machine $M$ and input $x$ captures the full state of $M$ at some point of the computation. The particular details of configurations are not so important, but what you need to remember is that:
- A configuration can be encoded by a binary string $\sigma \in \{0,1\}^*$.
- The initial configuration of $M$ on the input $0$ is some fixed string.
- A halting configuration will have the value of a certain state (which can be easily “read off” from it) set to $1$.
- If $\sigma$ is a configuration at some step $i$ of the computation, we denote by $NEXT_M(\sigma)$ the configuration at the next step. $NEXT_M(\sigma)$ is a string that agrees with $\sigma$ on all but a constant number of coordinates (those encoding the position corresponding to the head position and the two adjacent ones). On those coordinates, the value of $NEXT_M(\sigma)$ can be computed by some finite function.
We will let the alphabet be $\Sigma = \{0,1\} \cup \{\|, \#\}$. A computation history of $M$ on the input $0$ is a string $L \in \Sigma^*$ that corresponds to a list $\|\sigma_0\#\sigma_1\|\sigma_2\#\sigma_3\cdots$ (i.e., $\|$ comes before an even numbered block, and $\#$ comes before an odd numbered one) such that if $i$ is even then $\sigma_i$ is the string encoding the configuration of $M$ on input $0$ at the beginning of its $i$-th iteration, and if $i$ is odd then it is the same except the string is reversed. (That is, for odd $i$, $rev(\sigma_i)$ encodes the configuration of $M$ on input $0$ at the beginning of its $i$-th iteration.) Reversing the odd-numbered blocks is a technical trick to ensure that the function $INVALID_M$ we define below is context free.
We now define $INVALID_M:\Sigma^* \rightarrow \{0,1\}$ as follows:
$$INVALID_M(L) = \begin{cases} 0 & L \text{ is a valid computation history of } M \text{ on } 0 \\ 1 & \text{otherwise} \end{cases}$$
We will show the following claim:
CLAIM: $INVALID_M$ is context-free.
The claim implies the theorem. Since $M$ halts on $0$ if and only if there exists a valid computation history, $INVALID_M$ is the constant one function if and only if $M$ does not halt on $0$. In particular, this allows us to reduce determining whether $M$ halts on $0$ to determining whether the grammar $G_M$ corresponding to $INVALID_M$ is full.
We now turn to the proof of the claim. We will not show all the details, but the main point is that $INVALID_M(L)=1$ if and only if at least one of the following three conditions holds:
1. $L$ is not of the right format, i.e. not of the form $\|\sigma_0\#\sigma_1\|\sigma_2\#\cdots$ where every $\sigma_i$ is a binary string.
2. $L$ contains a substring of the form $\|\sigma\#\sigma'\|$ such that $\sigma' \neq rev(NEXT_M(\sigma))$.
3. $L$ contains a substring of the form $\#\sigma\|\sigma'\#$ such that $\sigma' \neq NEXT_M(rev(\sigma))$.
Since context-free functions are closed under the OR operation, the claim will follow if we show that we can verify conditions 1, 2 and 3 via a context-free grammar.
For condition 1 this is very simple: checking that $L$ is of the correct format can be done using a regular expression. Since regular expressions are closed under negation, this means that checking that $L$ is not of this format can also be done by a regular expression and hence by a context-free grammar.
For conditions 2 and 3, this follows via very similar reasoning to that showing that the function $F$ such that $F(u\#v)=1$ iff $u \neq rev(v)$ is context-free, see Solved Exercise 10.2. After all, the $NEXT_M$ function only modifies its input in a constant number of places. We leave filling out the details as an exercise to the reader. Since $INVALID_M(L)=1$ if and only if $L$ satisfies one of the conditions 1., 2. or 3., and all three conditions can be tested for via a context-free grammar, this completes the proof of the claim and hence the theorem.
Summary of semantic properties for regular expressions and context-free grammars
To summarize, we can often trade expressiveness of the model for amenability to analysis. If we consider computational models that are not Turing complete, then we are sometimes able to bypass Rice’s Theorem and answer certain semantic questions about programs in such models. Here is a summary of some of what is known about semantic questions for the different models we have seen.
| Model | Halting | Emptiness | Equivalence |
|---|---|---|---|
| Regular expressions | Computable | Computable | Computable |
| Context free grammars | Computable | Computable | Uncomputable |
| Turing-complete models | Uncomputable | Uncomputable | Uncomputable |
- The uncomputability of the Halting problem for general models motivates the definition of restricted computational models.
- In some restricted models we can answer semantic questions such as: does a given program terminate, or do two programs compute the same function?
- Regular expressions are a restricted model of computation that is often useful to capture tasks of string matching. We can test efficiently whether an expression matches a string, as well as answer questions such as Halting and Equivalence.
- Context free grammars are a stronger, yet still not Turing complete, model of computation. The halting problem for context free grammars is computable, but equivalence is not computable.
Exercises
Suppose that $F,G:\{0,1\}^* \rightarrow \{0,1\}$ are context free. For each one of the following definitions of the function $H$, either prove that $H$ is always context free or give a counterexample for regular $F,G$ that would make $H$ not context free.
.
.
$H(x) = F(x^R)$ where $x^R$ is the reverse of $x$: $x^R = x_{n-1}x_{n-2}\cdots x_0$ for $n=|x|$.
Prove that the function $F:\{0,1\}^* \rightarrow \{0,1\}$ such that $F(x)=1$ if and only if $|x|$ is a power of two is not context free.
Consider the following syntax of a “programming language” whose source can be written using the ASCII character set:
Variables are obtained by a sequence of letters, numbers and underscores, but can’t start with a number.
A statement has either the form foo = bar; where foo and bar are variables, or the form IF (foo) BEGIN ... END where ... is a list of one or more statements, potentially separated by newlines.
A program in our language is simply a sequence of statements (possibly separated by newlines or spaces).
Let $VAR:\{0,1\}^* \rightarrow \{0,1\}$ be the function that given a string $x \in \{0,1\}^*$, outputs $1$ if and only if $x$ corresponds to an ASCII encoding of a valid variable identifier. Prove that $VAR$ is regular.
Let $SYN:\{0,1\}^* \rightarrow \{0,1\}$ be the function that given a string $x \in \{0,1\}^*$, outputs $1$ if and only if $x$ is an ASCII encoding of a valid program in our language. Prove that $SYN$ is context free. (You do not have to specify the full formal grammar for $SYN$, but you need to show that such a grammar exists.)
Prove that $SYN$ is not regular. See footnote for hint.
Bibliographical notes
As in the case of regular expressions, there are many resources available that cover context-free grammar in great detail. Chapter 2 of (Sipser, 1997) contains many examples of context-free grammars and their properties. There are also websites such as Grammophone where you can input grammars, and see what strings they generate, as well as some of the properties that they satisfy.
The adjective “context free” is used for CFG’s because a rule of the form $v \Rightarrow a$ means that we can always replace $v$ with the string $a$, no matter what is the context in which $v$ appears. More generally, we might want to consider cases where the replacement rules depend on the context. This gives rise to the notion of general (aka “Type 0”) grammars that allow rules of the form $a \Rightarrow b$ where both $a$ and $b$ are strings in $(V \cup \Sigma)^*$. The idea is that if, for example, we wanted to enforce the condition that we only apply some rule such as $v \Rightarrow 0w1$ when $v$ is surrounded by three zeroes on both sides, then we could do so by adding a rule of the form $000v000 \Rightarrow 0000w1000$ (and of course we can add much more general conditions). Alas, this generality comes at a cost: general grammars are Turing complete and hence their halting problem is uncomputable. That is, there is no algorithm $A$ that can determine for every general grammar $G$ and a string $x$, whether or not the grammar $G$ generates $x$.
The Chomsky Hierarchy is a hierarchy of grammars from the least restrictive (most powerful) Type 0 grammars, which correspond to recursively enumerable languages (see Exercise 9.10) to the most restrictive Type 3 grammars, which correspond to regular languages. Context-free languages correspond to Type 2 grammars. Type 1 grammars are context sensitive grammars. These are more powerful than context-free grammars but still less powerful than Turing machines. In particular functions/languages corresponding to context-sensitive grammars are always computable, and in fact can be computed by linear bounded automata, which are non-deterministic algorithms that take $O(n)$ space. For this reason, the class of functions/languages corresponding to context-sensitive grammars is also known as the complexity class $\mathbf{NSPACE}(O(n))$ (we discuss space-bounded complexity in Chapter 17). While Rice’s Theorem implies that we cannot compute any non-trivial semantic property of Type 0 grammars, the situation is more complex for other types of grammars: some semantic properties can be determined and some cannot, depending on the grammar’s place in the hierarchy.
As in the case of Definition 6.7 we can also use language rather than function notation and say that a language $L \subseteq \Sigma^*$ is context free if the function $F$ such that $F(x)=1$ iff $x \in L$ is context free.
Try to see if you can “embed” in some way a function that looks similar to $MATCHPAREN$ in $SYN$, so you can use a similar proof. Of course for a function to be non-regular, it does not need to utilize literal parentheses symbols.
Compiled on 12/06/2023 00:07:33
Copyright 2023, Boaz Barak.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Produced using pandoc and panflute with templates derived from gitbook and bookdown.