11 Lecture 10 – Introduction to Compiler Correctness
In the previous class, and in the notes posted at share/lecture9-closure-conversion-code.rkt, we saw the closure conversion translation. This is a standard compiler translation used to transform higher-order functions into a pair of a closed function, representing a function pointer, and a data structure representing all the free variables of the function. This is a relatively simple translation, so we use it as our example translation to ask: how do we prove compiler correctness?
Before we can prove compiler correctness, we need to say what it means for a compiler to be correct.
11.1 Closure Conversion Review
Below, I partially define two languages and the closure conversion translation between them.
The source language:

(naturals)  n
(variables) x

e ::= λx.e | e e | n | x | e + e
o ::= n
v ::= n | λx.e

[ e -> e ]

-----------------------
(λx.e₁) e₂ -> e₁[e₂/x]

n₃ = n₁ [[+]] n₂
----------------
n₁ + n₂ -> n₃

[ e ->* e ]

-------
e ->* e

e₁ -> e₂
---------
e₁ ->* e₂

e₁ ->* e₂
-------------------
(λx.e₁) ->* (λx.e₂)

[ eval(e) = o ]

e ->* o
-----------
eval(e) = o
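For example, using these rules (together with the elided transitivity rule for ->*):

(λx. x + 1) 2 -> 2 + 1 -> 3

so eval((λx. x + 1) 2) = 3.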
The target language:

(naturals)  n
(variables) x, y

e ::= λ(x,y).e | e e e | n | x | e + e | <e₁, ..., e_n> | prjᵢ e
    | let x = e in e
o ::= n | <o, ..., o>
v ::= n | λ(x,y).e | <v, ..., v>

[ e -> e ]

-----------------------------------
(λ(x,y).e₁) e₂ e₃ -> e₁[e₂/x][e₃/y]

n₃ = n₁ [[+]] n₂
----------------
n₁ + n₂ -> n₃

----------------------------
let x = e₁ in e₂ -> e₂[e₁/x]

[ eval(e) = o ]

e ->* o
-----------
eval(e) = o
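For example, a target-language application passes the environment as an explicit extra argument. Using the elided congruence and projection rules (prjᵢ <v₁, ..., v_n> -> vᵢ):

(λ(x,y). x + prj₁ y) 1 <2> -> 1 + prj₁ <2> ->* 1 + 2 -> 3

so eval((λ(x,y). x + prj₁ y) 1 <2>) = 3.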
In the target language, we want every function to have no free variables, i.e., to be closed. This way, each function can be compiled directly to a labeled block of assembly.
If you have no functional programming background, it might be counter-intuitive that functions can have free variables, so imagine the method of an object that implicitly references fields from the object. For example, in Java, you can refer to a field f implicitly, or explicitly as this.f. The field name can be treated as a free variable in the method. A higher-order function is essentially an object with a single method, apply, whose fields are any free variables in scope when the function is created. Closure conversion translates a function into such an explicit object, where the environment parameter can be thought of as the this parameter.
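Concretely, here is a small sketch of this idea in Racket, in the spirit of the lecture9 code (the names below are mine, not the lecture's):

;; A higher-order function whose body mentions the free variable z:
(define z 5)
(define add-z (λ (x) (+ x z)))
(add-z 1)                                    ; => 6

;; After closure conversion: the code is closed, receiving its former
;; free variables through an explicit environment parameter, and the
;; closure pairs that closed code with an environment tuple.
(define add-z-code
  (λ (x env)
    (let ([z (vector-ref env 0)])            ; project z out of the environment
      (+ x z))))
(define add-z-closure (cons add-z-code (vector z)))

;; Application projects out the code and passes along the environment:
((car add-z-closure) 1 (cdr add-z-closure))  ; => 6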
[ 〚e〛 = e ]

〚x〛 = x

〚n〛 = n

〚λx.e〛 = <(λ(x,y).
             let x₁ = prj₁ y
                 ...
                 x_n = prj_n y in
             〚e〛),
           <x₁, ..., x_n>>
  where (x₁, ..., x_n) = free-variables(λx.e)

〚e₁ e₂〛 = let p = 〚e₁〛 in (prj₁ p) 〚e₂〛 (prj₂ p)
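For example, for a function with the single free variable z, the translation produces:

〚λx. x + z〛 = <(λ(x,y). let z = prj₁ y in x + z), <z>>

since free-variables(λx. x + z) = (z) and 〚x + z〛 = x + z. Note that z remains free in the environment tuple <z>: the translation preserves the free variables of a term.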
11.2 What even is compiler correctness?
Intuitively, what we want from a "correct" compiler is that a program in the source language means the same thing as the compiled program in the target language. To make this formal, we need formal definitions of "program" and "means the same thing".
Thankfully, all we’ve been doing all semester long is getting really formal about both of those things. We formalized programs as expressions for which evaluation is defined, and we interpreted the meaning of those expressions as evaluation to an observation.
11.2.1 Whole-Program Correctness
This leads us to our first definition of compiler correctness: (Whole-Program) Correctness Theorem: If eval(e) = o₁ then eval(〚e〛) = o₂ and o₁ ~ o₂
This requires that we define a relation between source observations and target observations. In this language, our only observations are natural numbers in decimal notation, so we can define the following simple relation.
[ o ~ o ]

-----
n ~ n
Two observations are related if they are literally the same.
Because of its simplicity, we can restate the theorem a bit more tersely: Theorem: eval(e) ~ eval(〚e〛)
This theorem is what is usually meant when people use the phrase "correctness": a whole program evaluates to a related result after compilation.
11.2.2 Correctness of Separate Compilation
Unfortunately, the above theorem gives us very few guarantees. For example, we get no guarantees if we want to separately compile two components and then link them. It only gives us guarantees about whole programs that evaluate in the source language.
I don’t know anyone who writes whole programs very often. Usually, we write modules or components that link with other components. We often compile these separately, to simplify distributing software or to speed up compilation. The minimum a realistic correct compiler must support is separate compilation.
Recall from lecture that we can formalize linking as substitution, and define the following notions of valid program and program component. For these definitions, I’ll assume our languages have a type system. This makes the definitions easier, although any implementation that obeys the following properties will work with the compiler correctness theorems. Recall that type systems are a simple and common way to make predictions about programs, such as what kind of observation they will return.
[ ⊢ e ] (valid whole program, suitable for evaluation)

· ⊢ e : Nat
-----------
⊢ e

[ Γ ⊢ e ] (valid program component, suitable for evaluation after linking)

Γ ⊢ e : Nat
-----------
Γ ⊢ e

[ Γ ⊢ γ ] (valid closing substitution, representing a set of modules to link with)

-----
· ⊢ ·

Γ ⊢ γ
· ⊢ v : A
-----------------
Γ,x:A ⊢ γ[x -> v]

[ γ(e) = e ] (linking)

--------
·(e) = e

γ(e[v/x]) = e₃
------------------
γ[x -> v](e) = e₃
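For example, linking the open component x + 3 with a single module that provides x:

·[x -> 5](x + 3) = ·((x + 3)[5/x]) = 5 + 3

and eval(5 + 3) = 8.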
We desire the following properties from these judgments.
Type Safety Theorem: If ⊢ e then eval(e) = o
Linking Theorem: If Γ ⊢ e and Γ ⊢ γ then ⊢ γ(e)
We model linking as iterated substitution, replacing every free variable x by the value it is mapped to in a closing substitution γ.
Type safety tells us we get a valid observation from any valid program. Linking tells us that if we have a valid program component, and a compatible set of modules to link with, then linking results in a valid program.
So what would correctness of separate compilation be? Intuitively, we should be able to either link in the source language, or separate compile all components and then link in the target language, and get related results.
Correctness of Separate Compilation Theorem: If Γ ⊢ e and Γ ⊢ γ then eval(γ(e)) ~ eval(〚γ〛(〚e〛))
This requires that we lift the compiler from expressions to closing substitutions. This is not hard, though, and usually a research paper will leave out this step. Since a closing substitution is just a map containing expressions, we can lift the compiler by looping over the map and applying the expression translation.
〚·〛 = ·
〚γ[x -> v]〛 = 〚γ〛[〚x〛 -> 〚v〛]
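In code, this lifting is a short loop. Here is a minimal sketch in Racket, assuming the closing substitution is represented as a hash from variable names to expressions and that compile-expr implements 〚·〛 (both the representation and the names are my assumptions, not fixed by the lecture):

;; Lift the expression translation 〚·〛 to closing substitutions by
;; translating each expression in the map. Assumes compile-expr
;; implements the expression translation from this section.
(define (compile-subst γ)
  (for/hash ([(x v) (in-hash γ)])
    (values x (compile-expr v))))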
Note that nothing about this translation is specific to the languages or the translation, so it works for any language with the same notion of linking.
11.2.3 Compositional Compiler Correctness
Unfortunately, the above theorem still does not give us as many guarantees as a programmer might expect. For example, I can compile a C program with icc or gcc or clang, and I expect to be able to link the results. I can even link with hand-written assembly that follows the C calling convention. The above theorem does not support any guarantees in this setting; it requires that we compile all components from the source language with the same compiler.
To support such a theorem, we need to define when closing substitutions, γ_b and γ_r, are related across languages. This relation states when some target language components behave like something in the source language, regardless of whether the target language components were compiled or hand-written. This relation must be independent of the compiler. Otherwise, we only get separate compilation.
The techniques for defining this relation are complex, in large part because we want to allow functions as inputs, which forces the relation to quantify over all arguments a function might receive.
The typical way we would define this relation syntactically is to first define a relation on closed values. The relation should be type directed, otherwise it is difficult to ensure it is inductively well defined. Since we have two languages, each expression has two types. We index the relation by the source type.
[ v ~ v : A ]

for all v₁, v₂ such that v₁ ~ v₂ : A,
e₁[v₁/x] ~ e₂[v₂/x][e₃/y] : B
-----------------------------------
λx.e₁ ~ <(λ(x,y). e₂), e₃> : A -> B
This relation says that a source function is related to a pair of a function and an environment, at type A -> B, if the bodies e₁ and e₂ are related at type B after replacing the parameters by some related inputs, for all related inputs of type A. Note that in the closure-converted function we also substitute the environment e₃ into e₂.
Since this relation refers to expressions and not just values, we must also define when two arbitrary expressions are related. This is also difficult to do in a way that is inductively well defined. We want to simply say two expressions are related if they reduce to related values. However, this doesn’t work directly if the languages may not terminate, or may contain run-time errors.
These relations are called logical relations, and there are many techniques for defining useful logical relations.
There are other ways to define a relation across languages, some of which we’ll read about later in this class.
After we have such a relation, we can define compositional compiler correctness.
Theorem: If Γ ⊢ e and Γ ⊢ γ_b ~ γ_r then eval(γ_b(e)) ~ eval(γ_r(〚e〛))
Note that we only compile e, but link with γ_r, some components that already exist in the target language. If the relation Γ ⊢ γ_b ~ γ_r holds only when γ_r is the translation of γ_b, then this statement is exactly separate compilation. If the relation is independent of the compiler, then we get guarantees when linking with arbitrary target language code, whether it was compiled or hand-written.
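For example, under the value relation sketched above, the hand-written closure <(λ(x,y). x + prj₁ y), <1>> is related to the source function λx. x + 1 at type Nat -> Nat: for any related arguments n ~ n : Nat, both bodies evaluate to n + 1. Yet the compiler never produces this closure; since λx. x + 1 has no free variables, it compiles to <(λ(x,y). x + 1), <>> with an empty environment.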
11.3 Proving Compiler Correctness
To prove correctness, we use essentially the same techniques we’ve seen all semester: proof by induction, and building derivations. I demonstrate part of the proof of whole-program correctness. If we design the compiler and proof architecture well, correctness of separate compilation falls out for free.
Theorem: eval(e) ~ eval(〚e〛)
Before we start the proof, we should observe that eval is defined in terms of another judgment ->*, which is defined in terms of ->. It’s therefore a good idea to state lemmas that both of these judgments are preserved by compilation.
Conversion Preservation Lemma: If e_1 ->* e_2 then 〚e_1〛 ->* 〚e_2〛.
Reduction Preservation Lemma: If e_1 -> e_2 then 〚e_1〛 ->* 〚e_2〛.
This second lemma is somewhat counter-intuitive. It states that if e₁ reduces to e₂ in one step, then the translation of e₁ converts, in zero or more steps, to the translation of e₂. This is important to allow the compiler to optimize away simple steps (translating 1 step to 0 steps), or implement complex steps as many (translating 1 step to many steps).
Note that since our theorem only requires that we show the translated term evaluates, we only need a conclusion that refers to ->* (since evaluation is defined by ->*). We don’t need to know anything about reduction directly in the target language.
The proofs of the main theorem and the second lemma are straightforward:
Theorem: eval(e) ~ eval(〚e〛)
By "induction" on eval(e). There is one case:
e ->* o
-----------
eval(e) = o
We must show that eval(〚e〛) = o′ for some o′ such that o ~ o′.
We know that e ->* o, and therefore by Conversion Preservation, we know that 〚e〛 ->* 〚o〛, so eval(〚e〛) = 〚o〛. Since our only observations are natural numbers, and the compiler translates a natural number to the same natural number, we know that o ~ 〚o〛. QED.
Note that while this proof is technically by induction, the judgment has no inductive sub-derivations, so we can never use the induction hypothesis. In a paper, we would just call this proof by cases.
Conversion Preservation Lemma: If e_1 ->* e_2 then 〚e_1〛 ->* 〚e_2〛.
By induction on e_1 ->* e_2. There is one interesting case; the other cases follow easily from the induction hypothesis.
Case:
e_1 -> e_2
-----------
e_1 ->* e_2
We must show that 〚e_1〛->* 〚e_2〛. We know e_1 -> e_2, so by Reduction Preservation, we know 〚e_1〛->* 〚e_2〛.
Case:
e_1 ->* e_2
----------- [Fun-Compat]
λx.e_1 ->* λx.e_2
We must prove that 〚λx.e_1〛 ->* 〚λx.e_2〛.
By the induction hypothesis applied to e_1 ->* e_2, we know that 〚e_1〛 ->* 〚e_2〛.
The translation of λx.e_1 is:
<(λ(x,y).
let x₁ = prj₁ y
.
.
.
x_n = prj_n y in
〚e_1〛),
<x₁, ..., x_n>>
By the compatibility rules for lists, functions, and let in the target language, it’s simple to construct the derivation:
〚e_1〛 ->* 〚e_2〛
----------------------------------------------------------------- [Fun-Compat, n uses of Let-Compat]
(λ(x,y). let x₁ = prj₁ y ... x_n = prj_n y in 〚e_1〛)
  ->* (λ(x,y). let x₁ = prj₁ y ... x_n = prj_n y in 〚e_2〛)
----------------------------------------------------------------- [List-Compat]
<(λ(x,y). let x₁ = prj₁ y ... x_n = prj_n y in 〚e_1〛), <x₁, ..., x_n>>
  ->* <(λ(x,y). let x₁ = prj₁ y ... x_n = prj_n y in 〚e_2〛), <x₁, ..., x_n>>
Actually, this isn’t quite true, since e_1 and e_2 might have different sets of free variables. Instead, you need to modify all the theorems slightly to define equivalence between terms with different, but equivalent, environments. We’ll do this shortly.
The real work is in the proof of reduction preservation. In closure conversion, we translate one step of β reduction into many steps.
Lemma: If e_1 -> e_2 then 〚e_1〛 ->* 〚e_2〛.
By "induction" on e_1 -> e_2.
Case:
----------------------
(λx.e₁) e₂ -> e₁[e₂/x]
We must show 〚(λx.e₁) e₂〛 ->* 〚e₁[e₂/x]〛.
By the definition of the translation, we know that 〚(λx.e₁) e₂〛 =
let p = <(λ(x,y).
           let x₁ = prj₁ y
               ...
               x_n = prj_n y in
           〚e_1〛),
         <x₁, ..., x_n>>
in (prj₁ p) 〚e_2〛 (prj₂ p)
Note that this term converts (->*) (in 3 steps) to:
(λ(x,y).
  let x₁ = prj₁ y
      ...
      x_n = prj_n y in
  〚e_1〛) 〚e_2〛 <x₁, ..., x_n>
This is an application form that reduces (->) to:
let x₁ = prj₁ <x₁, ..., x_n>
    ...
    x_n = prj_n <x₁, ..., x_n>
in 〚e_1〛[〚e_2〛/x]
This converts (->*), using n reductions to reduce projections and n reductions of lets, to:
〚e_1〛[〚e_2〛/x][x_1/x_1]...[x_n/x_n]
Each of the substitutions [x_1/x_1]...[x_n/x_n] is useless, replacing a name by the same name, so this is equal (syntactically) to:
〚e_1〛[〚e_2〛/x]
Note that this is very similar to our goal: 〚e₁[e₂/x]〛. We could complete the proof if we knew that 〚e_1〛[〚e_2〛/x] ->* 〚e₁[e₂/x]〛. This is a new property about a different judgment, substitution, and looks like a job for a lemma.
Actually, we state a more general lemma:
Compositionality Lemma: 〚e_1〛[〚e_2〛/x] ≡ 〚e₁[e₂/x]〛
This states that substituting and then compiling is equivalent to compiling and then substituting. We can define program equivalence as:
[ e ≡ e ]

e₁ ->* e
e₂ ->* e
--------
e₁ ≡ e₂
Since compositionality is stated modulo ≡, we need to change all the previous theorem conclusions to be modulo program equivalence.
We also may need additional equivalence rules, depending on the translation. For example, we may want to consider two closures equivalent even when they have different environments, if substituting the environments into the bodies results in equivalent programs:
e₁[e₂/y] ≡ e₃[e₄/y]
-------------------------------------
<(λ(x,y).e₁), e₂> ≡ <(λ(x,y).e₃), e₄>
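For example, compiling after substituting and substituting after compiling can produce closures with different environments:

〚(λz. x)[1/x]〛 = 〚λz. 1〛 = <(λ(z,y). 1), <>>
〚λz. x〛[〚1〛/x] = <(λ(z,y). let x = prj₁ y in x), <1>>

These two closures are not syntactically equal, but the rule above relates them: substituting each environment for y, the bodies 1 and let x = prj₁ <1> in x both convert to 1.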
If linking is defined as substitution, then compositionality and whole-program correctness implies separate compilation.
Correctness of Separate Compilation Theorem: If Γ ⊢ e and Γ ⊢ γ then eval(γ(e)) ~ eval(〚γ〛(〚e〛))
By compositionality, 〚γ〛(〚e〛) ≡ 〚γ(e)〛. Therefore, eval(〚γ〛(〚e〛)) ≡ eval(〚γ(e)〛). By whole program correctness, we know that eval(γ(e)) ~ eval(〚γ(e)〛).
Note that for natural numbers, o₁ ≡ o₂ if and only if o₁ ~ o₂.
So eval(γ(e)) = o₁ for some o₁, with o₁ ~ eval(〚γ(e)〛); since eval(〚γ〛(〚e〛)) ≡ eval(〚γ(e)〛), we also have o₁ ~ eval(〚γ〛(〚e〛)); therefore eval(γ(e)) ~ eval(〚γ〛(〚e〛)). QED.