Publications
Papers (Chronological)
A Low-Level Look at A-normal Form.
William J. Bowman
Proc. of the ACM on Programming Languages (PACMPL). OOPSLA. 2024.
A-normal form (ANF) is a widely studied intermediate form in which local control and data flow is made explicit in syntax, and a normal form in which many programs with equivalent control-flow graphs have a single normal syntactic representation.
However, ANF is difficult to implement effectively and, as we formalize, difficult to extend with new lexically scoped constructs such as scoped region-based allocation.
The problem, as has often been observed, is that normalization of commuting conversions is hard.
This traditional view of ANF that normalizing commuting conversions is hard, found in formal models and informed by high-level calculi, is wrong.
By studying the low-level intensional aspects of ANF, we can derive a normal form in which normalizing commuting conversion is easy, does not require join points, or code duplication, or renormalization after inlining, and is easily extended with new lexically scoped effects.
We formalize the connection between ANF and monadic form and their intensional properties, derive an imperative ANF, and design a compiler pipeline from an untyped lambda-calculus with scoped regions, to monadic form, to a low-level imperative monadic form in which A-normalization is trivial and safe for regions.
We prove that any such compiler preserves, or optimizes, stack and memory behaviour compared to ANF.
Our formalization reconstructs and systematizes pragmatic choices found in practice, including current production-ready compilers.
The main take-away from this work is that, in general, monadic form should be preferred over ANF, and A-normalization should only be done in a low-level imperative intermediate form.
This maximizes the advantages of each form, and avoids all the standard problems with ANF.Abstract |
Preprint |
Artifact
Type Universes as Allocation Effects.
Paulette Koronkevich and William J. Bowman
2024.
In this paper, we explore a connection between type universes and memory allocation. Type universe hierarchies are used in dependent type theories to ensure consistency, by forbidding a type from quantifying over all types. Instead, the types of types (universes) form a hierarchy, and a type can only quantify over types in other universes (with some exceptions), restricting cyclic reasoning in proofs. We present a perspective where universes also describe where values are allocated in the heap, and the choice of universe algebra imposes a structure on the heap overall. The resulting type system provides a simple declarative system for reasoning about and restricting memory allocation, without reasoning about reads or writes. We present a theoretical framework for equipping a type system with higher-order references restricted by a universe hierarchy, and conjecture that many existing universe algebras give rise to interesting systems for reasoning about allocation. We present 3 instantiations of this approach to enable reasoning about allocation in the simply typed λ-calculus: (1) the standard ramified universe hierarchy, which we prove guarantees termination of the language extended with higher-order references by restricting cycles in the heap; (2) an extension with an impredicative base universe, which we conjecture enables full-ground references (with terminating computation but cyclic ground data structures); (3) an extension with universe polymorphism, which divides the heap into fine-grained regions.Abstract |
arXiv
Indexed Types for a Statically Safe WebAssembly.
Adam. T. Geller and Justin P. Frank and William J. Bowman
Proc. of the ACM on Programming Languages (PACMPL).. POPL. 2024.
We present Wasm-precheck, a superset of WebAssembly (Wasm) that uses indexed types to express and check simple constraints over program values. This additional static reasoning enables safely removing dynamic safety checks required by Wasm, such as memory bounds checks. We implement Wasm-precheck as an extension of the Wasmtime compiler and runtime, evaluate the run-time and compile-time performance of Wasm-precheck vs Wasm, and find an average run-time performance gain of 1.71x faster in the widely used PolyBenchC benchmark suite, for a small overhead in binary size (7.18% larger) and type-checking time (1.4% slower). We also prove type and memory safety of Wasm-precheck, prove Wasm safely embeds into Wasm-precheck ensuring backwards compatibility, prove Wasm-precheck type-erases to Wasm, and discuss design and implementation trade-offs.Abstract |
Paper (preprint) |
Open Access DOI |
Artifact DOI
One Weird Trick to Untie Landin’s
Knot
Paulette Koronkevich and William J. Bowman
Talk at Workshops on Higher-order Programming with Effects (HOPE 2023).
In this work, we explore Landin’s Knot, which is understood as a pattern for encoding general recursion, including non-termination, that is possible after adding higher-order references to an otherwise terminating language. We observe that this isn’t always true—
Is Sized Typing for Coq Practical?.
Jonathan Chan and Yufeng Li and William J. Bowman
Journal of Functional Programming. 2023.
Contemporary proof assistants such as Coq require that recursive functions be terminating and core- cursive functions be productive to maintain logical consistency of their type theories, and some ensure these properties using syntactic checks. However, being syntactic, they are inherently delicate and restrictive, preventing users from easily writing obviously terminating or productive functions at their whim.
Meanwhile, there exist many sized type theories that perform type-based termination and produc- tivity checking, including theories based on the Calculus of (Co)Inductive Constructions (CIC), the core calculus underlying Coq. These theories are more robust and compositional in comparison. So why haven’t they been adapted to Coq?
In this paper, we venture to answer this question with CIC∗, a sized type theory based on CIC. It extends past work on sized types in CIC with additional Coq features such as global and local definitions. We also present a corresponding size inference algorithm and implement it within Coq’s kernel; for maximal backward compatibility with existing Coq developments, it requires no additional annotations from the user.
In our evaluation of the implementation, we find a severe performance degradation when compil- ing parts of the Coq standard library, inherent to the algorithm itself. We conclude that if we wish to maintain backward compatibility, using size inference as a replacement for syntactic checking is impractical in terms of performance.Abstract |
Open Access DOI |
arXiv |
Artifact
ANF Preserves Dependent Types up to Extensional Equality.
Paulette Koronkevich, Ramon Rakow, Amal Ahmed, and William J. Bowman
Journal of Functional Programming. 2022.
Many programmers use dependently typed languages such as Coq to machine-verify
high-assurance software.
However, existing compilers for these languages provide no
guarantees after compiling, nor when linking after compilation.
Type-preserving compilers preserve guarantees encoded
in types, then use type checking to verify compiled code and ensure
safe linking with external code.
Unfortunately, standard compiler passes do not preserve the
dependent typing of commonly used (intensional) type theories.
This is because assumptions valid in simpler type systems no longer hold,
and intensional dependent type systems are highly sensitive to
syntactic changes, including compilation.
We develop an A-normal form (ANF) translation with
join-point optimization—
Macro-embedding Compiler
Intermediate Languages in Racket
William J. Bowman.
Full Paper, in Proc. of the Scheme Workshop 2022.
We present the design and implementation of a macro-embedding of a
family of compiler intermediate languages, from a Scheme-like language
to x86-64, into Racket. This embedding is used as part of a testing
framework for a compilers course to derive interpreters for all the
intermediate languages. The embedding implements features including
safe, functional abstractions as well as unsafe assembly features, and
the interactions between the two at various intermediate stages.
This paper aims to demonstrate language-oriented techniques and
abstractions for implementing (1) a large family of languages and (2)
interoperability between low- and high-level languages. The primary
strength of this approach is the high degree of code reuse and
interoperability compared to implementing each interpreter separately.
The design emphasizes modularity and compositionality of an open set of
language features by local macro expansion into a single host language,
rather than implementing a language pre-defined by a closed set of
features. This enables reuse from both the host language (Racket) and
between intermediate languages, and enables interoperability between
high- and low-level features, simplifying development of the
intermediate language semantics. It also facilitates extending or
redefining individual language features in intermediate languages, and
exposing multiple interfaces to the embedded languages.Abstract |
Scheme 2022 Talk |
Paper |
Software (archived)
Compilation as Multi-Language Semantics
William J. Bowman.
Talk at the Workshop on Principles of Secure Compilation (PriSC 2021).
Modeling interoperability between programs in different languages is a key
problem when modeling compositional and secure compilation, which has been
successfully addressed using multi-language semantics.
Unfortunately, existing models of compilation using multi-language semantics
define two variants of each compiler pass: a syntactic translation
on open terms, and a run-time translation of closed terms at multi-language
boundaries
We introduce a novel work-in-progress approach to uniformly model a compiler
entirely as a reduction system on open term in a multi-language semantics,
rather than as a syntactic translation.
This simultaneously defines the compiler and the interoperability semantics,
reducing duplication.
It also provides interesting semantic insights.
Normalization of the cross-language redexes performs ahead-of-time (AOT)
compilation.
Evaluation in the multi-language models just-in-time (JIT) compilation.
Confluence of multi-language reduction implies compiler correctness.
Subject reduction of the multi-language reduction implies type-preservation of
the compiler.
This model provides a strong attacker model through contextual equivalence,
retaining its usefulness for modeling secure compilation as full abstraction.Abstract |
PriSC 2021 Talk |
Extended Abstract
Dependent Type Systems as Macros.
Stephen Chang, Michael Ballantyne, Milo Turner, William J. Bowman
In Proc. of the Symposium on Principles of Programming
Languages (POPL 2020).
We present Turnstile+, a high-level, macros-based metaDSL for building
dependently typed languages.
With it, programmers may rapidly prototype and iterate on the design of new
dependently typed features and extensions.
Or they may create entirely new DSLs whose dependent type “power” is tailored
to a specific domain.
Our framework’s support of language-oriented programming also makes it suitable
for experimenting with systems of interacting components, e.g., a proof
assistant and its companion DSLs.
This paper explains the implementation details of Turnstile+, as well as how it
may be used to create a wide-variety of dependently typed languages, from a
lightweight one with indexed types, to a full spectrum proof assistant, complete
with a tactic system and extensions for features like sized types and SMT
interaction.Abstract |
Paper |
Artifact |
GitHub (Turnstile+) |
GitHub (Cur)
Compiling with Dependent Types.
William J. Bowman.
Northeastern University, Feb. 2019.
Dependently typed languages have proven useful for developing large-scale fully
verified software, but we do not have any guarantees after compiling that
verified software.
A verified program written in a dependently typed language, such as Coq, can be
type checked to ensure that the program meets its specification.
Similarly, type checking prevents us from importing a library and
violating the specification declared by its types.
Unfortunately, we cannot perform either of these checks after compiling a
dependently typed program, since all current implementations erase types before
compiling the program.
Instead, we must trust the compiler to not introduce errors into the
verified code, and, after compilation, trust the programmer to never introduce
errors by linking two incompatible program components.
As a result, the compiled and linked program is
not verified—
In this dissertation, I develop a theory for preserving dependent types through
compilation so that we can use type checking after compilation to check that no
errors are introduced by the compiler or by linking.
Type-preserving compilation is a well-known technique that has been used to
design compilers for non-dependently typed languages, such as ML, that
statically enforce safety and security guarantees in compiled code.
But there are many open challenges in scaling type preservation to dependent
types.
The key problems are adapting syntactic type systems to interpret low-level
representations of code, and breaking the complex mutually recursive structure
of dependent type systems to make proving type preservation and compiler
correctness feasible.
In this dissertation, I explain the concepts required to scale type preservation
to dependent types, present a proof architecture and language design that
support type preservation, and prove type preservation and compiler correctness
for four early-stage compiler translations of a realistic dependently typed
calculus.
These translations include an A-normal form (ANF), a continuation-passing style
(CPS), an abstract closure conversion, and a parametric closure conversion
translation.Abstract |
PDF |
Slides (keynote) |
Slides (PDF) |
Source (GitHub)
Typed Closure Conversion of the Calculus of Constructions.
William J. Bowman and Amal Ahmed
In Proc. of the Conference on Programming Language Implementation and
Design (PLDI 2018).
Dependently typed languages such as Coq are used to specify and verify the full functional
correctness of source programs.
Type-preserving compilation can be used to preserve these specifications and proofs of correctness
through compilation into the generated target-language programs.
Unfortunately, type-preserving compilation of dependent types is hard.
In essence, the problem is that dependent type systems are designed around high-level compositional abstractions
to decide type checking, but compilation interferes with the type-system rules for reasoning about
run-time terms.
We develop a type-preserving closure-conversion translation from the Calculus of Constructions
(CC) with strong dependent pairs (Σ types)—
Type-Preserving CPS Translation of Σ and Π Types is Not Not Possible.
William J. Bowman, Youyou Cong, Nick Rioux, and Amal Ahmed
In Proc. of the Symposium on Principles of Programming Languages (POPL 2018)
Dependently typed languages such as Coq are used to specify and prove functional correctness of source
programs, but what we ultimately need are guarantees about correctness of compiled code.
By preserving dependent types through each compiler pass, we could preserve source-level
specifications and correctness proofs into the generated target-language programs.
Unfortunately, type-preserving compilation of dependent types is a challenging problem.
In 2002, Barthe and Uustalu showed that type-preserving CPS is \emph{not possible} for languages such as
Coq.
Specifically, they showed that for strong dependent pairs ($\Sigma$ types), the standard typed
call-by-name CPS is \emph{not type preserving}.
They further proved that for dependent case analysis on sums, a class of typed CPS
translations—
In this paper, we prove that type-preserving CPS translation for dependently typed languages is
\emph{not} not possible.
We develop both call-by-name and call-by-value CPS translations from the Calculus of Constructions
with both $\Pi$ and $\Sigma$ types (CC) to a dependently typed target language, and prove type
preservation and compiler correctness of each translation.
Our target language is CC extended with an additional equivalence rule and an additional typing rule,
which we prove consistent by giving a model in the extensional Calculus of Constructions.
Our key observation is that we can use a CPS translation that employs \emph{answer-type polymorphism},
where CPS-translated computations have type $\forall \alpha. (A \rightarrow \alpha) \rightarrow
\alpha$.
This type justifies, by a \emph{free theorem}, the new equality rule in our target language and allows
us to recover the term/type equivalences that CPS translation disrupts.
Finally, we conjecture that our translation extends to dependent case analysis on sums, despite the
impossibility result, and provide a proof sketch.Abstract |
Paper |
Technical Appendix |
POPL 2018 Talk (by me) |
POPL 2018 Lightning Talk (by me) |
Slides |
Supplementary Materials
Dependently Typed Assembly and Secure Linking (short talk)
William J. Bowman.
Talk at the Workshop on Principles of Secure Compilation (PriSC 2018).
Type-preserving compilation is used to statically enforce safety and security properties through
type checking.
The idea is to design strongly typed compiler target languages, preserve type information through the
compiler, then use the types in the target language to enforce invariants when linking with untrusted
code.
Unfortunately, this technique is limited by the expressiveness of the target type system, and existing
simple and polymorphic typed assembly languages cannot express all security invariants we wish to
enforce.
Dependent types could be used to express safety, security, and full functional correctness invariants.
In this talk, I briefly describe work-in-progress on developing a dependently typed assembly, and how
it could be used to statically enforce security guarantees when linking.Abstract |
Slides
Only Control Effects and Dependent Types.
Youyou Cong, William J. Bowman.
Talk at the Workshop on Higher-order Programming with Effects (HOPE
2017).
Abstract |
GitHub
Growing a Proof Assistant.
William J. Bowman.
Talk at the Workshop on Higher-order Programming with Effects (HOPE
2016).
Sophisticated domain-specific and user-defined notation is widely used in
formal models, but is poorly supported by proof assistants.
Many proof assistants support simple notation definitions, but no proof
assistant enables users to conveniently define sophisticated
notation.
For instance, in modeling a programming language, we often define infix
relations such as Γ ⊢ e : t and use BNF notation to specify the syntax
of the language.
In a proof assistant like Coq or Agda, users can easily define the notation
for Γ ⊢ e : t, but to use BNF notation the user must use a preprocessing
tool external to the proof assistant, which is cumbersome.
To support sophisticated user-defined notation, we propose to use
language extension as a fundamental part of the design of a proof
assistant.
We describe how to design a language-extension systems that support safe,
convenient, and sophisticated user-defined extensions, and how to design a
proof assistant based on language extension.
We evaluate this design by building a proof assistant that features a small
dependent type theory as the core language and implementing the following
extensions in small user-defined libraries: pattern matching for inductive
types, dependently-typed staged meta-programming, a tactic-based proof
language, and BNF and inference-rule notation for inductive type definitions.Abstract |
Draft Paper |
HOPE 2016 Talk (by me) |
GitHub
Fully Abstract Compilation via Universal Embedding.
Max New, William J. Bowman, and Amal Ahmed.
In Proc. of the International Conference on Functional
Programming (ICFP 2016)
A fully abstract compiler guarantees that two source components
are observationally equivalent in the source language if and only if
their translations are observationally equivalent in the target.
Full abstraction implies the translation is secure: target-language
attackers can make no more observations of a compiled component than a
source-language attacker interacting with the original source
component.
Proving full abstraction for realistic compilers is challenging because
realistic target languages contain features (such as control effects)
unavailable in the source, while proofs of full abstraction require showing
that every target context to which a compiled component may be linked can be
back-translated to a behaviorally equivalent source context.
We prove the first full abstraction result for a translation whose target
language contains exceptions, but the source does not.
Our translation—
Noninterference for Free.
William J. Bowman, and Amal Ahmed.
In Proc. of the International Conference on Functional
Programming (ICFP 2015)
Abadi et. al. (1999) introduced the dependency core calculus
(DCC) as a framework for studying a variety of dependency analyses
(e.g., secure information flow). The key property provided by DCC is
noninterference, which guarantees that a low-level observer
(attacker) cannot distinguish high-level (protected) computations.
The proof of noninterference for DCC suggests a connection to
parametricity in System F, which suggests that it should be possible
to implement dependency analyses in languages with parametric
polymorphism.
In this paper, we present a translation from DCC into Fω and
prove that the translation preserves noninterference. To express
noninterference in Fω we define a notion of observer-sensitive
equivalence that makes essential use of both first-order and
higher-order polymorphism. Our translation provides insights into
DCC’s type system and shows how DCC can be implemented in a
polymorphic language without loss of the security/noninterference
guarantees available in DCC. Our contributions include proof
techniques that should be valuable when proving other secure
compilation or full abstraction results.Abstract |
Paper |
Technical Appendix |
ICFP 2015 Talk (by me) |
Slides |
Author-Izer
Profile-Guided Meta-Programming.
William J. Bowman, Swaha Miller, Vincent St-Amour, and R. Kent Dybvig.
In Proc. of the Conference on Programming Language Implementation and
Design (PLDI 2015).
Contemporary compiler systems such as GCC, .NET, and LLVM incorporate
profile-guided optimizations (PGOs) on low-level intermediate code and
basic blocks, with impressive results over purely static heuristics.
Recent work shows that profile information is also useful for performing
source-to-source optimizations via meta-programming.
For example, using profiling information to inform decisions about data
structures and algorithms can potentially lead to asymptotic
improvements in performance.
We present a design for profile-guided meta-programming in a
general-purpose meta-programming system.
Our design is parametric over the particular profiler and
meta-programming system.
We implement this design in two different meta-programming systems—
Dagger Traced Symmetric Monoidal Categories and Reversible Programming.
William J. Bowman, Roshan P. James, and Amr Sabry.
In Proc. of the 4th Workshop on Reversible Computation (RC
2011).
Paper |
Code
Talks
Compilation as Multi-Language Semantics
William J. Bowman
Modeling interoperability between programs in different languages is a key
problem when modeling verified and secure compilation, which has been
successfully addressed using multi-language semantics.
Unfortunately, existing models of compilation using multi-language semantics
define two variants of each compiler pass: a syntactic translation
on open terms to model compilation, and a run-time translation of closed terms
at multi-language boundaries to model interoperability.
In this talk, I discuss work-in-progress approach to uniformly model a compiler
entirely as a reduction system on open term in a multi-language semantics,
rather than as a syntactic translation.
This simultaneously defines the compiler and the interoperability semantics,
reducing duplication.
It also provides interesting semantic insights.
Normalization of the cross-language redexes performs ahead-of-time (AOT)
compilation.
Evaluation in the multi-language models just-in-time (JIT) compilation.
Confluence of multi-language reduction implies compiler correctness, and
part of the secure compilation proof (full abstraction), enabling focus on the difficult part of the proof.
Subject reduction of the multi-language reduction implies type-preservation of
the compiler.Abstract |
Video |
Slides (PDF) |
GitHub Repository
Cur: Designing a Less Devious Proof
Assistant
William J. Bowman
Dijkstra said that our tools can have a profound and devious influence on our
thinking. I find this especially true of modern proof assistants, with "devious"
out-weighing "profound". Cur is an experiment in design that aims to be less
devious. The design emphasizes language extension, syntax manipulation, and DSL
construction and integration. This enables the user to be in charge of how they
think, rather than requiring the user to contort their thinking to that of the
proof assistant. In this talk, my goal is to convince you that you want similar
capabilities in a proof assistant, and explain and demonstrate Cur’s attempt at
solving the problem.Abstract |
Video |
Slides (ODP) |
Slides (Google Slides) |
Demo Code |
Cur GitHub
Do Compilers Respect Programmers?
William J. Bowman
Video |
Keynote |
PDF
Other
Toward Type Preserving Compilation of Coq.
William J. Bowman.
POPL 2017 Student Research Competition
Extended Abstract |
Poster