Publications

Papers (Chronological)

A Low-Level Look at A-normal Form.
William J. Bowman
Proc. of the ACM on Programming Languages (PACMPL). OOPSLA. 2024.
A-normal form (ANF) is a widely studied intermediate form in which local control and data flow is made explicit in syntax, and a normal form in which many programs with equivalent control-flow graphs have a single normal syntactic representation. However, ANF is difficult to implement effectively and, as we formalize, difficult to extend with new lexically scoped constructs such as scoped region-based allocation. The problem, as has often been observed, is that normalization of commuting conversions is hard.

This traditional view of ANF that normalizing commuting conversions is hard, found in formal models and informed by high-level calculi, is wrong. By studying the low-level intensional aspects of ANF, we can derive a normal form in which normalizing commuting conversion is easy, does not require join points, or code duplication, or renormalization after inlining, and is easily extended with new lexically scoped effects. We formalize the connection between ANF and monadic form and their intensional properties, derive an imperative ANF, and design a compiler pipeline from an untyped lambda-calculus with scoped regions, to monadic form, to a low-level imperative monadic form in which A-normalization is trivial and safe for regions. We prove that any such compiler preserves, or optimizes, stack and memory behaviour compared to ANF. Our formalization reconstructs and systematizes pragmatic choices found in practice, including current production-ready compilers.

The main take-away from this work is that, in general, monadic form should be preferred over ANF, and A-normalization should only be done in a low-level imperative intermediate form. This maximizes the advantages of each form, and avoids all the standard problems with ANF.
Preprint | Artifact

Type Universes as Allocation Effects.
Paulette Koronkevich and William J. Bowman
2024.
In this paper, we explore a connection between type universes and memory allocation. Type universe hierarchies are used in dependent type theories to ensure consistency, by forbidding a type from quantifying over all types. Instead, the types of types (universes) form a hierarchy, and a type can only quantify over types in other universes (with some exceptions), restricting cyclic reasoning in proofs. We present a perspective where universes also describe where values are allocated in the heap, and the choice of universe algebra imposes a structure on the heap overall. The resulting type system provides a simple declarative system for reasoning about and restricting memory allocation, without reasoning about reads or writes. We present a theoretical framework for equipping a type system with higher-order references restricted by a universe hierarchy, and conjecture that many existing universe algebras give rise to interesting systems for reasoning about allocation. We present 3 instantiations of this approach to enable reasoning about allocation in the simply typed λ-calculus: (1) the standard ramified universe hierarchy, which we prove guarantees termination of the language extended with higher-order references by restricting cycles in the heap; (2) an extension with an impredicative base universe, which we conjecture enables full-ground references (with terminating computation but cyclic ground data structures); (3) an extension with universe polymorphism, which divides the heap into fine-grained regions.
arXiv

Indexed Types for a Statically Safe WebAssembly.
Adam T. Geller and Justin P. Frank and William J. Bowman
Proc. of the ACM on Programming Languages (PACMPL). POPL. 2024.
We present Wasm-precheck, a superset of WebAssembly (Wasm) that uses indexed types to express and check simple constraints over program values. This additional static reasoning enables safely removing dynamic safety checks required by Wasm, such as memory bounds checks. We implement Wasm-precheck as an extension of the Wasmtime compiler and runtime, evaluate the run-time and compile-time performance of Wasm-precheck vs Wasm, and find an average run-time performance gain of 1.71x faster in the widely used PolyBenchC benchmark suite, for a small overhead in binary size (7.18% larger) and type-checking time (1.4% slower). We also prove type and memory safety of Wasm-precheck, prove Wasm safely embeds into Wasm-precheck ensuring backwards compatibility, prove Wasm-precheck type-erases to Wasm, and discuss design and implementation trade-offs.
Paper (preprint) | Open Access DOI | Artifact DOI

One Weird Trick to Untie Landin’s Knot
Paulette Koronkevich and William J. Bowman
Talk at the Workshop on Higher-order Programming with Effects (HOPE 2023).
In this work, we explore Landin’s Knot, which is understood as a pattern for encoding general recursion, including non-termination, that is possible after adding higher-order references to an otherwise terminating language. We observe that this isn’t always true: higher-order references, by themselves, don’t lead to non-termination. The key insight is that Landin’s Knot relies not primarily on references storing functions, but on unrestricted quantification over a function’s environment. We show this through a closure converted language, in which the function’s environment is made explicit and hides the type of the environment through impredicative quantification. Once references are added, this impredicative quantification can be exploited to encode recursion. We conjecture that by restricting the quantification over the environment, higher-order references can be safely added to terminating languages, without resorting to more complex type systems such as linearity, and without restricting references from storing functions.
Extended Abstract

Is Sized Typing for Coq Practical?
Jonathan Chan and Yufeng Li and William J. Bowman
Journal of Functional Programming. 2023.
Contemporary proof assistants such as Coq require that recursive functions be terminating and corecursive functions be productive to maintain logical consistency of their type theories, and some ensure these properties using syntactic checks. However, being syntactic, they are inherently delicate and restrictive, preventing users from easily writing obviously terminating or productive functions at their whim.

Meanwhile, there exist many sized type theories that perform type-based termination and productivity checking, including theories based on the Calculus of (Co)Inductive Constructions (CIC), the core calculus underlying Coq. These theories are more robust and compositional in comparison. So why haven’t they been adapted to Coq?

In this paper, we venture to answer this question with CIC∗, a sized type theory based on CIC. It extends past work on sized types in CIC with additional Coq features such as global and local definitions. We also present a corresponding size inference algorithm and implement it within Coq’s kernel; for maximal backward compatibility with existing Coq developments, it requires no additional annotations from the user.

In our evaluation of the implementation, we find a severe performance degradation when compiling parts of the Coq standard library, inherent to the algorithm itself. We conclude that if we wish to maintain backward compatibility, using size inference as a replacement for syntactic checking is impractical in terms of performance.
Open Access DOI | arXiv | Artifact

ANF Preserves Dependent Types up to Extensional Equality.
Paulette Koronkevich, Ramon Rakow, Amal Ahmed, and William J. Bowman
Journal of Functional Programming. 2022.
Many programmers use dependently typed languages such as Coq to machine-verify high-assurance software. However, existing compilers for these languages provide no guarantees after compiling, nor when linking after compilation. Type-preserving compilers preserve guarantees encoded in types, then use type checking to verify compiled code and ensure safe linking with external code. Unfortunately, standard compiler passes do not preserve the dependent typing of commonly used (intensional) type theories. This is because assumptions valid in simpler type systems no longer hold, and intensional dependent type systems are highly sensitive to syntactic changes, including compilation. We develop an A-normal form (ANF) translation with join-point optimization (a standard translation for making control flow explicit in functional languages) from the Extended Calculus of Constructions (ECC) with dependent elimination of booleans and natural numbers (a representative subset of Coq). Our dependently typed target language has equality reflection, allowing the type system to encode semantic equality of terms. This is key to proving type preservation and correctness of separate compilation for this translation. This is the first ANF translation for dependent types. Unlike related translations, it supports the universe hierarchy, and does not rely on parametricity or impredicativity.
Open Access DOI

Macro-embedding Compiler Intermediate Languages in Racket
William J. Bowman.
Full Paper, in Proc. of the Scheme Workshop 2022.
We present the design and implementation of a macro-embedding of a family of compiler intermediate languages, from a Scheme-like language to x86-64, into Racket. This embedding is used as part of a testing framework for a compilers course to derive interpreters for all the intermediate languages. The embedding implements features including safe, functional abstractions as well as unsafe assembly features, and the interactions between the two at various intermediate stages.

This paper aims to demonstrate language-oriented techniques and abstractions for implementing (1) a large family of languages and (2) interoperability between low- and high-level languages. The primary strength of this approach is the high degree of code reuse and interoperability compared to implementing each interpreter separately. The design emphasizes modularity and compositionality of an open set of language features by local macro expansion into a single host language, rather than implementing a language pre-defined by a closed set of features. This enables reuse from both the host language (Racket) and between intermediate languages, and enables interoperability between high- and low-level features, simplifying development of the intermediate language semantics. It also facilitates extending or redefining individual language features in intermediate languages, and exposing multiple interfaces to the embedded languages.
Scheme 2022 Talk | Paper | Software (archived)

Compilation as Multi-Language Semantics
William J. Bowman.
Talk at the Workshop on Principles of Secure Compilation (PriSC 2021).
Modeling interoperability between programs in different languages is a key problem when modeling compositional and secure compilation, which has been successfully addressed using multi-language semantics. Unfortunately, existing models of compilation using multi-language semantics define two variants of each compiler pass: a syntactic translation on open terms, and a run-time translation of closed terms at multi-language boundaries.

We introduce a novel work-in-progress approach to uniformly model a compiler entirely as a reduction system on open terms in a multi-language semantics, rather than as a syntactic translation. This simultaneously defines the compiler and the interoperability semantics, reducing duplication. It also provides interesting semantic insights. Normalization of the cross-language redexes performs ahead-of-time (AOT) compilation. Evaluation in the multi-language models just-in-time (JIT) compilation. Confluence of multi-language reduction implies compiler correctness. Subject reduction of the multi-language reduction implies type-preservation of the compiler. This model provides a strong attacker model through contextual equivalence, retaining its usefulness for modeling secure compilation as full abstraction.
PriSC 2021 Talk | Extended Abstract

Dependent Type Systems as Macros.
Stephen Chang, Michael Ballantyne, Milo Turner, William J. Bowman
In Proc. of the Symposium on Principles of Programming Languages (POPL 2020).
We present Turnstile+, a high-level, macros-based metaDSL for building dependently typed languages. With it, programmers may rapidly prototype and iterate on the design of new dependently typed features and extensions. Or they may create entirely new DSLs whose dependent type “power” is tailored to a specific domain. Our framework’s support of language-oriented programming also makes it suitable for experimenting with systems of interacting components, e.g., a proof assistant and its companion DSLs. This paper explains the implementation details of Turnstile+, as well as how it may be used to create a wide variety of dependently typed languages, from a lightweight one with indexed types, to a full spectrum proof assistant, complete with a tactic system and extensions for features like sized types and SMT interaction.
Paper | Artifact | GitHub (Turnstile+) | GitHub (Cur)

Compiling with Dependent Types.
William J. Bowman.
Northeastern University, Feb. 2019.
Dependently typed languages have proven useful for developing large-scale fully verified software, but we do not have any guarantees after compiling that verified software. A verified program written in a dependently typed language, such as Coq, can be type checked to ensure that the program meets its specification. Similarly, type checking prevents us from importing a library and violating the specification declared by its types. Unfortunately, we cannot perform either of these checks after compiling a dependently typed program, since all current implementations erase types before compiling the program. Instead, we must trust the compiler to not introduce errors into the verified code, and, after compilation, trust the programmer to never introduce errors by linking two incompatible program components. As a result, the compiled and linked program is not verified: we have no guarantees about what it will do.

In this dissertation, I develop a theory for preserving dependent types through compilation so that we can use type checking after compilation to check that no errors are introduced by the compiler or by linking. Type-preserving compilation is a well-known technique that has been used to design compilers for non-dependently typed languages, such as ML, that statically enforce safety and security guarantees in compiled code. But there are many open challenges in scaling type preservation to dependent types. The key problems are adapting syntactic type systems to interpret low-level representations of code, and breaking the complex mutually recursive structure of dependent type systems to make proving type preservation and compiler correctness feasible. In this dissertation, I explain the concepts required to scale type preservation to dependent types, present a proof architecture and language design that support type preservation, and prove type preservation and compiler correctness for four early-stage compiler translations of a realistic dependently typed calculus. These translations include an A-normal form (ANF), a continuation-passing style (CPS), an abstract closure conversion, and a parametric closure conversion translation.
PDF | Slides (keynote) | Slides (PDF) | Source (GitHub)

Typed Closure Conversion of the Calculus of Constructions.
William J. Bowman and Amal Ahmed
In Proc. of the Conference on Programming Language Design and Implementation (PLDI 2018).
Dependently typed languages such as Coq are used to specify and verify the full functional correctness of source programs. Type-preserving compilation can be used to preserve these specifications and proofs of correctness through compilation into the generated target-language programs. Unfortunately, type-preserving compilation of dependent types is hard. In essence, the problem is that dependent type systems are designed around high-level compositional abstractions to decide type checking, but compilation interferes with the type-system rules for reasoning about run-time terms.

We develop a type-preserving closure-conversion translation from the Calculus of Constructions (CC) with strong dependent pairs (Σ types)— a subset of the core language of Coq— to a type-safe, dependently typed compiler intermediate language named CC-CC. The central challenge in this work is how to translate the source type-system rules for reasoning about functions into target type-system rules for reasoning about closures. To justify these rules, we prove soundness of CC-CC by giving a model in CC. In addition to type preservation, we prove correctness of separate compilation.
Paper | Technical Appendix | PLDI 2018 Talk (by me) | Slides

Type-Preserving CPS Translation of Σ and Π Types is Not Not Possible.
William J. Bowman, Youyou Cong, Nick Rioux, and Amal Ahmed
In Proc. of the Symposium on Principles of Programming Languages (POPL 2018).
Dependently typed languages such as Coq are used to specify and prove functional correctness of source programs, but what we ultimately need are guarantees about correctness of compiled code. By preserving dependent types through each compiler pass, we could preserve source-level specifications and correctness proofs into the generated target-language programs. Unfortunately, type-preserving compilation of dependent types is a challenging problem. In 2002, Barthe and Uustalu showed that type-preserving CPS is not possible for languages such as Coq. Specifically, they showed that for strong dependent pairs (Σ types), the standard typed call-by-name CPS is not type preserving. They further proved that for dependent case analysis on sums, a class of typed CPS translations, including the standard translation, is not possible. In 2016, Morrisett noticed a similar problem with the standard call-by-value CPS translation for dependent functions (Π types). In essence, the problem is that the standard typed CPS translation by double-negation, in which computations are assigned types of the form (A → ⊥) → ⊥, disrupts the term/type equivalence that is used during type checking in a dependently typed language.

In this paper, we prove that type-preserving CPS translation for dependently typed languages is not not possible. We develop both call-by-name and call-by-value CPS translations from the Calculus of Constructions with both Π and Σ types (CC) to a dependently typed target language, and prove type preservation and compiler correctness of each translation. Our target language is CC extended with an additional equivalence rule and an additional typing rule, which we prove consistent by giving a model in the extensional Calculus of Constructions. Our key observation is that we can use a CPS translation that employs answer-type polymorphism, where CPS-translated computations have type ∀ α. (A → α) → α. This type justifies, by a free theorem, the new equality rule in our target language and allows us to recover the term/type equivalences that CPS translation disrupts. Finally, we conjecture that our translation extends to dependent case analysis on sums, despite the impossibility result, and provide a proof sketch.
Paper | Technical Appendix | POPL 2018 Talk (by me) | POPL 2018 Lightning Talk (by me) | Slides | Supplementary Materials

Dependently Typed Assembly and Secure Linking (short talk)
William J. Bowman.
Talk at the Workshop on Principles of Secure Compilation (PriSC 2018).
Type-preserving compilation is used to statically enforce safety and security properties through type checking. The idea is to design strongly typed compiler target languages, preserve type information through the compiler, then use the types in the target language to enforce invariants when linking with untrusted code. Unfortunately, this technique is limited by the expressiveness of the target type system, and existing simple and polymorphic typed assembly languages cannot express all security invariants we wish to enforce. Dependent types could be used to express safety, security, and full functional correctness invariants. In this talk, I briefly describe work-in-progress on developing a dependently typed assembly, and how it could be used to statically enforce security guarantees when linking.
Slides

Only Control Effects and Dependent Types.
Youyou Cong, William J. Bowman.
Talk at the Workshop on Higher-order Programming with Effects (HOPE 2017).
Abstract | GitHub

Growing a Proof Assistant.
William J. Bowman.
Talk at the Workshop on Higher-order Programming with Effects (HOPE 2016).
Sophisticated domain-specific and user-defined notation is widely used in formal models, but is poorly supported by proof assistants. Many proof assistants support simple notation definitions, but no proof assistant enables users to conveniently define sophisticated notation. For instance, in modeling a programming language, we often define infix relations such as Γ ⊢ e : t and use BNF notation to specify the syntax of the language. In a proof assistant like Coq or Agda, users can easily define the notation for Γ ⊢ e : t, but to use BNF notation the user must use a preprocessing tool external to the proof assistant, which is cumbersome.

To support sophisticated user-defined notation, we propose to use language extension as a fundamental part of the design of a proof assistant. We describe how to design language-extension systems that support safe, convenient, and sophisticated user-defined extensions, and how to design a proof assistant based on language extension. We evaluate this design by building a proof assistant that features a small dependent type theory as the core language and implementing the following extensions in small user-defined libraries: pattern matching for inductive types, dependently-typed staged meta-programming, a tactic-based proof language, and BNF and inference-rule notation for inductive type definitions.
Draft Paper | HOPE 2016 Talk (by me) | GitHub

Fully Abstract Compilation via Universal Embedding.
Max New, William J. Bowman, and Amal Ahmed.
In Proc. of the International Conference on Functional Programming (ICFP 2016).
A fully abstract compiler guarantees that two source components are observationally equivalent in the source language if and only if their translations are observationally equivalent in the target. Full abstraction implies the translation is secure: target-language attackers can make no more observations of a compiled component than a source-language attacker interacting with the original source component. Proving full abstraction for realistic compilers is challenging because realistic target languages contain features (such as control effects) unavailable in the source, while proofs of full abstraction require showing that every target context to which a compiled component may be linked can be back-translated to a behaviorally equivalent source context.

We prove the first full abstraction result for a translation whose target language contains exceptions, but the source does not. Our translation— specifically, closure conversion of simply typed λ-calculus with recursive types— uses types at the target level to ensure that a compiled component is never linked with attackers that have more distinguishing power than source-level attackers. We present a new back-translation technique based on a deep embedding of the target language into the source language at a dynamic type. Then boundaries are inserted that mediate terms between the untyped embedding and the strongly-typed source. This technique allows back-translating non-terminating programs, target features that are untypeable in the source, and well-bracketed effects.
Paper | Technical Appendix | ICFP 2016 Talk (by Max New) | Author-Izer

Noninterference for Free.
William J. Bowman and Amal Ahmed.
In Proc. of the International Conference on Functional Programming (ICFP 2015).
Abadi et al. (1999) introduced the dependency core calculus (DCC) as a framework for studying a variety of dependency analyses (e.g., secure information flow). The key property provided by DCC is noninterference, which guarantees that a low-level observer (attacker) cannot distinguish high-level (protected) computations. The proof of noninterference for DCC suggests a connection to parametricity in System F, which suggests that it should be possible to implement dependency analyses in languages with parametric polymorphism.

In this paper, we present a translation from DCC into Fω and prove that the translation preserves noninterference. To express noninterference in Fω we define a notion of observer-sensitive equivalence that makes essential use of both first-order and higher-order polymorphism. Our translation provides insights into DCC’s type system and shows how DCC can be implemented in a polymorphic language without loss of the security/noninterference guarantees available in DCC. Our contributions include proof techniques that should be valuable when proving other secure compilation or full abstraction results.
Paper | Technical Appendix | ICFP 2015 Talk (by me) | Slides | Author-Izer

Profile-Guided Meta-Programming.
William J. Bowman, Swaha Miller, Vincent St-Amour, and R. Kent Dybvig.
In Proc. of the Conference on Programming Language Design and Implementation (PLDI 2015).
Contemporary compiler systems such as GCC, .NET, and LLVM incorporate profile-guided optimizations (PGOs) on low-level intermediate code and basic blocks, with impressive results over purely static heuristics. Recent work shows that profile information is also useful for performing source-to-source optimizations via meta-programming. For example, using profiling information to inform decisions about data structures and algorithms can potentially lead to asymptotic improvements in performance.

We present a design for profile-guided meta-programming in a general-purpose meta-programming system. Our design is parametric over the particular profiler and meta-programming system. We implement this design in two different meta-programming systems, the syntactic extension systems of Chez Scheme and Racket, and provide several profile-guided meta-programs as usability case studies.
Paper | Slides | GitHub | Author-Izer

Dagger Traced Symmetric Monoidal Categories and Reversible Programming.
William J. Bowman, Roshan P. James, and Amr Sabry.
In Proc. of the 4th Workshop on Reversible Computation (RC 2011).
Paper | Code

Talks

Compilation as Multi-Language Semantics
William J. Bowman
Modeling interoperability between programs in different languages is a key problem when modeling verified and secure compilation, which has been successfully addressed using multi-language semantics. Unfortunately, existing models of compilation using multi-language semantics define two variants of each compiler pass: a syntactic translation on open terms to model compilation, and a run-time translation of closed terms at multi-language boundaries to model interoperability. In this talk, I discuss a work-in-progress approach to uniformly model a compiler entirely as a reduction system on open terms in a multi-language semantics, rather than as a syntactic translation. This simultaneously defines the compiler and the interoperability semantics, reducing duplication. It also provides interesting semantic insights. Normalization of the cross-language redexes performs ahead-of-time (AOT) compilation. Evaluation in the multi-language models just-in-time (JIT) compilation. Confluence of multi-language reduction implies compiler correctness, and part of the secure compilation proof (full abstraction), enabling focus on the difficult part of the proof. Subject reduction of the multi-language reduction implies type-preservation of the compiler.
Video | Slides (PDF) | GitHub Repository

Cur: Designing a Less Devious Proof Assistant
William J. Bowman
Dijkstra said that our tools can have a profound and devious influence on our thinking. I find this especially true of modern proof assistants, with "devious" out-weighing "profound". Cur is an experiment in design that aims to be less devious. The design emphasizes language extension, syntax manipulation, and DSL construction and integration. This enables the user to be in charge of how they think, rather than requiring the user to contort their thinking to that of the proof assistant. In this talk, my goal is to convince you that you want similar capabilities in a proof assistant, and explain and demonstrate Cur’s attempt at solving the problem.
Video | Slides (ODP) | Slides (Google Slides) | Demo Code | Cur GitHub

Do Compilers Respect Programmers?
William J. Bowman
Video | Keynote | PDF

Other

Toward Type Preserving Compilation of Coq.
William J. Bowman.
POPL 2017 Student Research Competition
Extended Abstract | Poster