What is a model?

:: notes, research, academia

What is a model, particularly of a programming language? I’ve been struggling with this question a bit for some time. The word “model” is used a lot in my research area, and although I have successfully (by some metrics) read papers whose topic is models, used other peoples’ research on models, built models, and trained others to do all of this, I don’t really understand what a model is.

Before I get into a philosophical digression on what it even means to understand something, let’s ignore all that and try to discover what a model is from first principles.

Definitions of “model”

The apparent place to start to understand the meaning of a word is to read its definition. This is actually no help at all. There are lots of uses of the word “model”, with several definitions. Here are some.

Definition 0 In science and engineering, a model is “an abstract description of a concrete system using mathematical concepts and language”. See Wikipedia provides a nice introduction to this kind of model, and the Standard Encylopedia of Philosophy provides a nice explanation in the context of model theory, which will be relevant later in this post.

Definition 1 A syntactic model (of a type theory) is defined by Boulier, Pédrot, and Tabareau as a translation from one type theory into another that preserves typing, the definition of false, and definitional equivalence. This syntactic model enables the source type theory to inherit properties of the target type theory—such as consistency.

Definition 2 A model (of a vocabulary also called a language \sigma ) in the sense of model theory (as defined by Elements of Finite Model Theory) is a \sigma -structure (“also called a model”) defining a set A along with 3 sets providing interpretations of that vocabulary. These sets are Ic_A , which interprets each constant in \sigma as an element of A , IP_A , which interprets each n-ary predicate symbol or relation symbol from \sigma as an n-ary (set-theoretic) relation between elements of A , and If_A , which interprets each n-ary function symbol in \sigma as a (set-theoretic) function from n elements of A to an element of A .

Definition 3 The above definition is confusing, since it conflates structure and model, which the text later distinguishes with the following separate definition. A model (of a theory (over a vocabulary \sigma )) is a structure (“also called a model”) of vocabulary \sigma such that every sentence in the theory is interpreted in the structure to make the sentence true. (A theory is a set of sentences drawn from a vocabulary.) My rephrasing of the definition of model is intentionally confusing and difficult to parse, to make apparent the inherit confusingness created by the several layers of definitions and one definition that defines “model” using a second definition of “model”.

Definition 4 Nlab hosts an article with a much clarified definition, which distinguishes language, theory, structure, and model carefully. In particular, it is careful to only call structure the interpretation of the language (call vocabulary above), and only call model an interpretation that makes true the axioms composing the theory of the language.

Definition 5 Carlo Anguli once gave me the following definition of model:

A collection of interpretation functions that interpret every syntactic category in such that the original relationship is respected.

e.g.,
- interpret every context as a set,
- interpret every (non-dependent) type as a set, and
- interpret every term-of-a-type indexed-by-a-context as an element-of-the-interpretation-of-that-type indexed-by-elements-of-the-interpretation-of-that-context.

Implicit in this definition is that the interpretations must respect equality — because if you don’t respect equality of arguments then you’re not a function!

This definition seems to be close to Definition 2, as it doesn’t mention axioms and their interpretation. However, it might be Definition 3 instead, as there could be implicit in the definition of syntax an inclusion of all judgements of the programming language, and therefore in the phrase “such that the original relationship is respected” a requirement that axioms of those judgements become true.

Also implicit in this definition of model is what it is a model of. Perhaps a programming language, but it again depends on what syntax means and thus what “every syntactic category” refers to.

I’m interested in the requirement that the interpretation is a collection of functions, which seems to be missing or only implied in some model theory definitions of “model”.

So what is a model?

One of the first thing that jumps out to me after reviewing the above definitions is that to understand each definition, you have to reframe the definition of model into model [of what]. It really never makes sense to give a definition of merely “model”.

Definition 0 defines a model of a system [of the real world]. Definition 1 defines a model of a type theory. Definitions 2 and 3 give definitions of a model, in the sense of model theory, but of two different objects: model of a vocabulary (or language), which is more often called a structure, and model of a theory (which everyone seems to agree is a “model”). Definition 4 makes this distinction very clear. Definition 5 seems to use “model” in the model-theoretic sense, but has abstracted a bit away from a particular notion of theory and generalized to syntax.

What is a model of a programming language?

I’ve had two problems understanding the word “model” in the context of programming languages.

First, we use “model” in three different senses, and I have neither understood that nor understood the relationships between them.

  1. Model in the sense of an abstract description of a system. This is Definition 0. This sense of “model” means something like “mathematical description”. What we want is a description in which we can work using math, so we can make predictions about the real world. Ideally, the predictions we make will be true.
  2. Model in the strict sense of model theory. These are Definitions 2, 3, and 4. This sense of “model” is the closet to having a strict definition. It often carries a set-theoretic connotation, asking for a set defining the domain of values, and three interpretation functions that interpret specifics parts of a theory in specific ways.
  3. Model in the generalized sense, inheriting from or related to model theory. I hesitate to even call this distinct from the second sense, but I will anyway. I also hesitate to speculate about history—perhaps this sense actually predates model theory. But I distinguish it from the second sense, because it frequently generalizes away from the strict 3 category “constant symbol”, “predicate symbol”, “function symbol” specification and doesn’t seem beholden to set theory. Definitions 1 and 5 use “model” in this sense.

In the second sense of “model”, the first sense of the word remains—we’re still interested in a description of some system (the theory), and of using the model to make predictions or reason. However, since the theory is also mathematical, we can be more rigid about our reasoning requirements—axioms of the theory must be true of the model, and relationships must be preserved in the model. This is rarely true of a model of the real world; e.g., the Newtonian model of gravity works pretty well, until it doesn’t, so it’s a model that doesn’t quite make all axioms true or preserve all relationships.

The third sense seems closer to the idea of semantics, in the mathematical logic sense of the word as assigning meaning or interpretation to syntax. In this sense, the word “model” frequently avoid committing to set theory as a formal foundation, generalizes away from the three interpretation functions, and focuses instead on the relationships between uninterpreted syntax being preserved by the interpretation. For example, in Definition 1, the relationships of interest are well-typedness, definitional equivalence, and falsehood, and the formal foundation is type theory. Category theory seems to come closest to a complete formalization of this sense of the word “model”, although I’ve had a hell of a time understanding that. Nlab articles don’t say this explicitly, but reading between the lines in articles linked to from the Nlab article on model theory for the words syntax and semantics implies that the idea of syntax, i.e., uninterpreted symbols with relationships between themselves and judgements about them, can be formalized in category theory, and then so can the idea of semantics, i.e., providing an interpretation in some other domain of those uninterpreted symbols; a domain in which one can use all the power of the other domain to reason about the judgements one wishes to make about the uninterpreted symbols.

The second problem with the word “model” is that we frequently work with two senses simultaneously.

When I write down a programming language, I’m often trying to model (in the first sense) a real programming language (or some feature of it), one actual software developers use to make real things happen in the real world. I am not merely describing a mathematical object for study. (Okay, sometimes I do that, but usually to the first end, eventually.) When I write down such a model, I may describe the abstract syntax, the typing judgement, and an abstract machines or reduction rules. These form a pretty good mathematical description of how a real language behaves. A compiler will reject syntactically invalid expressions. It may then type check the abstract syntax tree, and reject some possibly semantically invalid expressions. If judged well typed, the compiler may transform the tree into something that runs, and that run-time behaviour can be predicted using the reduction rules.

However, for much programming languages work, I’m not interested in merely predicting the behaviour of a single program. I might want to predict behaviour or properties of the entire language, or its typing judgement, etc. To reason about single programs, the model (in the first sense) may work well. But it might not work well for, say, trying to decide whether certain types can even be inhabited. To solve this, we might build a model (in the second or third sense). We interpret the abstract syntax tree and typing judgement in some other domain. That is, the AST and the typing judgement, being a model in the first sense, form a theory in the model theoretic sense. We can then construct a model (in the second sense) of a model (in the first sense). The Standard Encyclopedia of Philosophy article on model theory goes into this in detail in the context of model theory, which is great.

What’s more interesting is how these two senses of model interact in programming languages. If one is interested in a model, in the second sense, it may inform how one develops a model (in the first sense). If I know I will want to construct a model (in the second sense) to reason about the typing judgement, I may decide that single-step reduction rules are actually irrelevant; I only care that certain program equivalences hold, really, and any implementation that has those equivalences suffices. So rather than create a model (in the first sense) with an abstract machine or small-step operational semantics, I’ll specify an equivalence judgement. This might give less predictive power about a real world implementation, but allow the predictions I do make to apply to many implementations.

If you see these patterns, you may have some insight into how the author is approaching their work, and in what senses they are using the word “model”.