Bidirectional Pure Type Systems

Posted on 2020-08-03 by Oleg Grenrus

#Introduction

These are my notes, where I write things down to try to clarify my own thoughts. All the mistakes are my own.

I try to show the rules for Pure Type Systems in a bidirectional type-checking style.

This post was motivated by me wondering why Conor McBride has sorts and function types as checkable types in his systems. For example, look at the I Got Plenty o’ Nuttin’ paper (definition 4 for syntax, and definition 17 for typing judgements in the linked version).

I present two variants of bidirectional pure type systems. In the first variant type formers are inferrable terms. It's slightly different from the one used in Lambda Pi: A tutorial implementation of a dependently typed lambda calculus, and generalized to an arbitrary (single sorted) PTS.

Type formers in the second system are checkable. This is a requirement for cumulative universes, where types of types are not unique.

#Pure Type Systems

This is a review of Barendregt (Lambda calculi with types, 1992).

A Pure Type System (PTS) is a type system with the following syntax

\begin{array}{lclr} A,B,t,s & {}\coloneqq{} & c & \qquad \text{constant} \\ & {}\mid{} & x & \qquad \text{variable} \\ & {}\mid{} & t\; s & \qquad \text{application} \\ & {}\mid{} & \lambda (x:A).t & \quad \text{abstraction} \\ & {}\mid{} & \forall (x:A).B & \quad \text{dependent function space} \\ \end{array}
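To make the syntax concrete, here is a minimal Haskell sketch of it (my own illustration, not the post's actual implementation; the constructor names are made up):

```haskell
-- A Pure Type System term, parameterized by the type of constants `c`.
-- Sorts are a subset of constants, so they share the same constructor.
data Term c
  = Constant c                    -- c                constant (possibly a sort)
  | Var String                    -- x                variable
  | App (Term c) (Term c)         -- t s              application
  | Lam String (Term c) (Term c)  -- \(x:A).t         abstraction
  | Pi  String (Term c) (Term c)  -- forall (x:A).B   dependent function space
  deriving (Eq, Show)
```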

The typing judgement of a PTS is parameterized by a specification \mathcal{S},\mathcal{A},\mathcal{R} :

  1. \mathcal{S} is a subset of constants, called sorts
  2. \mathcal{A} is a set of axioms in the form c : s , with c constant and s \in \mathcal{S}
  3. \mathcal{R} is a set of rules of the form (s_1, s_2, s_3) with s_1, s_2, s_3 \in \mathcal{S} .

Typical examples are simply typed lambda calculus (without products or sums), \lambda^\to

\begin{aligned} \mathcal{S} &= \{ \star, \square \} \\ \mathcal{A} &= \{ \star : \square \} \\ \mathcal{R} &= \{ (\star,\star,\star) \} \end{aligned}

or a predicative system with an infinite hierarchy of sorts (like in Agda), which some call I_\omega :

\begin{aligned} \mathcal{S} &= \{ \star_i \mid i \in \mathbb{N} \} \\ \mathcal{A} &= \{ \star_i : \star_{i+1} \mid i \in \mathbb{N} \} \\ \mathcal{R} &= \{ (\star_i,\star_j,\star_{\max(i,j)}) \mid i, j \in \mathbb{N} \} \end{aligned}

Barendregt shows a lot of properties of pure type systems. One important result is that if a PTS is single sorted:

  • (c : s_1), (c : s_2) \in \mathcal{A} \quad\Rightarrow\quad s_1 = s_2
  • (s_1, s_2, s_3), (s_1, s_2, s_4) \in \mathcal{R} \quad\Rightarrow\quad s_3 = s_4

(in other words, \mathcal{A} and \mathcal{R} are partial functions), then terms have unique types

\Gamma \vdash A : B_1 \land \Gamma \vdash A : B_2 \quad\Rightarrow\quad B_1 \mathrel{=_\beta} B_2.

A corollary of that property is that beta-reduction preserves types.

Both the \lambda^\to and I_\omega systems above are single sorted.
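Being single sorted also means both example specifications can be written down directly as partial functions. Here is a small Haskell sketch (my own illustration; the names Spec, lambdaArrow and iOmega are made up):

```haskell
import Numeric.Natural (Natural)

-- A single-sorted PTS specification: axioms and rules as partial functions.
data Spec s = Spec
  { axiom :: s -> Maybe s       -- the A of the specification
  , rule  :: s -> s -> Maybe s  -- the R of the specification
  }

-- lambda-arrow: two sorts, star and box.
data SLC = SStar | SBox deriving (Eq, Show)

lambdaArrow :: Spec SLC
lambdaArrow = Spec
  { axiom = \s     -> case s of { SStar -> Just SBox; SBox -> Nothing }
  , rule  = \s1 s2 -> case (s1, s2) of { (SStar, SStar) -> Just SStar; _ -> Nothing }
  }

-- I_omega: sorts are star_i for a natural number i.
newtype IOmega = Star Natural deriving (Eq, Show)

iOmega :: Spec IOmega
iOmega = Spec
  { axiom = \(Star i)          -> Just (Star (i + 1))
  , rule  = \(Star i) (Star j) -> Just (Star (max i j))
  }
```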

Barendregt gives declarative typing judgements for the PTS. Here I omit writing the context \Gamma in every rule, and only show how contexts are extended. The reversed turnstile \dashv is used for context lookup.

\begin{aligned} &\prftree[r]{axiom} {c : s \in \mathcal{A}} {c : s} \\ &\prftree[r]{start} {A : s} {\dashv x : A} {\vdash x : A} \\ &\prftree[r]{weaken} {A : B} {C : s} {x : C \vdash A : B} \\ &\prftree[r]{product} {A : s_1} {x : A \vdash B : s_2} {(s_1,s_2,s_3) \in \mathcal{R}} {\forall (x:A).B : s_3} \\ &\prftree[r]{application} {f : \forall (x:A).B} {a : A} {f\; a : B[x \mapsto a]} \\ &\prftree[r]{abstraction} {x : A \vdash b : B} {\forall (x : A). B : s} {\lambda (x : A). b : \forall (x : A). B} \\ &\prftree[r]{conversion} {A : B} {B' : s} {B \mathrel{=_\beta} B'} {A : B'} \end{aligned}
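As a small worked example in I_\omega (my addition), consider the type of the polymorphic identity function, \forall (A : \star_0). \forall (x : A). A . By start we have A : \star_0 in the context A : \star_0 , and weakening with x : A preserves it; the product rule with (\star_0,\star_0,\star_0) \in \mathcal{R} then gives A : \star_0 \vdash \forall (x : A). A : \star_0 . Combining that with the axiom \star_0 : \star_1 and the rule (\star_1,\star_0,\star_1) \in \mathcal{R} , another use of product gives

\forall (A : \star_0). \forall (x : A). A : \star_1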

#Bidirectional Systems

Bidirectional systems have two syntactical groups, which McBride calls terms and eliminations. I choose to call them check terms and synth terms, as I find the name elimination somewhat misleading.

There are really three groups of forms to place: introduction and elimination forms, but also type formation forms, because types appear in terms.

One valid approach is to divide introduction, elimination and type formers into check and synth terms more or less arbitrarily, trying to minimize the need for type annotations.

Example: Booleans. We know that \mathsf{true} : \mathsf{Bool} and \mathsf{false} : \mathsf{Bool} , thus the introduction forms can be synth terms. In Boolean elimination (the if expression) the scrutinee has to be a Boolean, so we can check it against \mathsf{Bool} . As we don't know anything about the type of the branches, we can avoid a type annotation for the result type by asking it to be provided, i.e. by making the whole if expression checkable.

\begin{aligned} &\prftree[r]{$\mathsf{Bool}$-Intro$_1$} {} {\mathsf{true} \in \mathsf{Bool}} \qquad \prftree[r]{$\mathsf{Bool}$-Intro$_2$} {} {\mathsf{false} \in \mathsf{Bool}} \\ &\prftree[r]{$\mathsf{Bool}$-Elim} {\mathsf{Bool} \ni b} {C \ni t_1} {C \ni t_2} {C \ni \mathbf{if}\;b\;\mathbf{then}\;t_1\;\mathbf{else}\;t_2} \end{aligned}

Here I use Conor's notation, which makes the type judgments "executable" in clockwise order. A \ni t is read as "check that t has type A ", and e \in A as "infer (or synthesize) the type of e , which will be A ".

These are the rules you will find in David Christiansen's tutorial.

As said, this is a valid design strategy.

Another approach is to take normal forms, which are divided into neutral and normal terms, and add a double agent, the cut, as a type-annotation construct on normal forms.

Frank Pfenning in his lectures talks about natural deduction and verifications (and sequent calculus and cut elimination). I haven't been in his class, but I have watched his and Brigitte Pientka's OPLSS lectures on Proof Theory. Atomic propositions are allowed to meet

\prftree[r]{$\downarrow\uparrow$} {P\downarrow} {P\uparrow}

which is a restricted version of the conversion rule we will have. (Restricting it to atomic propositions means that you need to \eta -expand everything in normal forms.) The opposite rule

\prftree[r]{$\uparrow\downarrow$} {P\uparrow} {P\downarrow}

is analogous to the cut rule in sequent calculus, i.e. it shows where the derivation can be simplified. This rule is admissible.

We can design bidirectional systems so that the cut rule serves a second purpose as the type-annotation rule. This way, we know precisely where we need to perform reductions to normalize terms. As far as I understand, this is the design principle Conor McBride advocates.

As a consequence, all introduction forms have to be checkable terms. This is natural for function abstraction (which is the introduction form for function types), but feels less natural for "data". With these design principles, the rules for Booleans are:

\begin{aligned} &\prftree[r]{$\mathsf{Bool}$-Intro$_1$} {} {\mathsf{Bool} \ni \mathsf{true}} \qquad \prftree[r]{$\mathsf{Bool}$-Intro$_2$} {} {\mathsf{Bool} \ni \mathsf{false}} \\ &\prftree[r]{$\mathsf{Bool}$-Elim} {b \in \mathsf{Bool}} {C \ni t_1} {C \ni t_2} {C \ni \mathbf{if}\;b\;\mathbf{then}\;t_1\;\mathbf{else}\;t_2} \end{aligned}

Now we cannot write

\mathbf{if}\;\mathsf{true}\;\mathbf{then}\;t_1\;\mathbf{else}\;t_2

as \mathsf{true} is a check term, but the scrutinee of if has to be a synth term. If we want to write such a redundant program, it has to be written as

\mathbf{if}\;\mathsf{true} : \mathsf{Bool}\;\mathbf{then}\;t_1\;\mathbf{else}\;t_2

Notice that, as there is a type annotation, we know the expression can be reduced:

\begin{aligned} \mathbf{if}\;\mathsf{true} : \mathsf{Bool}\;\mathbf{then}\;t_1\;\mathbf{else}\;t_2 &\leadsto t_1 \\ \mathbf{if}\;\mathsf{false} : \mathsf{Bool}\;\mathbf{then}\;t_1\;\mathbf{else}\;t_2 &\leadsto t_2 \end{aligned}
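To make the directions concrete, here is a toy Haskell checker for just this Boolean fragment of the second variant (my own sketch; with \mathsf{Bool} as the only type the type language is degenerate, but the check/synth split is the point):

```haskell
-- Types: only Bool, to keep the sketch tiny.
data Ty  = TBool deriving (Eq, Show)

-- Check terms: introduction forms and if are checkable; synth terms embed.
data Chk = CTrue | CFalse | CIf Syn Chk Chk | CEmb Syn deriving Show

-- Synth terms: variables and type-annotated check terms (the cut).
data Syn = SVar String | SAnn Chk Ty deriving Show

type Ctx = [(String, Ty)]

-- "A ∋ t": check that t has type ty.
check :: Ctx -> Ty -> Chk -> Either String ()
check _   TBool CTrue  = Right ()
check _   TBool CFalse = Right ()
check ctx ty (CIf b t e) = do
  tb <- infer ctx b                 -- the scrutinee is inferred
  if tb == TBool then Right () else Left "scrutinee is not a Bool"
  check ctx ty t                    -- both branches are checked against ty
  check ctx ty e
check ctx ty (CEmb e) = do          -- conversion: infer, then compare
  ty' <- infer ctx e
  if ty == ty' then Right () else Left "type mismatch"

-- "e ∈ A": infer (synthesize) the type of e.
infer :: Ctx -> Syn -> Either String Ty
infer ctx (SVar x)    = maybe (Left ("unbound variable " ++ x)) Right (lookup x ctx)
infer ctx (SAnn t ty) = check ctx ty t >> Right ty
```

The redundant program above then corresponds to check [] TBool (CIf (SAnn CTrue TBool) t1 t2) for some branches t1 and t2 : the annotation turns the checkable \mathsf{true} into a synth term that can sit in the scrutinee position.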

Exercises

Does the if expression itself have to be a checkable term; couldn't we make it inferrable?

\prftree[r]{bad if} {b \in \mathsf{Bool}} {e_1 \in A} {e_2 \in B} {A =_\beta B} {\mathbf{if}\;b\;\mathbf{then}\;e_1\;\mathbf{else}\;e_2 \in A}

What problems will occur? Hint: Nest things.

This is an example of why I don't think calling inferrable terms eliminations is a good idea: some elimination forms have to be checkable.

Sum types, even Booleans, are however quite challenging.

#Bidirectional PTS

In this section I will describe a bidirectional pure type system. I have proved nothing about this system, but I have implemented it. It relies on the PTS specification being single sorted.

It is fun to instantiate the implementation to different specifications and see what is possible or not. A specific highlight is seeing Hurkens' paradox fail to type-check in one system, and fail to terminate in another.

Check and synth terms. We have the same five syntactic forms as previously, and two new ones to allow conversions between check and synth terms. Unlike McBride, I make type formation forms synth terms. The type in a type annotation also has to be a synth term. Notice that lambda terms don't have an argument type annotation.

\begin{array}{lclr} t & {}\coloneqq{} & \lambda x.t & \quad \text{abstraction} \\ & {}\mid{} & \underline{e} & \quad \text {synth} \\ f,e,A,B & {}\coloneqq{} & x & \qquad \text{variable} \\ & {}\mid{}& s & \qquad \text{sort} \\ & {}\mid{} & f\; t & \qquad \text{application} \\ & {}\mid{} & \forall (x:A).B & \quad \text{dependent function space} \\ & {}\mid{} & t : A & \quad \text{type annotation} \end{array}
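As a data type, this could look like the following Haskell sketch (my own, with hypothetical names, parameterized over the sort type s):

```haskell
-- Check terms: only abstraction and embedded synth terms.
data Chk s
  = Lam String (Chk s)           -- \x. t            abstraction (no domain annotation)
  | Emb (Syn s)                  -- underlined e     embedded synth term
  deriving Show

-- Synth terms: everything else, including the type formers.
data Syn s
  = Var String                   -- x                variable
  | Sort s                       -- s                sort
  | App (Syn s) (Chk s)          -- f t              application (argument is checked)
  | Pi String (Syn s) (Syn s)    -- forall (x:A).B   dependent function space
  | Ann (Chk s) (Syn s)          -- t : A            type annotation
  deriving Show
```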

The typing rules are syntax directed: we have one rule for each syntactic form.

\begin{aligned} &\prftree[r]{abstraction} {x : A \vdash B \ni b} {\forall (x : A). B \ni \lambda x. b} \\ &\prftree[r]{conversion} {e \in B} {A =_\beta B} {A \ni \underline{e}} \\ &\prftree[r]{var} {\dashv x : A} {\vdash x \in A} \\ &\prftree[r]{axiom} {\mathcal{A}(s_1) = s_2} {s_1 \in s_2} \\ &\prftree[r]{product} {A \in s_1} {x : A \vdash B \in s_2} {\mathcal{R}(s_1,s_2) = s_3} {\forall (x:A).B \in s_3} \\ &\prftree[r]{application} {f \in \forall (x:A).B} {A \ni a} {f\; a \in B[x \mapsto a]} \\ &\prftree[r]{annotation} {A \in s} {A \ni t} {t : A \in A} \end{aligned}

The rules do not show the evaluation steps. We could write the rules in a more explicit way

\begin{aligned} &\prftree[r]{abstraction} {C \leadsto \forall (x : A). B} {x : A \vdash B \ni b} {C \ni \lambda x. b} \\ &\prftree[r]{application} {f \in C} {C \leadsto \forall (x:A).B} {A \ni a} {f\; a \in B[x \mapsto a]} \end{aligned}

but I find it nicer to avoid these extra steps.

In these rules we rely on \mathcal{A} and \mathcal{R} being (partial) functions; otherwise the type-checking algorithm won't work.

The final missing bit is to specify the \leadsto process. As already pointed out, one needs to look for type annotations and try to remove them.

If we embed a synth term (forgetting the type it would synthesize) and then annotate it, both steps can be dropped. Also, a checkable term which is annotated but then embedded back into a checkable position simplifies to the term itself:

\underline{e} : A \leadsto e \qquad \underline{t : A} \leadsto t

Function beta-redexes are more complicated than what you are probably used to:

(\lambda x. t_1 : \forall (x : A). B)\; t_2 \leadsto t_1[x \mapsto t_2 : A] : B[x \mapsto t_2 : A]

As function application is an inferrable, synth term, it has to reduce to a synth term, thus we annotate the result. In the substitutions we cannot substitute just t_2 , as variables are also synth terms, so we substitute the annotated term t_2 : A . But this is all a good thing: in a function beta redex we replace a big redex at a big type with smaller redexes at smaller types. Type annotations tell us where the redexes are.
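For a concrete example (my addition), assume A does not mention x and apply an annotated identity function to a term t :

\begin{aligned} (\lambda x. \underline{x} : \forall (x : A). A)\; t &\leadsto \underline{x}[x \mapsto t : A] : A[x \mapsto t : A] \\ &= \underline{t : A} : A \\ &\leadsto t : A \end{aligned}

The last step uses the \underline{e} : A \leadsto e rule with e = t : A .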

You should now be able to see how the Lambda Pi paper crucially relies on the \star : \star axiom of its system, and how relatively easy it is to "fix" that by changing the Star constructor to Star Natural to implement the I_\omega system, for example.

#Internalizing Axiom and Rule, universe polymorphism

In systems where \mathcal{A} and \mathcal{R} are (non-partial, total) functions, we can try to internalize them.

Internalizing means introducing them into the system itself, so they can be used from the inside. This way we can add universe polymorphism to the system. That is (almost) possible in I_\omega but isn't in \lambda^\to (not that it makes any sense to do so in the latter).

Agda does that: there are lsuc and _\sqcup_ functions on Levels.

We can then write a universe polymorphic identity function

\begin{aligned} \mathsf{id} &: \forall (\ell : \mathsf{Level}). \forall (A : \star_\ell). \forall (x : A). A \\ \mathsf{id} & = \lambda \ell. \lambda A. \lambda x. x \end{aligned}

If we want to type-check the type of that term, we need to add an axiom \mathsf{Level} : \star_0 to \mathcal{A} (it is so in Agda, where Level : Set); then we'll need to synthesize the following:

\ell : \mathsf{Level} \vdash \forall (A : \star_\ell). \forall (x : A). A \in \mathord{?}

That matches the product rule again, and for its first premise we need to infer a sort for \star_\ell :

\ell : \mathsf{Level} \vdash \star_\ell \in \mathord{?}

We can modify the axiom rule to match on this, and then we would need to make \mathcal{A} compute symbolically: \mathcal{A}(\star_\ell) = \star_{\mathsf{lsuc}\;\ell} .
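A small Haskell sketch of what computing symbolically could mean (my own, made-up names), continuing the partial-function representation from above:

```haskell
-- Levels may now contain variables, so they are symbolic expressions.
data Level = LZero | LSuc Level | LVar String deriving (Eq, Show)

-- Sorts are star_l for a (possibly symbolic) level l.
newtype Sort = Star Level deriving (Eq, Show)

-- The axiom no longer matches on concrete numbers; it wraps the level in lsuc.
axiom :: Sort -> Maybe Sort
axiom (Star l) = Just (Star (LSuc l))
```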

Then we'll need to compute \mathcal{R}(\star_0, \star_{\mathsf{lsuc}\;\ell}) . What could it be?

Exercises

  1. Check what Agda returns for the type of the universe polymorphic identity function. Hint: C-c C-d infers (deduces) the type of a term. Types are terms.
  2. Based on the previous answer, how do you think I_\omega could be modified to support universe polymorphism? (Agda's approach is described at https://agda.readthedocs.io/en/latest/language/sort-system.html#agda-s-sort-system.)

Another possible type for the universe polymorphic identity function would be

\begin{aligned} \mathsf{id} &: \forall (\mathcal{U} : \mathord{?}). \forall (A : \mathcal{U}). \forall (x : A). A \\ \mathsf{id} & = \lambda \mathcal{U}. \lambda A. \lambda x. x \end{aligned}

If you have done the exercises, you know what should be in the place of the question mark \mathord{?} .

#Cumulative universes

In the previous section we assumed that the system is single sorted. But what if it isn't? Instead of universe polymorphism we might want to have a system with cumulative universes.

A rule we might want to have is

\prftree {A : \star_i} {i \prec j} {A : \star_j}

for some well-founded ordering \prec . If we do so, we are required to make all type formation terms checkable, as they no longer have a unique sort as their type.

\begin{aligned} &\prftree[r]{sort} {i \prec j} {\star_j \ni \star_i} \\ &\prftree[r]{product} {\star_i \ni A} {x : A \vdash \star_i \ni B} {\star_i \ni \forall (x:A).B} \end{aligned}

Because type formers are now checkable, the type annotation rule also has to change.

\begin{aligned} &\prftree[r]{annotation} {\star_i \ni A} {A \ni t} {t : A \in A} \end{aligned}

and I don't exactly understand how this could work, as the level is not specified anywhere. There is some missing bit I cannot figure out.

My best understanding is that Coq (and Idris?), for example, collects i \prec j constraints which occur during the type-checking process, and then tries to solve them. Yet I don't know how (and when) that work is done. When \prec is < , figuring out whether a solution exists is not hard: essentially, you need to check that there are no loops in the constraint graph, and a topological ordering gives you both that check and a way to assign levels.
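Here is a rough sketch of what that solving step could look like (entirely my guess, not how Coq or Idris actually implement it): collect the strict constraints, reject cycles, and assign each level its longest-path depth.

```haskell
import qualified Data.Map as Map   -- lazy in values, which the knot below relies on
import           Data.Maybe (fromMaybe)

type Var        = String
type Constraint = (Var, Var)       -- (i, j) means i < j

-- Nothing if the constraints are unsatisfiable (the graph has a cycle),
-- otherwise an assignment of concrete levels satisfying all constraints.
solve :: [Constraint] -> Maybe (Map.Map Var Int)
solve cs
  | hasCycle  = Nothing
  | otherwise = Just levels
  where
    vars      = map fst cs ++ map snd cs
    preds v   = [i | (i, j) <- cs, j == v]
    succs v   = [j | (i, j) <- cs, i == v]

    -- Lazy knot: a variable sits one level above its highest predecessor.
    levels    = Map.fromList [(v, levelOf v) | v <- vars]
    levelOf v = case preds v of
      [] -> 0
      ps -> 1 + maximum [fromMaybe 0 (Map.lookup p levels) | p <- ps]

    -- Naive depth-first search for a cycle along the < edges.
    hasCycle  = any (go []) vars
    go stack v
      | v `elem` stack = True
      | otherwise      = any (go (v : stack)) (succs v)
```

For example, solve [("i","j"),("j","k")] assigns i \mapsto 0 , j \mapsto 1 , k \mapsto 2 , while solve [("i","j"),("j","i")] returns Nothing.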

#Challenges with sums

Before I conclude, let me briefly mention a problem with sums.

Consider a type

\forall (b : \mathsf{Bool}). \forall (A : \mathbf{if}\;b\;\mathbf{then}\;B\;\mathbf{else}\;C). \forall (x : A). A

If types are checkable and Boolean elimination is checkable, or if types are inferrable and Boolean elimination is inferrable, the above type type-checks.

If types are checkable and eliminations are inferrable, like in Conor's systems, we need to add a conversion (an embedding)

\forall (b : \mathsf{Bool}). \forall (A : \underline{\mathbf{if}\;b\;\mathbf{then}\;B\;\mathbf{else}\;C}). \forall (x : A). A

to make the term type-check.

In the last case, when types are inferrable but Boolean elimination is checkable, we have to annotate the type of the if expression.

\forall (b : \mathsf{Bool}). \forall (A : (\mathbf{if}\;b\;\mathbf{then}\;B\;\mathbf{else}\;C) : \star). \forall (x : A). A

Here we have an annotation that cannot be removed by reduction. This is problematic, because annotations were supposed to mark the redexes.

For comparison, consider nested if expressions

\mathbf{if}\; (\mathbf{if}\;b\;\mathbf{then}\;\mathsf{false}\;\mathbf{else}\;\mathsf{true})\; \mathbf{then}\;x\;\mathbf{else}\;y

If Boolean elimination is inferrable, then this is a normal form. If it isn't, we have to add a type annotation

\mathbf{if}\; (\mathbf{if}\;b\;\mathbf{then}\;\mathsf{false}\;\mathbf{else}\;\mathsf{true} : \mathsf{Bool})\; \mathbf{then}\;x\;\mathbf{else}\;y

And this can be reduced with a case-of-case transformation to

\mathbf{if}\;b\; \mathbf{then}\;y\;\mathbf{else}\;x

Similarly, we can "reduce" the type example to

\forall (b : \mathsf{Bool}). \mathbf{if}\; b \;\mathbf{then}\; \forall (A : B). \forall (x : A). A : \star \;\mathbf{else}\; \forall (A : C). \forall (x : A). A : \star

From this point of view, we could try to check the B in \forall (x : A).B and still have a checkable Boolean elimination:

\prftree[r]{product} {A \in s_1} {(s_1,s_2,s_3) \in \mathcal{R}} {x : A \vdash s_2 \ni B} {s_3 \ni \forall (x:A).B}

In a cumulative universes system, we could probably check with s_2 = \max(s_1,s_3) . This variant is new to me, and I haven't tried to implement it.

We will run into a similar problem if we add dependent sums, i.e. \sum types. (I use \forall instead of \prod , and will therefore use \exists instead of \sum .) If we use projections as eliminators, we avoid the problem, as these eliminations can (should) be inferrable:

\prftree[r]{fst} {p \in \exists (x : A). B} {\mathsf{fst}\;p \in A} \qquad \prftree[r]{snd} {p \in \exists (x : A). B} {\mathsf{snd}\;p \in B[x \mapsto \mathsf{fst}\;p]}

But if we add a match (or split or case) elimination rule, we get the same problem as with Booleans.

\prftree[r]{split} {p \in \exists (x : A). B} {x : A, y : B \vdash C \ni c} {C \ni \mathbf{split}\;p\;\mathbf{as}\; (x,y) \mapsto c}

#Conclusion

I see the niceties of cumulative universes (in Coq you just write Type), especially compared to the ugliness of the universe polymorphism details in Agda (see the link in the exercises).

I think that the bidirectional PTS I presented, with inferrable type formers, is a simpler system and a good basis for considering extensions. However, there are problems with finite sets (like Booleans) and dependent sums.

