Inferring recursive types

2022 SEP 09 TYPES

A standard Hindley-Milner-like type checker typically has a function

unify: Type × Type -> Maybe Subst

which attempts to find a substitution of unification variables such that applying it to both type produces the same type – effectively, it is a kind of type equivalence test.

Also typically, we have the so-called “occurs check” when unifying a unification variable with some other type:

let unify = function
  (* ... *)
  | (Var v, other) | (other, Var v) ->
      if occurs_in v other then fail "occurs check"
      else with_var v other
  (* ... *)

The purpose of this check is to prevent infinite loops in the typer. For instance, a program like this

let f () = f

would give us a constraints like

'a = unit -> 'a

And although the unify function handles this gracefully (by creating a substitution { 'a ==> unit -> 'a }), we'd run into trouble if we ever want to substitute out the type variables in the inferred types in the program:

let f : unit -> 'a = λ(). f
let f : unit -> unit -> 'a = λ(). f
let f : unit -> unit -> unit > 'a = λ(). f
-- etc.

What if, instead of erroring on an occurs check, we replaced the “faulty” type with a μ-type?

A tiny introduction to μ-types

μ-types (often spelt mu-types) are a way of modelling recursive types. A μ-type looks like

$$ \mu \alpha. T $$

Where the $\mu$ introduces a μ-type with the type variable $\alpha$ (similar to how $\lambda \alpha. x$ is used to denote a lambda parameterized by $\alpha$ with $x$ as the body).

Within the μ-type (inside the $T$ above), the variable $\alpha$ refers to the entire μ-type itself. For instance, in

$$ \mu \alpha. 1+\alpha $$

the $\alpha$ refers to $\mu\alpha. 1+\alpha$. We can unroll such a type by removing the $\mu$ and replacing the variable with the type we are unrolling. Unrolling the above once gives

$$ (1 + \alpha)[\alpha\mapsto\mu\alpha.1+\alpha] = 1 + (\mu\alpha.1+\alpha) $$

One useful way of thinking of μ-types is to replace the $\mu$ form with a recursive type definition like so:

type T = 1 + T in T

Note also that only the part “enclosed” by the μ is recursive; unrolling

$$ \mathrm{Int} \times \mu\alpha.1+\alpha $$

gives

$$ \mathrm{Int} \times (1+(\mu\alpha.1+\alpha)) $$

and not

$$ \mathrm{Int} \times (1 + (\mathrm{Int} \times \mu\alpha.1+\alpha)) $$

This is actually quite a simple modification to our unifier; instead of failing at an occurs check, we just generate a μ-type from the unification variable and the type:

let make_mu v ty =
  let rec inner name = function
    | Var w when v = w -> name
    | Var w -> Var w
    | Fun (t, u) -> Fun (inner name t, inner name u)
    (* ... *) in
  let name = fresh_name ()  in
  Mu (name, inner name ty)

let rec unify = function
  (* ... *)
  | (Var v, other) | (other, Var v) ->
      with_var v
        (if occurs_in v other then make_mu v other
         else other)
  (* ... *)

Naturally, for this to work, we need to introduce μ-types into the the type language —

type ty =
  | Var of var
  | Fun of ty * ty
  | Mu of name * ty
  | Name of name
  (* ... *)

— and update the unifier so it handles them. A fairly straightforward way is to treat them as equirecursive types and just unroll them once when attempting to unify. So to solve

'a -> 'b = μa. unit -> a

we just unroll the right hand side once and use our normal techniques

'a -> 'b = unit -> (μa. unit -> a)
'a = unit
'b = μa. unit -> a

We first need a function to unroll types

let rec unroll name = function
  | Name m when m = name -> ty
  | Name m -> Name m
  (* ... *)

Which we can apply “as a last resort” in our unifier.

let rec unify = function
  (* ... *)
  | (Mu (n, t), u) | (u, Mu (n, t)) -> unify (unroll n t, u)

Except, not quite. This change lets the unifier loop infinitely! An example program which could trigger it might be something like

let f () = f in
let g () = g in
if something then f else g

Since this requires that f and g have equal types, we may end up with a constraint

μa. unit -> a = μb. unit -> b

Which would just unroll both sides indefinitely. Instead, when unifying two recursive types, we don't want to unroll them but just check if they are alpha-equivalent (that is, their only difference is in the name of the parameter).

A simple way of doing this is to pass a list of names we assume are equal into the unification function, and unify two names if they are either identical or assumed identical. Then, when unifying two recursive types, we assume their parameters are identical and unify their bodies without doing any unrolling.¹

let unify t u =
  let rec inner assumed = function
    | (Name n, Name m) when n = m -> return ()
    | (Name n, Name m) ->
        if eql assumed n m then return ()
        else fail
    (* ... *)
    | (Mu (n, t), Mu (m, u)) ->
        inner ((n, m) :: assumed) (t, u)
    (* ... *) in
    inner [] (t, u)

I haven't delved deeply into Huet's algorithm for higher-order unification, but from what I can skim from here, this seems like it is similar to how that algorithm unifies lambda terms.

In general, this unification with μ-types seems similar to higher-order unification with μs in place of λs and the “application” being unrolling. If this is the case, then that may suggest that unification with μ-types is semi-decidable.