Kurbis

Tech blog

Parametric polymorphism in ML is broken, especially in what arithmetic operators are concerned. Let's explore the meaning of parametric polymorphism through different points of view and see how these are violated in ML.

Parametric polymorphism means any type instance

We say a function is polymorphic when we have a single piece of code, the function's body, that can be applied uniformly to values of different types. This means that the sematics of the function does not depend on the particular type instance that is picked for its type variables. For example, when we see a function with type

f : 'a -> 'a

we know that f has a single body (a single piece of code) and that this code does not depend on the particular instances that are picked for the type variable a. Therefore, a can be instantiated with any type, for example, int or double.

Now, this is not exactly true for the case of arithmetic operators, say +, is it? In fact, the operator + can only be instantiated with int or double, as all the other instances are illegal.

Parametric polymorphism means a single piece of code

Let's now sidestep from this problem and focus on the property of a single piece of code. As mentioned before, parametric polymorphism means that we have a single piece of code that can be instantiated with different types. Even though + is a primitive, let's think for a second how we would implement it. Let's imagine there are two more primitive functions, namely, plusInt and plusDouble with the types

plusInt : int -> int -> int
plusDouble : double -> double -> double

If we were to use these functions as building blocks for our polymorphic + operator, then we would not have a single piece of code for the operator's body. Instead, we would have two pieces of code, as shown in the following ML/Haskell-like example:

(+) : 'a -> 'a -> 'a

(+) : int -> int -> int
(+) x y = plusInt x y

(+) : double -> double -> double
(+) x y = plusDouble x y

This example is not ML, but I think it's a nice transition between ML and Haskell, as we can think of the two previous definitions as instances for the polymorphic operator +. In any case, this is not parametric because we do not have a single piece of code. This is in fact ad-hoc polymorphism, which in Haskell is achieved by means of typeclasses.

Let's try a different approach and try to build a single piece of code. We are going to see that, once again, it is not going to be parametric.

Parametric polymorphism means types are opaque

As mentioned before, parametric polymorphism means that a given code does not depend on the particular instances of its type variables. This means that whatever a function does to a polymorphically typed argument cannot depend on its type. So what functions can be applied to a polymorphically typed argument? The answer is easy: only polymorphic functions (including itself). This means that a function with type

f : 'a -> 'a
f x = ...

can either diverge or return its argument unchanged. But it cannot do anything else to argument x because everything other than keeping x for a while and returning it will have to depend on the type instance of x, thus making this function no longer polymorphic.

Coming back to the + operator. We cannot add two numbers if we do not know that they are numbers. But, for a moment, let's assume we can. The addition operator has the following type

(+) : 'a -> 'a -> 'a
(+) x y = ...

However, 1 + 2 returns 3. If we were to follow the type of + we would expect that + either diverges or returns x or returns y. Returning 3 is neither, therefore, + is not behaving polymorphically.

Tags: type theory