Parametric polymorphism in ML is broken, especially in what arithmetic operators are concerned. Let's explore the meaning of parametric polymorphism through different points of view and see how these are violated in ML.
Parametric polymorphism means any type instance
We say a function is polymorphic when we have a single piece of code, the function's body, that can be applied uniformly to values of different types. This means that the sematics of the function does not depend on the particular type instance that is picked for its type variables. For example, when we see a function with type
f : 'a -> 'a
we know that f has a single body (a single piece of code) and that this code
does not depend on the particular instances that are picked for the type
variable a. Therefore, a can be instantiated with any type, for example,
int or double.
Now, this is not exactly true for the case of arithmetic operators, say +, is
it? In fact, the operator + can only be instantiated with int or double,
as all the other instances are illegal.
Parametric polymorphism means a single piece of code
Let's now sidestep from this problem and focus on the property of a single piece
of code. As mentioned before, parametric polymorphism means that we have a
single piece of code that can be instantiated with different types. Even though
+ is a primitive, let's think for a second how we would implement it. Let's
imagine there are two more primitive functions, namely, plusInt and
plusDouble with the types
plusInt : int -> int -> int
plusDouble : double -> double -> double
If we were to use these functions as building blocks for our polymorphic +
operator, then we would not have a single piece of code for the operator's
body. Instead, we would have two pieces of code, as shown in the following
ML/Haskell-like example:
(+) : 'a -> 'a -> 'a
(+) : int -> int -> int
(+) x y = plusInt x y
(+) : double -> double -> double
(+) x y = plusDouble x y
This example is not ML, but I think it's a nice transition between ML and
Haskell, as we can think of the two previous definitions as instances for the
polymorphic operator +. In any case, this is not parametric because we do not
have a single piece of code. This is in fact ad-hoc polymorphism, which in
Haskell is achieved by means of typeclasses.
Let's try a different approach and try to build a single piece of code. We are going to see that, once again, it is not going to be parametric.
Parametric polymorphism means types are opaque
As mentioned before, parametric polymorphism means that a given code does not depend on the particular instances of its type variables. This means that whatever a function does to a polymorphically typed argument cannot depend on its type. So what functions can be applied to a polymorphically typed argument? The answer is easy: only polymorphic functions (including itself). This means that a function with type
f : 'a -> 'a
f x = ...
can either diverge or return its argument unchanged. But it cannot do anything
else to argument x because everything other than keeping x for a while and
returning it will have to depend on the type instance of x, thus making this
function no longer polymorphic.
Coming back to the + operator. We cannot add two numbers if we do not know
that they are numbers. But, for a moment, let's assume we can. The addition
operator has the following type
(+) : 'a -> 'a -> 'a
(+) x y = ...
However, 1 + 2 returns 3. If we were to follow the type of + we would
expect that + either diverges or returns x or returns y. Returning 3 is
neither, therefore, + is not behaving polymorphically.
Tags: type theory