In
Type Theory, an
aggregate type that
consists of
two subtypes. An
element of the
sum type consists of an element of
one of the two types, plus a
wrapper that tells us what sum type it is an element of. We
express the sum type as
t1+t2 and an element thereof as either
inlt2 x or
inrt1 y, where
x:t1 and
y:t2. Note that
inl abbreviates "
left injection" and
inr "
right injection".
This begs an example.
Consider the sum type int+real. We can have inlreal3 and inrint2.7, both of type int+real. Note that since an element of a sum type only contains an element of one of the types, in order to preserve type safety, we have to indicate the other type in the sum type. That is, we want to distinguish between inlbool3 and inlreal3; the former is of type int+bool and the latter int+real. Likewise, we want to distinguish between inlbool3 and inrbool3; the former is of type int+bool and the latter bool+int.
Note that the subtypes of a sum type do not have to be simple types. They can just as easily be product types, arrow types, or even other sum types. This last possibility lets us create a sum type that in effect has more than two subtypes. That is, we can nest: int+(real+(bool*bool)).
Sum types come into use, for example, when we want a function to return one of two types. For example, if we wanted a function to return either a bool or a pair of bools (a product type), we can have it return the type bool+(bool*bool).
A value of a type that is a sum type is (almost) useless by itself, what we are interested in is what's inside (the subtypes). In order to get to these values, we can case on which "side" of the injection the value is:
case s : int+bool
of inlbool n => n+1
| inrint b => if b then 4 else 7
(Note that both
branches of the case have to
evaluate to the
same type.)
When I said that the sum type is useless by itself, that was a bit of a white lie. Which side of the injection we're on does carry useful information, in fact exactly one bit of information. In the purest example of this, when we don't care what the subtypes are, what subtype should we use? That's right, the unit type, 1. Thus 1+1 is a sum type that is logically equivalent to bool. If we're on the left (inl1()), that's the same as, say, false, and if we're on the right (inr1()), that represents true. Casing on a value of type 1+1 is thus equivalent to an if-then-else statement:
if b then 4 else 7
and
case b : bool
of true => 4
| false => 7
and
case s : 1+1
of inl1() => 4
| inr1() => 7
are all
logically equivalent.
Of course, with a little syntactic sugar, we can create wrapper names that are less annoying to type and a bit more descriptive than our friends inl and inr. The best example of this I can think of is the ML datatype:
datatype IntorBoolPair = Int of int | BoolPair of bool * bool
case s : IntorBoolPair
of Int(n) => n > 5
| BoolPair((b1, b2)) => b1 andalso b2
and
datatype Bool = False of unit | True of unit
or just
datatype Bool = False | True
Here, the wrappers act like functions from the subtypes to the sum types:
Int : int -> IntorBoolPair
These examples are of course a bit
contrived, and you may be wondering what
actual practical use these things have. One of the best applications of the sum type is the
option,
defined as
datatype 'a option = SOME of 'a | NONE
Where
'a (pronounced "
alpha") is
any type, kindof like a
wildcard type.
Details and
uses of the
option type will be explained fully in the
option node, because this one is
long enough as it is. It explains why
NULL (in
C) is a
Bad Idea(tm) and how the
option eliminates most kinds of
segfaults.
There's also the list:
datatype 'a List = Nil | Cons of 'a * ('a List)
We start with the empty list
Nil as the base case and add on elements with
Cons.
So
Nil
Cons(4,Cons(1, Nil))
Cons(Nil, Nil)
are all
Lists. The first is a
plain old 'a List, the second is an
int List and the last (a bit
tricky) is an
'a List List.
Lists are pretty useful, and our friend case forces us to handle the empty list gracefully.