A Month of Haskell, Day 8 - Heterogeneous Lists

Posted on May 15, 2017 by Chris Lumens in month-of-haskell.

I previously talked about type classes and class constraints. Today I’d like to talk about something occassionally useful you can do with these two concepts - make lists that can hold more than one type.

Haskell’s lists are homogenous - they can only hold one type of thing. If you try to make a list with multiple types in it, the compiler will yell at you, though not in a very helpful way:

$ ghci
ghci> [1, 2, "x", "y"]

<interactive>:1:2: error:
    • No instance for (Num [Char]) arising from the literal ‘1’
    • In the expression: 1
      In the expression: [1, 2, "x", "y"]
      In an equation for ‘it’: it = [1, 2, "x", ....]

Usually this isn’t really a problem. However, what happens if you need to do something that’s a little more object oriented? What if you have several different record types that all have different elements but support the same types of operations? As long as you can ensure that the elements will only ever be accessed through the defined operations, you ought to be able to do so - the types are technically different, but they are all used the same way.

In my radio logging software, I support the concept of radio concepts. There are a ton of these, but the basic object is the same: Make contacts (called “QSOs” in radio jargon) with as many people as you can in a given time, exchange some piece of information with them, and write it down. The details of each contact vary from contest to contest. Consider these simplified types representing individual contacts from a couple different contests:

-- The New England QSO Party.
data NEQP = NEQP { neqpFreq :: Double, neqpMode :: String, neqpDate :: String, neqpTime :: String,
                   neqpCall :: String, neqpTheirCall :: String, neqpExchange :: String,
                   neqpTheirExchange :: String }

-- ARRL VHF contests.
data VHF = VHF { vhfFreq :: Double, vhfMode :: String, vhfDate :: String, vhfTime :: String,
                 vhfCall :: String, vhfTheirCall :: String, vhfGrid :: String, vhfTheirGrid :: String }

A list of these won’t work for the same reason as a list of integers and strings:

$ ghci
ghci> let xxx = [ NEQP 14.2 "SSB" "20170101" "1010" "AA1AA" "N1NN" "AL" "AK",
                  VHF 144.2 "SSB" "20170110" "1513" "AA1AA" "N1NN" "FN42" "FN43" ]

<interactive>:7:73: error:
    • Couldn't match expected type ‘NEQP’ with actual type ‘VHF’
    • In the expression:
        VHF 144.2 "SSB" "20170110" "1513" "AA1AA" "N1NN" "FN42" "FN43"
      In the expression:
        [NEQP 14.2 "SSB" "20170101" "1010" "AA1AA" "N1NN" "AL" "AK",
         VHF 144.2 "SSB" "20170110" "1513" "AA1AA" "N1NN" "FN42" "FN43"]
      In an equation for ‘xxx’:
            = [NEQP 14.2 "SSB" "20170101" "1010" "AA1AA" "N1NN" "AL" "AK",
               VHF 144.2 "SSB" "20170110" "1513" "AA1AA" "N1NN" "FN42" "FN43"]

For writing truly generic code, it would be nice to define some functions that could take an instance of any of these record types and perform the same operation on them. Most simply, you can imagine wanting to grab the time and date, or the exchanges, or something else. You might also be able to imagine more complicated operations. In my logging program, I use the printf function for converting these records with special formatting.

So hopefully you are on board with wanting to be able to make lists of things that all act the same, even if they don’t look the same. The trick is that instead of making a list of a specific type, we are making a list of a specific type class.

We start by enabling the ominously named ExistentialQuantification language extension. This is required to use the forall keyword in type signatures so we can define a box that hides the real type of each record away.

{-# LANGUAGE ExistentialQuantification -#}

Next, we define a new type class. All those record types I defined earlier will be made instances of that type class. This means that all records will support the same operations. In this example I’m only giving a single function that takes any QSO and converts it to a formatted string. You could also add various accessors to get at time and date, and so on. They’re not interesting to write about, though.

class QSO c where
    format :: c -> String

Next up, we define a special new data type. This is the box that hides the underlying record types.

data QSOBox = forall a. QSO a => QSOBox a

This is really the only tricky part of this whole technique. It defines a new type named QSOBox with a single constructor, also named QSOBox. It then uses the existential type keyword forall to put a class restriction on what can be a QSOBox.

Recall that data constructors typically look like this:

data Box = Box Int

Here, we’re defining a new type named Box with a single constructor named Box, and it can only hold an Int. A constructor is really just a function that takes whatever’s on its right and makes a new value of whatever type it’s part of. In fact, asking ghci about that shows:

$ ghci
ghci> data Box = Box Int
ghci> :t Box
Box :: Int -> Box

So if you pass 47 to Box, you get back a Box 47.

Our QSOBox is doing the same thing, but more generically. Instead of an Int, we’re saying that anything can be passed to QSOBox as long as it’s an instance of the QSO type class. We’re promising the compiler that everything that could ever be held in a QSOBox will act the same, and the compiler will enforce this for us.

If we defined a different Box data type whose constructor could hold anything that’s a member of the Eq type class, what would the type of the constructor be? Here’s what ghci says:

$ ghci -XExistentialQuantification
ghci> data Box = forall a. Eq a => Box a
ghci> :t Box
Box :: Eq a => a -> Box

That was a bit of a digression, but hopefully it makes things a little more clear. There’s really very little work left to be done. Next, I like to make the box we just defined an instance of the type class we defined earlier. This gets the unboxing done for us so there’s less tedious code to write everywhere else. The functions in the instance are just pass throughs to the same functions on the underlying types.

instance QSO QSOBox where
    format (QSOBox q) = format q

Finally, I make each record type an instance of the QSOBox class and give a definition for the format function:

import Text.Printf(printf)

instance QSO NEQP where
    format q = printf "%f %s %s %.4s %-13s %-13s %.10s %.10s"
                      (neqpFreq q) (neqpMode q) (neqpDate q) (neqpTime q)
                      (neqpCall q) (neqpTheirCall q) (neqpExchange q)
                      (neqpTheirExchange q)

instance QSO VHF where
    format q = printf "%f %s %s %.4s %-17s %-6s %-17s %-6s"
                      (vhfFreq q) (vhfMode q) (vhfDate q) (vhfTime q)
                      (vhfCall q) (vhfGrid q) (vhfTheirCall q) (vhfTheirGrid q)

Now we can make a list holding records from both types, print each out, and most anything else we want:

$ ghci -XExistentialQuantification
ghci> let lst = [ QSOBox $ NEQP 14.2 "SSB" "20170101" "1010" "AA1AA" "N1NN" "AL" "AK",
                  QSOBox $ VHF 144.2 "SSB" "20170110" "1513" "AA1AA" "N1NN" "FN42" "FN43" ]
ghci> mapM_ (putStrLn . format) lst
14.2 SSB 20170101 1010 AA1AA         N1NN          AL AK
144.2 SSB 20170110 1513 AA1AA             FN42   N1NN              FN43 

If you were doing this for real, I would put each data type into its own module. I would then make sure to only export the data type itself and not its contents. Exporting the contents means that client code can see into the type and get around these guarantees we’ve put in place. You can do this by leaving off the double dots in the module header:

module NEQP(NEQP) where