A Month of Haskell, Day 2 - Enforcing things with types

Posted on May 2, 2017 by Chris Lumens in month-of-haskell.

One of my large Haskell projects is an amateur radio logging program. Part of this program takes data from various sources (a user interface, network queries, etc.) and stores it in a database. This data includes callsigns, state, mode (FM, SSB, CW, and so forth), and exchanges during contests. At various points, I want to compare new data to what’s already in the database.

For a long time, I just made sure to uppercase the data before doing any comparison. However, that is error prone - if I missed a spot, a comparison that should have succeeded would fail. Instead, I could use the type system to enforce that certain pieces of data are uppercased before any comparison can take place. That would rule out busted comparisons at compile time. This is exactly the sort of thing Haskell’s type system is great at doing.

My solution was to create a new string-like type and then use it throughout the program. The module that provides the type is not much code at all, so I’ll just start with it and then discuss various features. And here it is:

module UpperString(UpperString,
                   asText,
                   fromText,
                   getUpperString)
 where

import           Data.Char(toUpper)
import           Data.String(IsString(..))
import qualified Data.Text as T
import           Text.Printf(PrintfArg(..), formatString)

newtype UpperString = UpperString String
 deriving(Eq)

instance IsString UpperString where
    fromString x = UpperString $ map toUpper x

instance PrintfArg UpperString where
    formatArg = formatString . getUpperString

instance Show UpperString where
    show = getUpperString

asText :: UpperString -> T.Text
asText = T.pack . getUpperString

fromText :: T.Text -> UpperString
fromText = fromString . T.unpack

getUpperString :: UpperString -> String
getUpperString (UpperString s) = s

The main event is the newtype UpperString line. That defines a single constructor named UpperString that takes in a String and returns something of the UpperString type. I use newtype because it doesn’t have the runtime cost that data does, while still creating a completely new type rather than just an alias.
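To make the alias-versus-newtype distinction concrete, here is a minimal sketch. The names AliasUpper, Upper, mkUpper, sameAlias, and sameUpper are made up for illustration and are not part of the module above.

```haskell
import Data.Char (toUpper)

-- A type alias is only another name for String, so the compiler
-- accepts any String wherever an AliasUpper is expected.
type AliasUpper = String

sameAlias :: AliasUpper -> AliasUpper -> Bool
sameAlias a b = a == b

-- A newtype is a distinct type that shares String's runtime representation.
newtype Upper = Upper String
  deriving (Eq)

-- Hypothetical helper: normalize first, then wrap.
mkUpper :: String -> Upper
mkUpper = Upper . map toUpper

sameUpper :: Upper -> Upper -> Bool
sameUpper a b = a == b

-- This compiles, but it silently compares un-normalized data.
aliasResult :: Bool
aliasResult = sameAlias "k1abc" "K1ABC"

-- Writing `sameUpper "k1abc" "K1ABC"` would be rejected by the type
-- checker (without OverloadedStrings), so the normalization step
-- cannot be forgotten.
upperResult :: Bool
upperResult = sameUpper (mkUpper "k1abc") (mkUpper "K1ABC")
```

With the alias, forgetting to uppercase is a silent logic bug. With the newtype, it is a type error.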

At the top of the module where I list what it exports, there is something very subtle going on - I only export the type, not its constructor. This prevents any code that uses this module from getting at the implementation of UpperString. All it can ever see is the type and the functions I export, not what the type is made of.

Making an UpperString requires a little bit of Haskell trickery. Inside Data.String is a type class called IsString:

class IsString a where
    fromString :: String -> a

Any type that is an instance of that class must define a fromString function that converts a String into that type. That is exactly what I’ve done here. In a larger program, type inference would usually make the type annotation below unnecessary, but you can see that whatever is passed to fromString is automatically converted into an upper cased version of itself:

$ ghci
ghci> :m +UpperString
UpperString> :m +Data.String
UpperString Data.String> fromString "blah" :: UpperString
BLAH
UpperString Data.String>

Defining fromString also gets us one other very cool thing - if we enable the OverloadedStrings extension, a string literal anywhere in a Haskell source file is automatically converted to whatever IsString type is expected at that spot:

$ ghci -XOverloadedStrings
ghci> :m +UpperString
UpperString> let s = "blah" :: UpperString
UpperString> s
BLAH
UpperString> :t s
s :: UpperString

As you can see, there’s no way to have an UpperString contain anything besides upper cased text.
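In a source file, the same thing might look like the sketch below. The type and instance are repeated so the example stands alone. homeCall, isHome, and the callsign K1ABC are placeholders invented for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Char (toUpper)
import Data.String (IsString (..))

-- Stand-in for the UpperString module, repeated so this file compiles alone.
newtype UpperString = UpperString String
  deriving (Eq)

instance IsString UpperString where
  fromString = UpperString . map toUpper

getUpperString :: UpperString -> String
getUpperString (UpperString s) = s

-- The signature fixes the literal's type, so GHC routes it through
-- fromString and the stored value is upper case.
homeCall :: UpperString
homeCall = "k1abc"

-- A literal compared against an UpperString is converted the same way.
isHome :: UpperString -> Bool
isHome call = call == "K1abc"
```

The conversion only happens where the surrounding types call for an UpperString. A literal used where a plain String is expected is left alone.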

UpperString derives an instance of the Eq class, which means two values can be compared with the equals and not-equals operators.

$ ghci
ghci> :m +UpperString
UpperString> :m +Data.String
UpperString Data.String> let left = fromString "left" :: UpperString
UpperString Data.String> let right = fromString "right" :: UpperString
UpperString Data.String> left == right
False
UpperString Data.String> left /= right
True
UpperString Data.String> let left2 = fromString "Left" :: UpperString
UpperString Data.String> left == left2
True

As you can see, the case of the string being passed to fromString doesn’t matter, because whatever comes in as input gets turned into an upper case version of itself.
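Because Eq is available, list functions such as elem work too. Here is a small sketch of a duplicate check, roughly the kind of comparison described at the start of this post. The logged list and alreadyLogged are hypothetical and only stand in for a real database lookup.

```haskell
import Data.Char (toUpper)
import Data.String (IsString (..))

-- Repeated from the module above so the sketch compiles alone.
newtype UpperString = UpperString String
  deriving (Eq)

instance IsString UpperString where
  fromString = UpperString . map toUpper

-- Hypothetical log contents; each entry is normalized by fromString.
logged :: [UpperString]
logged = map fromString ["K1ABC", "w2xyz"]

-- True when the callsign is already in the log.
-- `elem` relies on the derived Eq instance.
alreadyLogged :: UpperString -> Bool
alreadyLogged call = call `elem` logged
```

However the callsign was originally typed, the comparison always sees the upper cased form.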

After that, there are a couple more type class instances. The PrintfArg one is simply so an UpperString can be passed to the printf function. And the Show one is so you can use the show function on an UpperString. Haskell knows how to derive Show instances automatically, but the result would look like UpperString "BLAH", which is not what I want. Other instances could be added, but I haven’t had a reason yet.
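Here is a short sketch of those two instances in use. logLine and its format string are invented for this example. The instances are written by pattern matching only to keep the sketch small.

```haskell
import Data.Char (toUpper)
import Data.String (IsString (..))
import Text.Printf (PrintfArg (..), formatString, printf)

-- Repeated from the module above so the sketch compiles alone.
newtype UpperString = UpperString String
  deriving (Eq)

instance IsString UpperString where
  fromString = UpperString . map toUpper

-- Format an UpperString exactly like the String it wraps.
instance PrintfArg UpperString where
  formatArg (UpperString s) = formatString s

-- Show only the contents, without the constructor name or quotes.
instance Show UpperString where
  show (UpperString s) = s

-- Hypothetical helper: one line of a log listing.
-- printf can return a String as well as run in IO.
logLine :: UpperString -> UpperString -> String
logLine call mode = printf "%s on %s" call mode
```

For example, logLine (fromString "k1abc") (fromString "fm") evaluates to "K1ABC on FM".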

asText and fromText are helper functions to convert to and from Data.Text. They’re not strictly necessary, but they sure do make things easier. getUpperString is the other really interesting thing here. You may have noticed there was a way to put a string into an UpperString, but no way to get the string back out. That’s what getUpperString does. Its implementation is also obvious and doesn’t need a lot of discussion.
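Here is a sketch of the Text helpers in a round trip, assuming the text package is available (the module above already depends on it). normalizeField is a made-up name for something like cleaning up a value that arrives from a user interface as Text.

```haskell
import           Data.Char (toUpper)
import           Data.String (IsString (..))
import qualified Data.Text as T

-- Repeated from the module above so the sketch compiles alone.
newtype UpperString = UpperString String
  deriving (Eq)

instance IsString UpperString where
  fromString = UpperString . map toUpper

getUpperString :: UpperString -> String
getUpperString (UpperString s) = s

asText :: UpperString -> T.Text
asText = T.pack . getUpperString

fromText :: T.Text -> UpperString
fromText = fromString . T.unpack

-- Hypothetical helper: Text in, upper cased Text out.
normalizeField :: T.Text -> T.Text
normalizeField = asText . fromText
```

Since fromText goes through fromString, Text input gets the same upper casing as String input, and converting an UpperString to Text and back gives an equal value.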

And that’s really all there is to it. If you say that a value has the type UpperString, the type system will enforce that. And any string you put into it will be forced into upper case. This means I’ve eliminated the possibility of ever comparing two strings that should be upper cased, but are not. There are all sorts of places you can do this sort of thing in Haskell, and this is one of the language’s strengths. Use the type system to catch problems at compile time and eliminate entire classes of bugs.