generics-eot
tutorial¶
This tutorial is meant to be read alongside with the haddock comments in Generics.Eot. Its source is a compiled haskell file, so we have to get some language pragmas and imports out of the way first:
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MonoLocalBinds #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE UndecidableInstances #-}
module Generics.Eot.Tutorial where
import Data.Char
import Data.List
import Data.Typeable
import Generics.Eot
generics-eot
allows roughly three different kinds of operations:
- Accessing meta information about ADTs (
datatype
for names,Proxy
andEot
for field types). Example: Generation of database schemas for ADTs. - Deconstructing values generically (
toEot
). Example: Serialization to a binary format. - Constructing values of an ADT generically (
fromEot
). Example: Deserialization from a binary format.
Sometimes only one of the three forms is used but often multiple have to be
combined. For example serialization to JSON usually requires both datatype
and toEot
.
1st Example: Meta Information Without Types: Field Names¶
This simple function extracts the names of all field selectors and returns them as a list:
namesOfFields :: HasEot a => Proxy a -> [String]
namesOfFields proxy =
nub $
concatMap (fieldNames . fields) $
constructors $ datatype proxy
where
fieldNames :: Fields -> [String]
fieldNames fields = case fields of
Selectors names -> names
_ -> []
And here’s proof that it works (using doctest):
data A = A1 {
foo :: String,
bar :: Int
}
| A2 {
bar :: Int,
baz :: Bool
}
deriving (Generic, Show)
-- $ >>> namesOfFields (Proxy :: Proxy A)
-- ["foo","bar","baz"]
The Generic
instance: Don’t forget!!!¶
To be able to use generic functions that are written with generics-eot
you
need to derive an instance for GHC.Generics.Generic
(using DeriveGeneric
)
for your ADTs. This will automatically give you an instance for HasEot
.
When the instance for GHC.Generics.Generic
is missing the type error messages
are unfortunately very confusing and unhelpful. They go something like this:
Couldn't match type ‘GHC.Generics.Rep WithoutGeneric’
with ‘GHC.Generics.D1 c f’
The type variables ‘c’, ‘f’ are ambiguous
In the expression: namesOfFields (Proxy :: Proxy WithoutGeneric)
So don’t forget: you need a Generic
instance.
Eot
: Isomorphic representations¶
Part of the type class HasEot
is the type-level function Eot
that maps ADTs
to isomorphic types. These isomorphic types are always a combination of
Either
s, tuples and the uninhabited type Void
. For example this type:
data B = B1 Int | B2 String Bool | B3
deriving (Generic)
would be mapped to:
Either (Int, ()) (Either (String, (Bool, ())) (Either () Void))
Tip: Here’s how you can execute the type-level function Eot
in ghci
:
-- $ >>> :kind! Eot B
-- Eot B :: *
-- = Either (Int, ()) (Either ([Char], (Bool, ())) (Either () Void))
For the exact rules of this mapping see here: Eot.
If we have an ADT a
then we can convert values of type a
to this isomorphic
representation Eot a
with toEot
and we can convert in the other direction
with fromEot
. Generic functions always operate on these isomorphic
representations and then convert from or to the real ADTs with fromEot
and
toEot
.
These generic isomorphic types are referred to as “eot” – short for
“Either
s of tuples”.
2nd Example: Deconstructing Values: Serialization¶
We start by writing a function that operates on the eot representations. The eot representations follow simple patterns and always look similar, but they don’t look exactly the same for different ADTs. For this reason we have to use a type class:
class EotSerialize eot where
eotSerialize :: Int -- ^ The number of the constructor being passed in
-> eot -- ^ The eot representation
-> [Int] -- ^ A simple serialization format
Now we need to write instances for the types that occur in eot types. Usually these are:
Either this next
:- If as eot value we get
Left this
it means that the original value was constructed with the constructor that corresponds tothis
. In this case we put the number of the constructor into the output and continue with serializing the fields of typethis
. - If we get
Right rest
it means that one of the following constructors was the one that the original value was built with. We continue by increasing the constructor counter and serializing the value of typerest
.
Note that this results in
EotSerialize
class constraints for boththis
andrest
. If we write the correct instances for all eot types these constraints should always be fulfilled.- If as eot value we get
instance (EotSerialize this, EotSerialize next) =>
EotSerialize (Either this next) where
eotSerialize n (Left fields) = n : eotSerialize n fields
eotSerialize n (Right next) = eotSerialize (succ n) next
Void
: We need this instance to make the compiler happy, but it’ll never be used. If you look at the type you can also see that: an argument of typeVoid
cannot be constructed. (Often the functionData.Void.absurd
comes in handy for implementing these cases.generics-eot
re-exports both the typeData.Void.Void
andData.Void.absurd
for convenience.)
instance EotSerialize Void where
eotSerialize _n void = absurd void
(x, xs)
: Right-nested 2-tuples are used to encode all the fields for one specific constructor. Sox
is the current field andxs
are the remaining fields. To serialize this we serializex
(usingserialize
) and also write the length of the resulting list into the output. This will allow deserialization.Note: We could use
EotSerialize
to serialize the fields. But that would be a bit untrue to the spirit, since the fields are not eot types. Apart from that we might want to encode a field of e.g. typeEither a b
differently than the eot typeEither a b
. So we use a very similar but distinct type class calledSerialize
.The value of type
xs
contains the remaining fields and will be encoded recursively witheotSerialize
:
instance (Serialize x, EotSerialize xs) => EotSerialize (x, xs) where
eotSerialize n (x, xs) =
let xInts = serialize x
in length xInts : xInts ++ eotSerialize n xs
()
: Finally we need an instance for the unit type that marks the end of the fields encoded in 2-tuples. Since()
doesn’t carry any information, we can encode it as the empty list:
instance EotSerialize () where
eotSerialize _ () = []
This is the class Serialize
:
class Serialize a where
serialize :: a -> [Int]
We give serialize
a default implementation, but please ignore that for now.
It’ll be explained later in the section about
DefaultSignatures:
default serialize :: (HasEot a, EotSerialize (Eot a)) => a -> [Int]
serialize = genericSerialize
Serialize
is used to serialize every field of the used ADTs, so we need
instances for all of them:
instance Serialize Int where
serialize i = [i]
instance Serialize String where
serialize = map ord
instance Serialize Bool where
serialize True = [1]
serialize False = [0]
instance Serialize () where
serialize () = []
To tie everything together we provide a function genericSerialize
that
converts a value of some ADT into an eot value using toEot
and then uses
eotSerialize
to convert that eot value into a list of Int
s.
genericSerialize :: (HasEot a, EotSerialize (Eot a)) => a -> [Int]
genericSerialize = eotSerialize 0 . toEot
And it works too:
-- $ >>> genericSerialize (A1 "foo" 42)
-- [0,3,102,111,111,1,42]
-- >>> genericSerialize (A2 23 True)
-- [1,1,23,1,1]
3rd Example: Constructing Values: Deserialization¶
Deserialization works very similarly. It differs in that the functions turn
lists of Int
s into eot values.
Here’s the EotDeserialize
class with instances for:
Either this next
Void
(x, xs)
()
class EotDeserialize eot where
eotDeserialize :: [Int] -> eot
instance (EotDeserialize this, EotDeserialize next) =>
EotDeserialize (Either this next) where
eotDeserialize (0 : r) = Left $ eotDeserialize r
eotDeserialize (n : r) = Right $ eotDeserialize (pred n : r)
eotDeserialize [] = error "invalid input"
instance EotDeserialize Void where
eotDeserialize _ = error "invalid input"
instance (Deserialize x, EotDeserialize xs) =>
EotDeserialize (x, xs) where
eotDeserialize (len : r) =
let (this, rest) = splitAt len r
in (deserialize this, eotDeserialize rest)
eotDeserialize [] = error "invalid input"
instance EotDeserialize () where
eotDeserialize [] = ()
eotDeserialize (_ : _) = error "invalid input"
And here’s the Deserialize
class plus all instances to deserialize the
fields:
class Deserialize a where
deserialize :: [Int] -> a
instance Deserialize Int where
deserialize [n] = n
deserialize _ = error "invalid input"
instance Deserialize String where
deserialize = map chr
instance Deserialize () where
deserialize [] = ()
deserialize (_ : _) = error "invalid input"
instance Deserialize Bool where
deserialize [0] = False
deserialize [1] = True
deserialize _ = error "invalid input"
And here’s genericDeserialize
to tie it together. It uses
eotDeserialize
to convert a list of Int
s into an eot value and then
fromEot
to construct a value of the wanted ADT.
genericDeserialize :: (HasEot a, EotDeserialize (Eot a)) => [Int] -> a
genericDeserialize = fromEot . eotDeserialize
Here you can see it in action:
-- $ >>> genericDeserialize [0,3,102,111,111,1,42] :: A
-- A1 {foo = "foo", bar = 42}
-- >>> genericDeserialize [1,1,23,1,1] :: A
-- A2 {bar = 23, baz = True}
And it is the inverse of genericSerialize
:
-- $ >>> (genericDeserialize $ genericSerialize $ A1 "foo" 42) :: A
-- A1 {foo = "foo", bar = 42}
4th Example: Meta Information with types: generating SQL schemas¶
Accessing meta information including the types works very
similarly to deconstructing or constructing values. It uses the same
structure of type classes and instances for the eot-types. The difference is:
since we don’t want actual values of our ADT as input or output we operate on
Proxy
s of our eot-types.
As an example we’re going to implement a function that generates SQL statements that create tables that our ADTs would fit into. To be able to use nice names for the table and columns we’re going to traverse the type-less meta information (see 1st Example) at the same time.
(Note that the generated SQL statements are targeted at a fictional
database implementation that magically understands Haskell types like
Int
and String
, or rather [Char]
.)
Again we start off by writing a class that operates on the eot-types. Besides
the eot-type the class has an additional parameter, meta
, that will be
instantiated by the corresponding types used for untyped meta information.
class EotCreateTableStatement meta eot where
eotCreateTableStatement :: meta -> Proxy eot -> [String]
Our first instance is for the complete datatype. eot
is instantiated to
Either fields Void
. Note that this instance only works for ADTs with
exactly one constructor as we don’t support types with multiple constructors.
meta
is instantiated to Datatype
which is the type for meta information
for ADTs.
instance EotCreateTableStatement [String] fields =>
EotCreateTableStatement Datatype (Either fields Void) where
eotCreateTableStatement datatype Proxy = case datatype of
Datatype name [Constructor _ (Selectors fields)] ->
"CREATE TABLE " :
name :
" COLUMNS " :
"(" :
intercalate ", " (eotCreateTableStatement fields (Proxy :: Proxy fields)) :
");" :
[]
Datatype _ [Constructor name (NoSelectors _)] ->
error ("constructor " ++ name ++ " has no selectors, this is not supported")
Datatype name _ ->
error ("type " ++ name ++ " must have exactly one constructor")
The second instance is responsible for creating the parts of the SQL
statements that declare the columns. As such it has to traverse the fields
of our ADT. eot
is instantiated to the usual (x, xs)
. meta
is
instantiated to [String]
, representing the field names. The name of the
field type is obtained using typeRep
, therefore we need a Typeable x
constraint.
instance (Typeable x, EotCreateTableStatement [String] xs) =>
EotCreateTableStatement [String] (x, xs) where
eotCreateTableStatement (field : fields) Proxy =
(field ++ " " ++ show (typeRep (Proxy :: Proxy x))) :
eotCreateTableStatement fields (Proxy :: Proxy xs)
eotCreateTableStatement [] Proxy = error "impossible"
The last instance is for ()
. It’s needed as the base case for
traversing the fields and as such returns just an empty list.
instance EotCreateTableStatement [String] () where
eotCreateTableStatement [] Proxy = []
eotCreateTableStatement (_ : _) Proxy = error "impossible"
createTableStatement
ties everything together. It obtains the meta
information through datatype
passing a Proxy
for a
. And it creates a
Proxy
for the eot-type Proxy :: Proxy (Eot a)
. Then it calls
eotCreateTableStatement
and just concat
s the resulting snippets.
createTableStatement :: forall a . (HasEot a, EotCreateTableStatement Datatype (Eot a)) =>
Proxy a -> String
createTableStatement proxy =
concat $ eotCreateTableStatement (datatype proxy) (Proxy :: Proxy (Eot a))
As an example, we’re going to use Person
:
data Person
= Person {
name :: String,
age :: Int
}
deriving (Generic)
And here’s the created SQL statement:
-- $ >>> putStrLn $ createTableStatement (Proxy :: Proxy Person)
-- CREATE TABLE Person COLUMNS (name [Char], age Int);
If we try to use an ADT with multiple constructors, we get a type error due to a missing instance:
-- $ >>> putStrLn $ createTableStatement (Proxy :: Proxy A)
-- <BLANKLINE>
-- ...
-- • No instance for (EotCreateTableStatement
-- Datatype
-- (Either ([Char], (Int, ())) (Either (Int, (Bool, ())) Void)))
-- arising from a use of ‘createTableStatement’
-- ...
If we try to use it with an ADT with a single constructor but no selectors, we get a runtime error:
data NoSelectors
= NotSupported Int Bool
deriving (Generic)
-- $ >>> putStrLn $ createTableStatement (Proxy :: Proxy NoSelectors)
-- *** Exception: constructor NotSupported has no selectors, this is not supported
-- ...
DefaultSignatures¶
There is a GHC language extension called DefaultSignatures. In itself it has little to do with generic programming, but it makes a good companion.
How DefaultSignatures work¶
Imagine you have a type class called ToString
which allows to convert
values to String
s:
class ToString a where
toString :: a -> String
You can write instances manually, but you might be tempted to give the
following default implementation for toString
:
toString = show
The idea is that then you can just write down an empty ToString
instance:
instance ToString A
and you get to use toString
on values of type A
for free, because A
has
a Show
instance.
But that default implementation doesn’t work, because in the class declaration
we don’t have an instance for Show a
. ghc
says:
Could not deduce (Show a) arising from a use of ‘show’
from the context (ToString a)
One solution would be to make ToString
a subclass of Show
, but then we
cannot implement ToString
instances manually anymore for types that don’t
have a Show
instance. DefaultSignatures
provide a better solution. The
extension allows you to further narrow down the type for your default
implementation for class methods:
class ToString2 a where
toString2 :: a -> String
default toString2 :: Show a => a -> String
toString2 = show
Then writing down empty instances works for types that have a Show
instance:
instance ToString2 Int
-- $ >>> toString2 (42 :: Int)
-- "42"
Note: if you write down an empty ToString2
instances for a type that
does not have a Show
instance, the error message looks like this:
No instance for (Show NoShow)
This might be confusing especially since haddock docs don’t list the default
signatures or implementations and users of the class might be wondering why
Show
comes into play at all.
How to use DefaultSignatures
for generic programming¶
DefaultSignatures
are especially handy when doing generic programming.
Remember the type class Serialize
from the
second example? In that
example we used it to serialize the fields of our ADTs in the generic
serialization through genericSerialize
and EotSerialize
. We just assumed
that we would have a manual implementation for all field types. But we also
gave it a default implementation for serialize
in terms of
genericSerialize
:
default serialize :: (HasEot a, EotSerialize (Eot a)) => a -> [Int]
serialize = genericSerialize
Note that the default implementation has the same class constraints as
genericSerialize
.
Now we can write empty instances for custom ADTs:
data C
= C1 Int String
deriving (Generic)
instance Serialize C
You could say that by giving this empty instance we give our blessing to
use genericSerialize
for this type, but we don’t have to actually implement
anything. And it works:
-- $ >>> serialize (C1 42 "yay!")
-- [0,1,42,4,121,97,121,33]
Important is that we still have the option to implement instances manually
by overwriting the default implementation. This is needed for basic types
like Int
and Char
that don’t have useful generic representations. But it
also allows us to overwrite instances for ADTs manually. For example you may
want a certain type to be serialized in a special way that deviates from the
generic implementation or you may implement an instance manually for
performance gain.