Beam - A typesafe Haskell database interface

Posted on January 12, 2015 by Travis Athougies

haskell
web

I just uploaded a new package to github. It’s called Beam and it’s a type-safe database interface for Haskell.

Type safety and expressive power are two of the main selling points of Haskell. However, current Haskell database interface libraries (like Persistent and HaskellDB) are anything but. For instance, both make extended use of Template Haskell. Although a useful language feature, Template Haskell suffers from several disadvantages, namely its complexity and lack of type safety. Additionally, both fail to cover several common SQL use cases. For example, Persistent doesn’t even support foreign keys and joins.

It doesn’t have to be this way. Haskell is increasingly being used on the web, and in order to be a serious web language, Haskell needs a good database backend. Several new GHC extensions, such as Generics, Closed Type Families, and Default Signatures allow us to elegantly and succinctly express everything that Yesod and HaskellDB used Template Haskell for. In terms of power, nothing stops these libraries from fully realizing the power of SQL, but it does help to start from clean and simple abstractions.

To summarize, Beam’s primary features and differentiators are:

Declarative syntax - Beam schemas are defined declaratively at the type-level as POHT (plain old Haskell types). Type-level programming and generic deriving is used to automatically derive instances and provide proper type-checking.
Comprehensive - Beam aims to be a comprehensive library. This means that it will at least support all standard SQL features, including all types of joins and nested queries. This is incomplete as of now.
Leaky - Beam does not attempt to shield the user from the idiosyncracies of the underlying RDBMS. Users should use the RDBMS that they are most familiar with and that best meets the performance and feature requirements of the application. This leakiness also means that Beam can expose backed-specific functions to user queries. For example, PostgreSQL backend could easily provide PostgreSQL-specific functions to work with arrays, JSON, and GIS data. Because of Haskell, these functions could be type-checked statically at compile time.
Backend-agnostic - Although Beam does not try to standardize RDBMS behavior, the core does aim to be agnostic of the underlying backend. If you use only standard Beam features in your queries and standard data types in your schemas, then you should be able to swap the backend out anytime (including at run time) and have the same SQL generated (notice I said the same SQL – this does not mean that the results will be the same).
No Template Haskell - Beam does not make use of Template Haskell. Rather, it uses several advanced extensions, such as DeriveGeneric, DeriveDataTypeable, TypeFamilies, MultiParamTypeClasses, etc. Although this means that Beam is not 100% Haskell 2010, these syntax extensions embody the “spirit of Haskell” much better than Template Haskell. Additionally, unlike Template Haskell, these extensions are not so much tied to the structure of GHC.

Defining our first Beam database schema

In Beam, everything is done via plain old Haskell data types. Let’s define a simple todo list database in Beam, and then use this schema to make queries on a SQLite3 database. We begin by defining type-level names for our columns. We will have two tables in our schema: one for todo lists and one for todo items. A todo list will have two columns: a name and a description. A todo item will have three: a name, a description, and a foreign key to the list it belongs to. We will need to use the DeriveGeneric and DeriveDataTypeable extensions in order to allow our data types to play nicely with Beam.

Notice that these definition are given in plain old Haskell. No crufty Template Haskell DSLs here!

{-# LANGUAGE DeriveGeneric, DeriveDataTypeable, StandaloneDeriving, OverloadedStrings, FlexibleInstances #-}
module BeamExample where

import Database.Beam
import Data.Text (Text)
data TodoList column = TodoList
                    { todoListName        :: column Text
                    , todoListDescription :: column Text }
                deriving (Generic, Typeable)
data TodoItem column = TodoItem
                     { todoItemList        :: ForeignKey TodoList column
                     , todoItemName        :: column Text
                     , todoItemDescription :: column Text }
                       deriving (Generic, Typeable)

Column constructors

Notice that each of our types takes in a special column type argument. This is called a column constructor and is used by beam to co-opt our data type to play several different roles. Beam will use these data types in several different contexts, such as to set column options, to construct query clauses, and to store data.

Most of the time, we’ll be using the Column column constructor, which is defined as a simple newtype.

newtype Column = Column a

Because it’s a newtype, using Column has no runtime overhead, but it does mean that we will need to explicitly wrap and unwrap Column values. This can be done with the column and columnValue functions, which have the types.

column :: a -> Column a
columnValue :: Column a -> a

The other common column constructor is Nullable Column. This wraps the stored value with Maybe and lets you make nullable foreign keys. The column and columnValue functions are overloaded to work on this type as well.

column :: Maybe a -> Nullable Column a
columnValue :: Nullable Column a -> Maybe a

A note on deriving

Because of the complicated nature of our types, GHC won’t be able to derive Show instances using the regular deriving mechanism, but this is easily fixed with StandaloneDeriving. We only need to define instances of Show for our types parameterized with the special Column constructor.

deriving instance Show (TodoList Column)
deriving instance Show (TodoItem Column)

Interfacing with Beam

Now that we’ve defined our table data types, we need to define a few instances so that Beam can work its magic. For each of our table types, we’ll need to instantiate the Table type class.

instance Table TodoItem
instance Table TodoList

When GHC 7.10 comes out, we will be able to specify these Table instances as part of the deriving declaration with the DeriveAnyClass extension, but for now they have to be separate.

Because of the Generic instances, Beam can fully derive this class for us. The Table instance for a type controls how it’s mapped to SQL. Among other things, it determines its table name, the column names, and the column types and constraints. The default instance names the table after the Haskell type, and names the columns after the selector names. It also chooses an appropriate SQL type to hold the Haskell datatype. Finally, it adds a primary key column named “id.” You can override these instances if you’d like to rename a column, rename the table, or set the SQL type of a column, but most of the time, the default one is fine.

Voilà! That’s it! These type are ready to be used in Beam.

Querying the database

We’re almost ready to use these types in a real database. First though, we need to create a Database object with the right tables.

todoListDb :: Database
todoListDb = database_
           [ table_ (schema_ :: Simple TodoList)
           , table_ (schema_ :: Simple TodoItem) ]

Simple table is a type synonym for table Column, so Simple TodoList is simply our TodoList datatype with the standard Column constructor.

Now, we can use this object to allow Beam to automatically migrate a database to match this database schema. Save all your work in a Haskell file, fire up GHCi, and let’s begin.

> :load <name-of-file>
> import Database.Beam.Backend.Sqlite3
> beam <- openDatabase todoListDb (Sqlite3Settings "beam.db")

Creating some test data

Before we can query this, let’s add in some test data. We’ll add in two lists with two items each, and one empty todo list.

> :set -XOverloadedStrings
> :{
| let todoLists = [ TodoList (column "List 1") (column "Description for list 1")
|                 , TodoList (column "List 2") (column "Description for list 2")
|                 , TodoList (column "List 3") (column "Description for list 3") ]
| :}
> Success [list1, list2, list3] <- inBeamTxn beam $ mapM insert todoLists
...
> :{
| let todoItems = [ TodoItem (ref list1) (column "Item 1") (column "This is item 1 in list 1")
|                 , TodoItem (ref list1) (column "Item 2") (column "This is item 2 in list 1")
|                 , TodoItem (ref list2) (column "Item 1") (column "This is item 1 in list 2")
|                 , TodoItem (ref list2) (column "Item 2") (column "This is item 2 in list 2") ]
| :}
> inBeamTxn beam $ mapM insert todoItems
...

Our first query

First, let’s try to get all the todo lists.

> inBeamTxn beam $ queryList (all_ (of_ :: Simple TodoList))
[Entity (PK (Column 1) (TodoList (Column "List 1") (Column "Description for list 1")), ...]

We’re using the queryList function to get our results as a list. The normal query function returns a Source from the conduit package, which is usually easier and safer to work with when writing an application, but is not as intuitive when working on the command line.

To understand queries, let’s take a look at the types.

> :type (all_ (of_ :: Simple TodoList))
all_ (of_ :: Simple TodoList) :: Query (Entity TodoList Column)

All queries have types Query a. When run, queries of type Query a return rows of type a. The all_ query returns a Query that returns rows from a given table. The Entity type packages a table (the data types we defined) along with its phantom fields (defined in the type class). Phantom fields are fields that exist in the database but are not mapped to our data type. By default, we use the phantom fields to store the table’s primary key. Therefore, an Entity a Column, by default, stores the table a and its primary key.

Relationships

Now, let’s try to get all the TodoItems associated with list1.

> inBeamTxn beam $ queryList (todoItemList <-@ list1)

The f <-@ query combinator takes in an entity(query) and a selector(f) from another table that is a ForeignKey to that entity, and returns all elements of table where the ForeignKey points to the entity.

Next, let’s write a query to get the TodoItems along with their TodoLists.

> inBeamTxn beam $ queryList (all_ (of_ :: Simple TodoItem) ==> todoItemList)

The q ==> f combinator takes a query (q) and a selector for the table that references a ForeignKey. It returns a query that performs an inner join.

Suppose you wanted all TodoLists regardless of whether they had an associated TodoItem. In this case, we can simply use the right join selection

> inBeamTxn beam $ queryList (all_ (of_ :: Simple TodoList) <=? todoItemList)

For more complicated queries, see the example on github.

Known limitations

Beam is still in a very experimentl stage. You should not use Beam for production systems yet. While most SQL works correctly when dealing with tables, there are still known problems in dealing with projections involving standalone expressions. These will be fixed in time.

Additionally, the Beam API is still in a constant state of flux. In some ways, it’s a bit obtuse, since I basically designed it to match the way I think. However, this is probably unintuitive for many people. As more people use the library and more feedback is generated, the API will be adapted to make it more intuitive and understandable.

Lastly, Beam currently outputs all the SQL that it runs onto stdout, which is not very elegant, but helps me with debugging!

That being said, please leave your comments! I’m always looking for feedback.

Travis
Athougies

Travis
Athougies

Navigation

Tags

Beam - A typesafe Haskell database interface

Defining our first Beam database schema

Column constructors

A note on deriving

Interfacing with Beam

Querying the database

Creating some test data

Our first query

Relationships

Known limitations

TravisAthougies

TravisAthougies

Navigation

Tags

Beam - A typesafe Haskell database interface

Defining our first Beam database schema

Column constructors

A note on deriving

Interfacing with Beam

Querying the database

Creating some test data

Our first query

Relationships

Known limitations

Travis
Athougies

Travis
Athougies