Saturday, November 7, 2015

Dependently typed red-black tree in Haskell, part 3

We left off with the basic GADTs defined to capture a red-black tree, but we had no way to insert or delete values. To implement these operations, I will use zippers via some auxiliary GADTs (I have heard of another pattern, lenses, that, from what I can tell, solve a similar problem to zippers, though I don't yet understand lenses. If lenses, or something else, could solve this problem more elegantly, I'd appreciate any references).

Zippers bread crumbs allow us to move up and down the tree to manipulate the state deep inside the tree.
type RB a n = Either (Red a n) (Black a n)

type RBCrumb a n = Either (RedCrumb a n) (BlackCrumb a (S n))

data RedCrumb a n where
  RootCrumb :: RedCrumb a n
  RedCrumb :: a -> (Black a n) -> (BlackCrumb a (S n)) -> RedCrumb a n

data BlackCrumb a n where
  BlackCrumb :: a -> (RB a n) -> (RBCrumb a (S n)) -> BlackCrumb a (S n)

type RBZip a n = Either (RedZip a n) (BlackZip a n)

data RedZip a n where
  RedZip :: a -> (Black a n) -> (Black a n) -> (BlackCrumb a (S n)) -> RedZip a n

data BlackZip a n where
  LeafZip :: (RBCrumb a n) -> BlackZip a Z
  Black2Zip :: a -> (Black a n) -> (Black a n) -> (RBCrumb a (S n)) -> BlackZip a (S n)
  Black3Zip :: a -> (Red a n) -> (Black a n) -> (RBCrumb a (S n)) -> BlackZip a (S n)
  Black4Zip :: a -> (Red a n) -> (Red a n) -> (RBCrumb a (S n)) -> BlackZip a (S n)
The zipper mirrors the structure of the tree, with the added "crumb" that acts like the parent pointer in an imperative implementation. The crumb is somewhat dual to the tree: there is no leaf crumb since you can never be below a leaf; instead, we have a red "root" crumb, with no associated value, that lies above the root element of the tree.
A couple auxiliary functions for converting nodes into zippers.
toBlackZip :: Black a n -> RBCrumb a n -> BlackZip a n
toBlackZip Leaf c = LeafZip $ c
toBlackZip (Black2 v l r) c = Black2Zip v l r c
toBlackZip (Black3 v l r) c = Black3Zip v l r c
toBlackZip (Black4 v l r) c = Black4Zip v l r c

toRedZip :: Red a n -> BlackCrumb a (S n) -> RedZip a n
toRedZip (Red v l r) c = RedZip v l r c
Next step is functions to find the leaf where a new value would fit in. Nothing to exciting, most interesting part was seeing the type checker catch my bugs.
findLeafB :: (Ord a) => a -> BlackZip a n -> BlackZip a Z
findLeafB u z@(LeafZip c) = z
findLeafB u (Black2Zip v l r c)
  | u < v     = findLeafB u $ toBlackZip l (Right $ BlackCrumb v (Right r) c)
  | otherwise = findLeafB u $ toBlackZip r (Right $ BlackCrumb v (Right l) c)
findLeafB u (Black3Zip v l r c)
  | u < v     = findLeafR u . toRedZip l $ BlackCrumb v (Right r) c
  | otherwise = findLeafB u $ toBlackZip r (Right $ BlackCrumb v (Left l) c)
findLeafB u (Black4Zip v l r c)
  | u < v     = findLeafR u . toRedZip l $ BlackCrumb v (Left r) c
  | otherwise = findLeafR u $ toRedZip r $ BlackCrumb v (Left r) c

findLeafR :: (Ord a) => a -> RedZip a n -> BlackZip a Z
findLeafR u (RedZip v l r c)
  | u < v     = findLeafB u $ toBlackZip l (Left $ RedCrumb v r c)
  | otherwise = findLeafB u $ toBlackZip r (Left $ RedCrumb v l c)
Now, to insert the new value, we replace the leaf zipper with a red zipper containing the new value
inject :: a -> BlackZip a Z -> RedZip a Z
inject v (LeafZip c) = RedZip v Leaf Leaf c
But this won't type-check! The parent of a red node must be black (which is enforced by the type system), while the parent of a black node, like a leaf, can be either red or black. To handle this, I will need to augment the red zipper to allow some illegal, transient states. However, since these extra states are not part of the node GADTs, it is still impossible to construct an illegal tree!

Wednesday, November 4, 2015

Dependently typed red-black tree in Haskell, part 2

Red-black constraint

The basic red-black constraint can easily be captured by a pair of mutually recursive ADTs
type RedBlack a = Either (Red a) (Black a)

data Red a = Red a (Black a) (Black a)

data Black a = Leaf |
               Black a (RedBlack a) (RedBlack a)

However, we also wanted this to be a left leaning red-black tree, meaning all 3-nodes lean left. This can also be build into the ADTs
data Red a = Red a (Black a) (Black a)

data Black a = Leaf |
               Black2 a (Black a) (Black a) |
               Black3 a (Red a) (Black a) |
               Black4 a (Red a) (Red a)

Black depth constraint

Black depth is the fun part, where we parameterize the ADT by a natural number to track the depth. First we'll need to switch to GADTs, so we need to enable the GHC extension. Additionally, we need StandaloneDeriving to get things like Show for GADTs.
{-# LANGUAGE GADTs, StandaloneDeriving #-}

data Red a where
  Red :: a -> (Black a) -> (Black a) -> Red a

data Black a where
  Leaf :: Black a
  Black2 :: a -> (Black a) -> (Black a) -> Black a
  Black3 :: a -> (Red a) -> (Black a) -> Black a
  Black4 :: a -> (Red a) -> (Red a) -> Black a

deriving instance Show a => Show (Black a)
deriving instance Show a => Show (Red a)

Now to parameterize by black depth, n
{-# LANGUAGE DataKinds, GADTs, StandaloneDeriving #-}

module Page2 (Black(Leaf)) where

import Data.Type.Natural

data Red a n where
  Red :: a -> (Black a n) -> (Black a n) -> Red a n

data Black a n where
  Leaf :: Black a Z
  Black2 :: a -> (Black a n) -> (Black a n) -> Black a (S n)
  Black3 :: a -> (Red a n) -> (Black a n) -> Black a (S n)
  Black4 :: a -> (Red a n) -> (Red a n) -> Black a (S n)

deriving instance Show a => Show (Black a n)
deriving instance Show a => Show (Red a n)

A couple interesting points here. Note how we explicitly declare that leaves have a black depth of zero (Z). Red nodes have the same black depth as their children. Black nodes are one level deeper than their children (children, the second and third arguments of the constructors, have a depth of n, while current node has a depth of (S n), ie, n+1).

One more detail, the depth parameter is a pain because it changes on inserts and deletes. Even worse, it only changes on some inserts and deletes (and is, in general, path dependent). Luckily, we can hide the depth behind another GADT.

data Root a where
  Root :: Black a n -> Root a

And that's about it. Given these definitions, it is impossible to construct a tree that violates any of these constraints. That just leaves the binary search tree constraint, ie, proving that all the tree operations maintain sort order. However, this is tricky, especially given GHC's limited support for dependent types.

Last step is to implement insert and delete.

Monday, November 2, 2015

Dependently typed red-black tree in Haskell, part 1

Introduction

Dependent types, and the idea of baking both algorithmic and business constraints into the type system are very exciting to me. However, beyond a few toy examples of Peano arithmetic and vectors, I haven't found a lot of examples online. So in the spirit of scratching my own itch, I am going to working through implementing a left-leaning red-black tree, capturing as many of the constraints as possible in the type system.

A warning at the outset, I am relatively new to Haskell and very new to dependent types, so I welcome any feedback on both my Haskell, if it is not idiomatic,  my use of dependent types and my choice of library dependencies. This series will be based entirely on GHC, and I will make liberal use of GHC extensions.

Find my Github repository at:
https://github.com/farrellm/dependent-rbtree

Dependencies

Note that GHC does have type-level natural numbers built in1. However, in my experimentation, these seem not to be useful for more complex use cases. Rather I will be using the following libraries:

  • singletons - Seems to be the de factor standard for dependent types in GHC, I depend on it transitively
  • type-natural - A powerful library for type level natural numbers implemented in with the singletons library
Also, I will be using stack, rather than cabal, to manage the project. At the moment, stack seems to be a saner tool for managing Hackage dependencies.

Overview

Red-black trees have several defining properties2:
  1. Binary search tree
  2. Alternating red-black properties
  3. Consistent black-height
Left-leaning red-black trees add:
  1. Require that all 3-nodes lean left.
Property 1 is beyond the scope of this blog; at some future time I may attempt to port this to Idris, at which time I will attempt 1.

Property 2 can be easily captured using ADTs.

Property 3 is my primary target with dependent types.

Property 4 I haven't thought about enough yet :-)