Friday, June 17, 2016

Dependently typed red-black tree in Haskell, part 4

A rather large jump from page 3 to 4. The tl;dr, page 4 implements insert. The slightly longer version: there are several bugs in page 3, and insert requires implementing many cases.

Unfortunately, I did not keep notes through the process, and really, no blog post could capture the frustration and awe of implementing insert. Having the compiler say "No, don't do that; you're violating the black count," is very cool.

First, a few big bugs in page 3. Can you spot them?
data RedZip n a where
        RedZip ::
          a -> Black n a -> Black n a -> BlackCrumb (S n) a -> RedZip n a
        TempZip ::
          a -> Black n a -> Black n a -> RedCrumb n a -> RedZip n a

data BlackZip n a where
        LeafZip :: RBCrumb Z a -> BlackZip Z a
        Black2Zip ::
          a -> Black n a -> Black n a -> RBCrumb (S n) a -> BlackZip (S n) a
        Black3Zip ::
          a -> Red n a -> Black n a -> RBCrumb (S n) a -> BlackZip (S n) a

There are three:
  1. TempZip was missing. TempZip captures the idea of consecutive red nodes, which may occur during insertion. Note that since ultimately, the zip must be unwound into nodes, this has no effect on the integrity of the red-black constraint.
  2. Black4Zip is unnecessary. Turns out I should have read all of the Sedgewick paper [1] before I started, but Black4 can be converted to Red.
  3. LeafZip was wrong. This one is quite subtle. In the constructor, we need (RBCrumb Z a), rather than (RBCrumb n a) -- the difference is we know a something extra about the leaf zipper. In particular, we know exactly the black depth of the parent, and the type system needs that information to handle all the cases.
Additionally, there were several problems with the "crumbs," most important being I was missing the right/left information. For details, see Page4/Types.hs

Its hard to say much about the insert implementation. Overall, it's a hundred lines of very dense ADT destructruing and reconstruction. Through the process, I was able to significantly simplify the code by eliminating several "crumb" and "zipper" constructors. Mostly, it was fun and frustrating having the type system yell at me for trying to violate red-black and black-count constraints.

A closing remark: QuickCheck is amazing. The basic property we are looking for from a red-black tree is that a traversal yields the elements in sorted order. This is trivial to assert with QuickCheck (see Lib.hs). Importantly, this allows us to verify the final constraint, that we have a binary search tree. The QuickCheck one line-er caught several bugs in the afore mentioned dense ADT code.

[1] Sedgewick, R.: Left-leaning red-black trees, http://www.cs.princeton.edu/rs/talks/LLRB/LLRB.pdf

Saturday, November 7, 2015

Dependently typed red-black tree in Haskell, part 3

We left off with the basic GADTs defined to capture a red-black tree, but we had no way to insert or delete values. To implement these operations, I will use zippers via some auxiliary GADTs (I have heard of another pattern, lenses, that, from what I can tell, solve a similar problem to zippers, though I don't yet understand lenses. If lenses, or something else, could solve this problem more elegantly, I'd appreciate any references).

Zippers bread crumbs allow us to move up and down the tree to manipulate the state deep inside the tree.
type RB a n = Either (Red a n) (Black a n)

type RBCrumb a n = Either (RedCrumb a n) (BlackCrumb a (S n))

data RedCrumb a n where
  RootCrumb :: RedCrumb a n
  RedCrumb :: a -> (Black a n) -> (BlackCrumb a (S n)) -> RedCrumb a n

data BlackCrumb a n where
  BlackCrumb :: a -> (RB a n) -> (RBCrumb a (S n)) -> BlackCrumb a (S n)

type RBZip a n = Either (RedZip a n) (BlackZip a n)

data RedZip a n where
  RedZip :: a -> (Black a n) -> (Black a n) -> (BlackCrumb a (S n)) -> RedZip a n

data BlackZip a n where
  LeafZip :: (RBCrumb a n) -> BlackZip a Z
  Black2Zip :: a -> (Black a n) -> (Black a n) -> (RBCrumb a (S n)) -> BlackZip a (S n)
  Black3Zip :: a -> (Red a n) -> (Black a n) -> (RBCrumb a (S n)) -> BlackZip a (S n)
  Black4Zip :: a -> (Red a n) -> (Red a n) -> (RBCrumb a (S n)) -> BlackZip a (S n)
The zipper mirrors the structure of the tree, with the added "crumb" that acts like the parent pointer in an imperative implementation. The crumb is somewhat dual to the tree: there is no leaf crumb since you can never be below a leaf; instead, we have a red "root" crumb, with no associated value, that lies above the root element of the tree.
A couple auxiliary functions for converting nodes into zippers.
toBlackZip :: Black a n -> RBCrumb a n -> BlackZip a n
toBlackZip Leaf c = LeafZip $ c
toBlackZip (Black2 v l r) c = Black2Zip v l r c
toBlackZip (Black3 v l r) c = Black3Zip v l r c
toBlackZip (Black4 v l r) c = Black4Zip v l r c

toRedZip :: Red a n -> BlackCrumb a (S n) -> RedZip a n
toRedZip (Red v l r) c = RedZip v l r c
Next step is functions to find the leaf where a new value would fit in. Nothing to exciting, most interesting part was seeing the type checker catch my bugs.
findLeafB :: (Ord a) => a -> BlackZip a n -> BlackZip a Z
findLeafB u z@(LeafZip c) = z
findLeafB u (Black2Zip v l r c)
  | u < v     = findLeafB u $ toBlackZip l (Right $ BlackCrumb v (Right r) c)
  | otherwise = findLeafB u $ toBlackZip r (Right $ BlackCrumb v (Right l) c)
findLeafB u (Black3Zip v l r c)
  | u < v     = findLeafR u . toRedZip l $ BlackCrumb v (Right r) c
  | otherwise = findLeafB u $ toBlackZip r (Right $ BlackCrumb v (Left l) c)
findLeafB u (Black4Zip v l r c)
  | u < v     = findLeafR u . toRedZip l $ BlackCrumb v (Left r) c
  | otherwise = findLeafR u $ toRedZip r $ BlackCrumb v (Left r) c

findLeafR :: (Ord a) => a -> RedZip a n -> BlackZip a Z
findLeafR u (RedZip v l r c)
  | u < v     = findLeafB u $ toBlackZip l (Left $ RedCrumb v r c)
  | otherwise = findLeafB u $ toBlackZip r (Left $ RedCrumb v l c)
Now, to insert the new value, we replace the leaf zipper with a red zipper containing the new value
inject :: a -> BlackZip a Z -> RedZip a Z
inject v (LeafZip c) = RedZip v Leaf Leaf c
But this won't type-check! The parent of a red node must be black (which is enforced by the type system), while the parent of a black node, like a leaf, can be either red or black. To handle this, I will need to augment the red zipper to allow some illegal, transient states. However, since these extra states are not part of the node GADTs, it is still impossible to construct an illegal tree!

Wednesday, November 4, 2015

Dependently typed red-black tree in Haskell, part 2

Red-black constraint

The basic red-black constraint can easily be captured by a pair of mutually recursive ADTs
type RedBlack a = Either (Red a) (Black a)

data Red a = Red a (Black a) (Black a)

data Black a = Leaf |
               Black a (RedBlack a) (RedBlack a)

However, we also wanted this to be a left leaning red-black tree, meaning all 3-nodes lean left. This can also be build into the ADTs
data Red a = Red a (Black a) (Black a)

data Black a = Leaf |
               Black2 a (Black a) (Black a) |
               Black3 a (Red a) (Black a) |
               Black4 a (Red a) (Red a)

Black depth constraint

Black depth is the fun part, where we parameterize the ADT by a natural number to track the depth. First we'll need to switch to GADTs, so we need to enable the GHC extension. Additionally, we need StandaloneDeriving to get things like Show for GADTs.
{-# LANGUAGE GADTs, StandaloneDeriving #-}

data Red a where
  Red :: a -> (Black a) -> (Black a) -> Red a

data Black a where
  Leaf :: Black a
  Black2 :: a -> (Black a) -> (Black a) -> Black a
  Black3 :: a -> (Red a) -> (Black a) -> Black a
  Black4 :: a -> (Red a) -> (Red a) -> Black a

deriving instance Show a => Show (Black a)
deriving instance Show a => Show (Red a)

Now to parameterize by black depth, n
{-# LANGUAGE DataKinds, GADTs, StandaloneDeriving #-}

module Page2 (Black(Leaf)) where

import Data.Type.Natural

data Red a n where
  Red :: a -> (Black a n) -> (Black a n) -> Red a n

data Black a n where
  Leaf :: Black a Z
  Black2 :: a -> (Black a n) -> (Black a n) -> Black a (S n)
  Black3 :: a -> (Red a n) -> (Black a n) -> Black a (S n)
  Black4 :: a -> (Red a n) -> (Red a n) -> Black a (S n)

deriving instance Show a => Show (Black a n)
deriving instance Show a => Show (Red a n)

A couple interesting points here. Note how we explicitly declare that leaves have a black depth of zero (Z). Red nodes have the same black depth as their children. Black nodes are one level deeper than their children (children, the second and third arguments of the constructors, have a depth of n, while current node has a depth of (S n), ie, n+1).

One more detail, the depth parameter is a pain because it changes on inserts and deletes. Even worse, it only changes on some inserts and deletes (and is, in general, path dependent). Luckily, we can hide the depth behind another GADT.

data Root a where
  Root :: Black a n -> Root a

And that's about it. Given these definitions, it is impossible to construct a tree that violates any of these constraints. That just leaves the binary search tree constraint, ie, proving that all the tree operations maintain sort order. However, this is tricky, especially given GHC's limited support for dependent types.

Last step is to implement insert and delete.

Monday, November 2, 2015

Dependently typed red-black tree in Haskell, part 1

Introduction

Dependent types, and the idea of baking both algorithmic and business constraints into the type system are very exciting to me. However, beyond a few toy examples of Peano arithmetic and vectors, I haven't found a lot of examples online. So in the spirit of scratching my own itch, I am going to working through implementing a left-leaning red-black tree, capturing as many of the constraints as possible in the type system.

A warning at the outset, I am relatively new to Haskell and very new to dependent types, so I welcome any feedback on both my Haskell, if it is not idiomatic,  my use of dependent types and my choice of library dependencies. This series will be based entirely on GHC, and I will make liberal use of GHC extensions.

Find my Github repository at:
https://github.com/farrellm/dependent-rbtree

Dependencies

Note that GHC does have type-level natural numbers built in1. However, in my experimentation, these seem not to be useful for more complex use cases. Rather I will be using the following libraries:

  • singletons - Seems to be the de factor standard for dependent types in GHC, I depend on it transitively
  • type-natural - A powerful library for type level natural numbers implemented in with the singletons library
Also, I will be using stack, rather than cabal, to manage the project. At the moment, stack seems to be a saner tool for managing Hackage dependencies.

Overview

Red-black trees have several defining properties2:
  1. Binary search tree
  2. Alternating red-black properties
  3. Consistent black-height
Left-leaning red-black trees add:
  1. Require that all 3-nodes lean left.
Property 1 is beyond the scope of this blog; at some future time I may attempt to port this to Idris, at which time I will attempt 1.

Property 2 can be easily captured using ADTs.

Property 3 is my primary target with dependent types.

Property 4 I haven't thought about enough yet :-)

Tuesday, April 2, 2013

Clojure on Windows


Mostly this is stolen from http://onbeyondlambda.blogspot.com/2012/01/setting-up-clojure-emacs-on-windows.html with some updates where appropriate.
  1. Install Java 7 JDK
  2. Use environment variables:
    JAVA_HOME = C:\Program Files\Java\jdk1.7.0_03Path = %Path%;%Java_Home%\bin;c:\lein
    JAVA_CMD = 
    %java_home%/bin/java.exe
  3.  Make directory c:\lein
  4. Extract to c:\lein
    http://www.paehl.com/open_source/?download=curl_729_0_ssl.ziphttps://raw.github.com/technomancy/leiningen/stable/bin/lein.bat
  5. Open a command prompt (win-r, "cmd")
  6. Run "lein self-install". Test install was successful by running "lein repl"
  7. Download Emacs. Extract it anywhere you like, and run "bin/runemacs"
    http://mirror.clarkson.edu/gnu/gnu/emacs/windows/emacs-24.3-bin-i386.zip
  8. Open ~/.emacs (<Ctrl>-X <Ctrl>-F ~/.emacs <Enter>) and add the following to enable an extra package repository, then restart:
    ;;; ELPA
    (require 'package)
    (add-to-list 'package-archives
        '("marmalade" . "http://marmalade-repo.org/packages/"))
  9. Install package "nrepl" (<Alt>-x package-install <Enter> nrepl <Enter>). This will also install "clojure-mode" as it is a dependency of nrepl.

Now, on to scad-clj!

  1. Download and install Openscad
    https://openscad.googlecode.com/files/OpenSCAD-2013.01-Installer.exe
  2. Download the project from github and extract anywhere you like
    https://github.com/farrellm/scad-clj/archive/master.zip
  3. Open a sample design, eg, "scad-clj-master/src/scad_clj/designs/cage.clj"
  4. Start a REPL (<Alt>-x nrepl-jack-in)

Friday, January 6, 2012

Tail recursion in Python

Admittedly, not the most "pythonic" code, whatever that means...

#!/usr/bin/python

from inspect import isfunction

recur = 0

def loop(func):
  def inner(a,b):
    global recur
    old_recur = recur
    recur = lambda a,b: lambda:func(a,b)
    ret = func(a,b)
    while isfunction(ret):
      ret = ret()
    recur = old_recur
    return ret
  return inner

Which creates a nice decorator:

@loop
def foo(n, a):
  if n == 0:
    return a
  else:
    return recur(n-1,a*n)

print(foo(6,1))

Thursday, January 5, 2012

Tail recursion in Perl

I finally get to use local! This really should be wrapped up in some type of module, but it is getting late.

#!/usr/bin/perl -w

$recur = 0;

sub loop($) {
    my ($func) = @_;

    sub {
        local $recur = sub {
            my @args = @_;
            sub { $func->(@args); }
        };

        my $ret = $func->(@_);
        while ('CODE' eq ref($ret)) {
           $ret = $ret->(); 
        }
        $ret;
    };
}


Example usage:
my $foo = loop sub {
    my ($n, $a) = @_;
    if ($n == 0) {
        $a;
    }
    else {
        $recur->($n-1, $a*$n);
    }
};

print $foo->(999999,1) . "\n";