Folding the unfoldable

Posted on 2022-01-25 by Oleg Grenrus

You may be aware of Foldable type-class. It’s quite useful one. For example, instead of writing your own sum1 as

sum' :: Num a => [a] -> a
sum' = Data.List.foldl' (+) 0

you may generalize it to an arbitrary Foldable2:

sum' :: (Foldable f, Num a) => f a -> a
sum' = Data.Foldable.foldl' (+) 0

And the everything would be great...

... except if your data comes in unboxed vector. You may try to use that generic sum algorithm:

values :: U.Vector Double
values = U.fromList [1,2,3,4,5,6,7,7,7]

result :: Double
result = sum' values

and then GHC says, without further explanation:

No instance for (Foldable U.Vector) arising from a use of ‘sum'’

"Why not?!" you wonder.

Unboxed vectors are backed by bytearrays, so you need an Unbox instance to be able to read (or write) any values from there. (That’s different from e.g. Set, which is Foldable, as you can walk the structure of Set without having Ord instance for the elements).

Bummer.

One idea is to

data Bundle a where
    Bundle :: U.Unbox a => U.Vector a -> Bundle a

When the Unbox instance is next to the data, we will be able to write Foldable instance: pattern match on the Bundle, use the "local" instance to fold. However, people have told me, that sometimes it doesn’t work that well: GHC may not specialize things, even the dictionary is (almost) right there. Though in my small experiments it did:

sumU :: (Num a, U.Unbox a) => U.Vector a -> a
sumU xs = sum' (Bundle xs)

produced nice loops.

Yet, having to bundle instance feels somehow wrong. Distant data type contexts vibes, brr..

There is another way to make Foldable work, with a

data Hack a b where
    Hack :: U.Vector a -> Hack a a

This is a two type-parameter wrapper, but the types are always the same! (I wish that could be a newtype). The Foldable instance is simply:

instance U.Unbox a => Foldable (Hack a) where
    foldr f z (Hack v)  = U.foldr f z v
    foldl' f z (Hack v) = U.foldl' f z v

    ...

and specialized sum' for unboxed vector looks the same as with Bundle:

sumU :: (Num a, U.Unbox a) => U.Vector a -> a
sumU xs = sum' (Hack xs)

but now Unbox instance comes from the "outside": it’s required by Foldable (Hack a) instance, not to wrap vector in Hack. When GHC sees just Foldable (Hack X) ... it could already start simplifying stuff, if it knows something about X (i.e. its Unbox instance), without waiting to see what the members of the instance are applied to!

We could write also write

{-# SPECIALIZE instance Foldable (UV Double) #-}

to force GHC do some work in advance. We couldn’t with Bundle approach.

Is this Hack terrible or terrific? I’m not sure, yet.

Anyhow, that’s all I have this time. This (just a little) tongue-in-cheek post is "inspired" by the fact that statistics package wants unboxed vectors everywhere, for "performance" reasons, and that is soooo inconvenient.

Please, use Foldable for inputs you will fold over anyway. (Asking for a selector function, like foldMap would avoid creating intermediate structures!). People can choose to Bundle or Hack their way around to provide unboxed (or storable) vectors or primarrays to your algorithm, and others don’t need to suffer when they play with your lib in the GHCi.

P.S. I leave this here:

data HackText a where
    HackText :: Text -> HackText Char

P.P.S. I know there is MonoFoldable, and lens with its Folds and a lot of other stuff. But Foldable is right there, in our Prelude!

-- e.g. with optics' Each:
data O s a b where
    O :: s -> O s a a

instance Each i s s a a => Foldable (O s a) where
    foldMap f  (O x) = foldMapOf each f   x
    foldr  f z (O x) = foldrOf   each f z x
    foldl' f z (O x) = foldlOf'  each f z x

    {-# SPECIALIZE instance Foldable (O (U.Vector Double) Double) #-}

-- works too
sumO :: (Num a, U.Unbox a) => U.Vector a -> a
sumO xs = sum' (O xs)

  1. it’s already that way in base, check yourself https://hackage.haskell.org/package/base-4.16.0.0/docs/src/GHC.List.html#sum↩︎

  2. Though you probably should write it using strict foldMap' as in base https://hackage.haskell.org/package/base-4.16.0.0/docs/src/Data.Foldable.html#sum to let container decide how to do it best↩︎


Site proudly generated by Hakyll